Totalvi

Maps query data onto a reference using a TOTALVI model. See the scvi-tools tutorial: https://docs.scvi-tools.org/en/stable/tutorials/notebooks/scarches_scvi_tools.html#Reference-mapping-with-TOTALVI

Info

ID: totalvi
Namespace: integrate

Example commands

You can run the pipeline using nextflow run.

View help

You can use --help as a parameter to get an overview of the possible parameters.

nextflow run openpipelines-bio/openpipeline \
  -r 1.0.1 -latest \
  -main-script target/nextflow/integrate/totalvi/main.nf \
  --help

Run command

Example of params.yaml
# Inputs
input: # please fill in - example: "path/to/file"
reference: # please fill in - example: "path/to/file"
force_retrain: false
query_modality: "rna"
# query_proteins_modality: "foo"
reference_modality: "rna"
reference_proteins_modality: "prot"
# input_layer: "foo"
obs_batch: "sample_id"
# var_input: "foo"

# Outputs
# output: "$id.$key.output.output"
obsm_output: "X_integrated_totalvi"
obsm_normalized_rna_output: "X_totalvi_normalized_rna"
obsm_normalized_protein_output: "X_totalvi_normalized_protein"
# reference_model_path: "$id.$key.reference_model_path.reference_model_path"
# query_model_path: "$id.$key.query_model_path.query_model_path"

# Learning parameters
max_epochs: 400
max_query_epochs: 200
weight_decay: 0.0

# Nextflow input-output arguments
publish_dir: # please fill in - example: "output/"
# param_list: "my_params.yaml"

nextflow run openpipelines-bio/openpipeline \
  -r 1.0.1 -latest \
  -profile docker \
  -main-script target/nextflow/integrate/totalvi/main.nf \
  -params-file params.yaml

Note

Replace -profile docker with -profile podman or -profile singularity depending on the desired backend.
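
The parameter file and run command above can also be generated from a script, which may help when launching many runs. The following is a minimal sketch using only the Python standard library. The input paths are hypothetical placeholders, and the helper names `to_yaml` and `build_command` are illustrative and not part of the pipeline.

```python
import json
import shlex

# Hypothetical input paths; replace them with your own files.
params = {
    "input": "data/query.h5mu",
    "reference": "data/reference.h5mu",
    "query_modality": "rna",
    "reference_modality": "rna",
    "reference_proteins_modality": "prot",
    "obs_batch": "sample_id",
    "max_epochs": 400,
    "max_query_epochs": 200,
    "publish_dir": "output/",
}


def to_yaml(mapping):
    """Serialize a flat dict of scalars as YAML (JSON scalars are valid YAML)."""
    return "".join(f"{key}: {json.dumps(value)}\n" for key, value in mapping.items())


def build_command(params_file, profile="docker", revision="1.0.1"):
    """Assemble the nextflow invocation from this page as an argument list."""
    return [
        "nextflow", "run", "openpipelines-bio/openpipeline",
        "-r", revision, "-latest",
        "-profile", profile,
        "-main-script", "target/nextflow/integrate/totalvi/main.nf",
        "-params-file", params_file,
    ]


yaml_text = to_yaml(params)
command = build_command("params.yaml", profile="singularity")

# Save yaml_text as params.yaml, then run the printed command in a shell.
print(yaml_text)
print(shlex.join(command))
```

Building the command as a list keeps each argument separate, so it can be passed to `subprocess.run` without shell quoting issues.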

Argument groups

Inputs

| Name | Description | Attributes |
|---|---|---|
| --input | Input h5mu file with query data to integrate with the reference. | file, required |
| --reference | Input h5mu file with reference data used to train the TOTALVI model. | file, required |
| --force_retrain | If true, retrain the model and save it to reference_model_path. | boolean_true |
| --query_modality | | string, default: "rna" |
| --query_proteins_modality | Name of the modality in the input (query) h5mu file containing protein data. | string |
| --reference_modality | | string, default: "rna" |
| --reference_proteins_modality | Name of the modality containing proteins in the reference. | string, default: "prot" |
| --input_layer | Input layer to use. If None, X is used. | string |
| --obs_batch | Column name discriminating between your batches. | string, default: "sample_id" |
| --var_input | .var column containing highly variable genes. By default, genes are not subset. | string |

Outputs

| Name | Description | Attributes |
|---|---|---|
| --output | Output h5mu file. | file, required |
| --obsm_output | The .obsm slot in which to store the resulting integrated embedding. | string, default: "X_integrated_totalvi" |
| --obsm_normalized_rna_output | The .obsm slot in which to store the normalized RNA from TOTALVI. | string, default: "X_totalvi_normalized_rna" |
| --obsm_normalized_protein_output | The .obsm slot in which to store the normalized protein data from TOTALVI. | string, default: "X_totalvi_normalized_protein" |
| --reference_model_path | Directory with the reference model. If it does not exist, the trained model will be saved there. | file, default: "totalvi_model_reference" |
| --query_model_path | Directory where the query model will be saved. | file, default: "totalvi_model_query" |
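
After a run, it can be useful to confirm that the three .obsm slots listed above were written. The sketch below checks a collection of slot names against the default output names. The helper `missing_slots` is hypothetical, and the commented usage assumes that the mudata package is installed, that the slots are stored in the query modality ("rna"), and that the output file path is a placeholder.

```python
# Default .obsm slot names, taken from the Outputs arguments on this page.
EXPECTED_OBSM_SLOTS = [
    "X_integrated_totalvi",
    "X_totalvi_normalized_rna",
    "X_totalvi_normalized_protein",
]


def missing_slots(obsm_keys, expected=EXPECTED_OBSM_SLOTS):
    """Return the expected slot names that are absent from obsm_keys."""
    present = set(obsm_keys)
    return [name for name in expected if name not in present]


# Usage with mudata (assumptions: mudata is installed, the path is a
# placeholder, and the slots live in the "rna" modality):
#
#   import mudata as mu
#   mdata = mu.read_h5mu("output/query_integrated.h5mu")
#   print(missing_slots(mdata.mod["rna"].obsm.keys()))

print(missing_slots(["X_integrated_totalvi"]))
```

If custom slot names were passed through --obsm_output and the related arguments, pass the same names as the `expected` list.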

Learning parameters

| Name | Description | Attributes |
|---|---|---|
| --max_epochs | Number of passes through the dataset. | integer, default: 400 |
| --max_query_epochs | Number of passes through the dataset when fine-tuning the model on the query data. | integer, default: 200 |
| --weight_decay | Weight decay used when fine-tuning the model on the query data. | double, default: 0 |

Authors

  • Vladimir Shitov