Replace -profile docker with -profile podman or -profile singularity depending on the desired backend.
Argument groups
Inputs
| Name | Description | Attributes |
| --- | --- | --- |
| --input | Input h5mu file. | file, required |
| --modality | | string, default: "rna" |
| --input_layer | Input layer to use. If None, X is used. | string |
| --obs_batch | Column name discriminating between your batches. | string, default: "sample_id" |
| --var_input | .var column containing highly variable genes. By default, do not subset genes. | string |
| --obs_labels | Key in adata.obs for label information. Categories will automatically be converted into integer categories and saved to adata.obs['_scvi_labels']. If None, assigns the same label to all the data. | string |
| --obs_size_factor | Key in adata.obs for size factor information. Instead of using library size as a size factor, the provided size factor column will be used as offset in the mean of the likelihood. Assumed to be on linear scale. | string |
| --obs_categorical_covariate | Keys in adata.obs that correspond to categorical data. These covariates can be added in addition to the batch covariate and are also treated as nuisance factors (i.e., the model tries to minimize their effects on the latent space). Thus, these should not be used for biologically relevant factors that you do not want to correct for. | List of string, multiple_sep: ";" |
| --obs_continuous_covariate | Keys in adata.obs that correspond to continuous data. These covariates can be added in addition to the batch covariate and are also treated as nuisance factors (i.e., the model tries to minimize their effects on the latent space). Thus, these should not be used for biologically relevant factors that you do not want to correct for. | List of string, multiple_sep: ";" |
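List-valued arguments such as --obs_categorical_covariate take several values joined by the ";" separator. A minimal Python sketch of assembling such an argument list follows; the helper name format_args, the file name, and the column names are hypothetical and not part of the component.

```python
def format_args(params):
    """Turn a dict of parameters into a flat list of command-line arguments.

    List or tuple values are joined with ";", matching the multiple_sep
    documented for the list-valued arguments.
    """
    args = []
    for name, value in params.items():
        if isinstance(value, (list, tuple)):
            value = ";".join(str(v) for v in value)
        args.extend([f"--{name}", str(value)])
    return args


# Hypothetical file and column names, for illustration only.
example = format_args({
    "input": "input.h5mu",
    "obs_batch": "sample_id",
    "obs_categorical_covariate": ["donor", "chemistry"],
})
print(" ".join(example))
```

The resulting list can be appended to whatever command launches the component; quoting the ";"-joined value is advisable in a shell, since ";" otherwise ends the command.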
Outputs
| Name | Description | Attributes |
| --- | --- | --- |
| --output | Output h5mu file. | file, required |
| --output_model | Folder where the state of the trained model will be saved. | file |
| --output_compression | The compression format to be used on the output h5mu object. | string, example: "gzip" |
| --obsm_output | The .obsm slot in which to store the resulting integrated embedding. | string, default: "X_scvi_integrated" |
SCVI options
| Name | Description | Attributes |
| --- | --- | --- |
| --n_hidden_nodes | Number of nodes per hidden layer. | integer, default: 128 |
| --n_dimensions_latent_space | Dimensionality of the latent space. | integer, default: 30 |
| --n_hidden_layers | Number of hidden layers used for the encoder and decoder neural networks. | integer, default: 2 |
| --dropout_rate | Dropout rate for the neural networks. | double, default: 0.1 |
| --dispersion | Behavior of the dispersion parameter of the negative binomial distribution. gene: dispersion is constant per gene across cells; gene-batch: dispersion can differ between batches; gene-label: dispersion can differ between labels; gene-cell: dispersion can differ for every gene in every cell. | string, default: "gene" |
| --gene_likelihood | Model used to generate the expression data from a count-based likelihood distribution. nb: negative binomial distribution; zinb: zero-inflated negative binomial distribution; poisson: Poisson distribution. | string, default: "nb" |
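The SCVI options above resemble the constructor arguments of scvi.model.SCVI in scvi-tools. The sketch below shows one plausible correspondence. The mapping itself is an assumption (this page does not state how the component forwards its options), and the helper name scvi_kwargs is hypothetical. The defaults are the ones documented above.

```python
# Assumed correspondence between this component's options and keyword
# arguments of scvi.model.SCVI; the component's actual wiring may differ.
OPTION_TO_SCVI_KWARG = {
    "n_hidden_nodes": "n_hidden",
    "n_dimensions_latent_space": "n_latent",
    "n_hidden_layers": "n_layers",
    "dropout_rate": "dropout_rate",
    "dispersion": "dispersion",
    "gene_likelihood": "gene_likelihood",
}

# Defaults as documented for this component.
DEFAULTS = {
    "n_hidden_nodes": 128,
    "n_dimensions_latent_space": 30,
    "n_hidden_layers": 2,
    "dropout_rate": 0.1,
    "dispersion": "gene",
    "gene_likelihood": "nb",
}

# Allowed values as listed in the option descriptions.
DISPERSION_CHOICES = {"gene", "gene-batch", "gene-label", "gene-cell"}
LIKELIHOOD_CHOICES = {"nb", "zinb", "poisson"}


def scvi_kwargs(overrides=None):
    """Merge overrides into the defaults and rename keys (hypothetical helper)."""
    options = {**DEFAULTS, **(overrides or {})}
    unknown = set(options) - set(OPTION_TO_SCVI_KWARG)
    if unknown:
        raise ValueError(f"Unknown options: {sorted(unknown)}")
    if options["dispersion"] not in DISPERSION_CHOICES:
        raise ValueError(f"Invalid dispersion: {options['dispersion']!r}")
    if options["gene_likelihood"] not in LIKELIHOOD_CHOICES:
        raise ValueError(f"Invalid gene_likelihood: {options['gene_likelihood']!r}")
    return {OPTION_TO_SCVI_KWARG[key]: value for key, value in options.items()}


print(scvi_kwargs({"gene_likelihood": "zinb"}))
```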
Variational auto-encoder model options
| Name | Description | Attributes |
| --- | --- | --- |
| --use_layer_normalization | Neural networks for which to enable layer normalization. | string, default: "both" |
| --use_batch_normalization | Neural networks for which to enable batch normalization. | string, default: "none" |
| --encode_covariates | Whether to concatenate covariates to expression in encoder | boolean_false |
| --deeply_inject_covariates | Whether to concatenate covariates into output of hidden layers in encoder/decoder. This option only applies when n_layers > 1. The covariates are concatenated to the input of subsequent hidden layers. | boolean_true |
| --use_observed_lib_size | Use observed library size for RNA as scaling factor in mean of conditional distribution. | boolean_true |
Early stopping arguments
| Name | Description | Attributes |
| --- | --- | --- |
| --early_stopping | Whether to perform early stopping with respect to the validation set. | boolean |
| --early_stopping_monitor | Metric logged during each validation epoch. | string, default: "elbo_validation" |
| --early_stopping_patience | Number of validation epochs with no improvement after which training will be stopped. | integer, default: 45 |
| --early_stopping_min_delta | Minimum change in the monitored quantity to qualify as an improvement; an absolute change of less than min_delta counts as no improvement. | double, default: 0 |
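The interaction between patience and min_delta can be sketched in plain Python. This is an illustration of the general early-stopping rule, not the component's implementation; it assumes that lower values of the monitored metric are better, and the function name stopping_epoch is hypothetical.

```python
def stopping_epoch(values, patience=45, min_delta=0.0):
    """Return the 1-based validation epoch at which training would stop.

    `values` holds the monitored metric per validation epoch. A value counts
    as an improvement only if it beats the best value so far by more than
    min_delta. Returns None if the patience is never exhausted.
    Assumes lower is better.
    """
    best = float("inf")
    epochs_without_improvement = 0
    for epoch, value in enumerate(values, start=1):
        if best - value > min_delta:
            best = value
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch
    return None


# Made-up metric values, for illustration only.
history = [10.0, 9.0, 8.9, 8.95, 8.93, 8.92]
print(stopping_epoch(history, patience=3, min_delta=0.0))
print(stopping_epoch(history, patience=2, min_delta=0.5))
```

A larger min_delta makes small gains count as no improvement, so the same history stops earlier.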
Learning parameters
| Name | Description | Attributes |
| --- | --- | --- |
| --max_epochs | Number of passes through the dataset. Defaults to (20000 / number of cells) * 400 or 400, whichever is smaller. | integer |
| --reduce_lr_on_plateau | Whether to monitor validation loss and reduce learning rate when validation set lr_scheduler_metric plateaus. | boolean, default: TRUE |
| --lr_factor | Factor by which to reduce the learning rate. | double, default: 0.6 |
| --lr_patience | Number of epochs with no improvement after which learning rate will be reduced. | double, default: 30 |
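The default for --max_epochs and the effect of --lr_factor can be worked through numerically. The sketch below follows the formula quoted above; rounding to a whole number of epochs is an assumption, and both function names are hypothetical.

```python
def default_max_epochs(n_cells, cap=400):
    """(20000 / number of cells) * cap, limited to at most cap epochs.

    Rounding to an integer is an assumption made for this sketch.
    """
    return min(round((20000 / n_cells) * cap), cap)


def reduced_learning_rate(learning_rate, lr_factor=0.6):
    """Learning rate after one reduction: the current rate times lr_factor."""
    return learning_rate * lr_factor


for n_cells in (5_000, 40_000, 100_000):
    print(n_cells, default_max_epochs(n_cells))
```

Datasets with up to 20000 cells hit the cap of 400 epochs; larger datasets get proportionally fewer.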
Data validation
Name
Description
Attributes
--n_obs_min_count
Minimum number of cells threshold ensuring that every obs_batch category has sufficient observations (cells) for model training.
integer, default: 0
--n_var_min_count
Minimum number of genes threshold ensuring that every var_input filter has sufficient observations (genes) for model training.