Replace -profile docker with -profile podman or -profile singularity depending on the desired backend.
Argument groups
Inputs
Name
Description
Attributes
--id
ID of the sample.
string, required, example: "foo"
--input
Path to the sample.
file, required, example: "input.h5mu"
Outputs
Name
Description
Attributes
--output
Destination path to the output.
file, required, example: "output.h5mu"
Sample ID options
Options for adding the id to .obs on the MuData object. Several components of this pipeline require a sample ID to be present.
Name
Description
Attributes
--add_id_to_obs
Add the value passed with --id to .obs.
boolean, default: TRUE
--add_id_obs_output
.obs column to add the sample IDs to. Required and only used when --add_id_to_obs is set to 'true'.
string, default: "sample_id"
--add_id_make_observation_keys_unique
Join the id to the .obs index (.obs_names). Only used when --add_id_to_obs is set to 'true'.
boolean, default: TRUE
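The effect of the sample ID options can be illustrated with a minimal sketch. The helper `add_sample_id`, the underscore separator, and the choice to prefix (rather than suffix) the ID are assumptions made for illustration; the text above only states that the ID is joined to the .obs index.

```python
def add_sample_id(obs_names, sample_id, make_unique=True):
    """Hypothetical helper mimicking --add_id_to_obs on a list of cell names.

    Returns the (possibly rewritten) observation names and the values that
    would be written to the .obs column named by --add_id_obs_output.
    """
    # Every cell of the sample receives the same ID in the new column.
    obs_column = [sample_id] * len(obs_names)
    if make_unique:
        # Assumed join format: "<id>_<original name>".
        obs_names = [f"{sample_id}_{name}" for name in obs_names]
    return obs_names, obs_column


names, sample_ids = add_sample_id(["AAAC-1", "AAAG-1"], "foo")
print(names)       # ['foo_AAAC-1', 'foo_AAAG-1']
print(sample_ids)  # ['foo', 'foo']
```

Joining the ID to the index keeps observation names distinct when several samples that share barcodes are later combined.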
RNA filtering options
Name
Description
Attributes
--rna_min_counts
Minimum number of counts captured per cell.
integer, example: 200
--rna_max_counts
Maximum number of counts captured per cell.
integer, example: 5000000
--rna_min_genes_per_cell
Minimum number of non-zero values per cell.
integer, example: 200
--rna_max_genes_per_cell
Maximum number of non-zero values per cell.
integer, example: 1500000
--rna_min_cells_per_gene
Minimum number of non-zero values per gene.
integer, example: 3
--rna_min_fraction_mito
Minimum fraction of UMIs that are mitochondrial.
double, example: 0
--rna_max_fraction_mito
Maximum fraction of UMIs that are mitochondrial.
double, example: 0.2
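How the per-cell thresholds combine can be sketched as follows. This is an illustration only: the function `passes_rna_filters` is hypothetical, the bounds are assumed to be inclusive, and an unset threshold is assumed to mean that no filtering is applied on that side. None of these details are stated in the table above.

```python
def passes_rna_filters(n_counts, n_genes, fraction_mito, *,
                       min_counts=None, max_counts=None,
                       min_genes=None, max_genes=None,
                       min_fraction_mito=None, max_fraction_mito=None):
    """Return True if a cell satisfies every threshold that is set."""

    def within(value, low, high):
        # Assumption: bounds are inclusive; None disables that bound.
        return (low is None or value >= low) and (high is None or value <= high)

    return (within(n_counts, min_counts, max_counts)
            and within(n_genes, min_genes, max_genes)
            and within(fraction_mito, min_fraction_mito, max_fraction_mito))


# Thresholds taken from the example values in the table.
thresholds = dict(min_counts=200, max_counts=5000000,
                  min_genes=200, max_genes=1500000,
                  min_fraction_mito=0, max_fraction_mito=0.2)

good_cell = passes_rna_filters(1200, 450, 0.05, **thresholds)
low_count_cell = passes_rna_filters(150, 90, 0.05, **thresholds)
high_mito_cell = passes_rna_filters(1200, 450, 0.35, **thresholds)
print(good_cell, low_count_cell, high_mito_cell)  # True False False
```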
CITE-seq filtering options
Name
Description
Attributes
--prot_min_counts
Minimum number of counts per cell.
integer, example: 3
--prot_max_counts
Maximum number of counts per cell.
integer, example: 5000000
--prot_min_proteins_per_cell
Minimum number of non-zero values per cell.
integer, example: 200
--prot_max_proteins_per_cell
Maximum number of non-zero values per cell.
integer, example: 100000000
--prot_min_cells_per_protein
Minimum number of non-zero values per protein.
integer, example: 3
Highly variable gene detection
Name
Description
Attributes
--filter_with_hvg_var_output
In which .var slot to store a boolean array corresponding to the highly variable genes.
string, default: "filter_with_hvg"
--filter_with_hvg_obs_batch_key
If specified, highly-variable genes are selected within each batch separately and merged. This simple process avoids the selection of batch-specific genes and acts as a lightweight batch correction method.
string, default: "sample_id"
Mitochondrial gene detection
Name
Description
Attributes
--var_name_mitochondrial_genes
In which .var slot to store a boolean array corresponding to the mitochondrial genes.
string
--obs_name_mitochondrial_fraction
When specified, write the fraction of counts originating from mitochondrial genes (based on --mitochondrial_gene_regex) to an .obs column with the specified name. Requires --var_name_mitochondrial_genes.
string
--var_gene_names
.var column name to be used to detect mitochondrial genes instead of .var_names (default if not set). Gene names matching the regex value from --mitochondrial_gene_regex are identified as mitochondrial genes.
string, example: "gene_symbol"
--mitochondrial_gene_regex
Regex string that identifies mitochondrial genes from --var_gene_names. By default, detects human and mouse mitochondrial genes from a gene symbol.
string, default: "^[mM][tT]-"
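To show what the default pattern matches, here is a minimal sketch using Python's standard `re` module. The gene symbols and counts are made-up illustration data, and the fraction is computed in the simplest possible way, which is not necessarily how the pipeline implements it.

```python
import re

# Default value of --mitochondrial_gene_regex from the table above.
MITO_REGEX = re.compile(r"^[mM][tT]-")

# Hypothetical gene symbols with counts for a single cell.
counts = {"MT-CO1": 30, "mt-Nd1": 10, "MTOR": 20, "ACTB": 140}

# Boolean flag per gene, analogous to the .var column named by
# --var_name_mitochondrial_genes.
is_mito = {gene: bool(MITO_REGEX.search(gene)) for gene in counts}

# Fraction of the cell's counts that come from flagged genes, analogous to
# the .obs column named by --obs_name_mitochondrial_fraction.
mito_total = sum(count for gene, count in counts.items() if is_mito[gene])
fraction_mito = mito_total / sum(counts.values())

print(is_mito)
print(fraction_mito)
```

The hyphen in the pattern matters: "MTOR" starts with "MT" but is not flagged, whereas both the human-style "MT-CO1" and the mouse-style "mt-Nd1" are.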
QC metrics calculation options
Name
Description
Attributes
--var_qc_metrics
Keys to select a boolean (containing only True or False) column from .var. For each cell, calculate the proportion of total values for genes which are labeled 'True', compared to the total sum of the values for all genes. Defaults to the combined values specified for --var_name_mitochondrial_genes and --filter_with_hvg_var_output.
List of string, example: "ercc,highly_variable", multiple_sep: ","
--top_n_vars
Number of top vars to be used to calculate cumulative proportions. If not specified, proportions are not calculated. For example, --top_n_vars 20,50 computes the cumulative proportion of counts captured by the 20 and by the 50 most expressed vars.
List of integer, default: 50, 100, 200, 500, multiple_sep: ","
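The two metrics described above can be sketched for a single cell as follows. The function names and the toy data are hypothetical, and the results are expressed as fractions between 0 and 1; whether the pipeline stores fractions or percentages is not stated here.

```python
def cumulative_top_n_proportion(cell_counts, top_n):
    """Share of a cell's total counts captured by its top_n highest vars."""
    total = sum(cell_counts)
    if total == 0:
        return 0.0
    ranked = sorted(cell_counts, reverse=True)
    return sum(ranked[:top_n]) / total


def flagged_proportion(cell_counts, flags):
    """Share of a cell's total counts coming from vars flagged True,
    as for a boolean .var column passed to --var_qc_metrics."""
    total = sum(cell_counts)
    if total == 0:
        return 0.0
    return sum(c for c, flag in zip(cell_counts, flags) if flag) / total


# Toy cell with five vars and a hypothetical boolean .var column.
cell = [50, 30, 10, 5, 5]
flags = [True, False, False, True, False]

top2 = cumulative_top_n_proportion(cell, 2)  # (50 + 30) / 100
flagged = flagged_proportion(cell, flags)    # (50 + 5) / 100
print(top2, flagged)
```

A high cumulative proportion for a small `top_n` indicates that a handful of vars dominate the cell's counts.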