Qc
A pipeline to add basic qc statistics to a MuData
Info
ID: qc
Namespace: qc
Links
Example commands
You can run the pipeline using nextflow run
.
View help
You can use --help
as a parameter to get an overview of the possible parameters.
nextflow run openpipelines-bio/openpipeline \
-r 0.12.0 -latest \
-main-script ./workflows/qc/qc/main.nf \
--help
Run command
Example of params.yaml
# Inputs
id: # please fill in - example: "foo"
input: # please fill in - example: "input.h5mu"
modality: "rna"
# layer: "raw_counts"
# Outputs
# output: "$id.$key.output.h5mu"
# Mitochondrial Gene Detection
# var_name_mitochondrial_genes: "foo"
# obs_name_mitochondrial_fraction: "foo"
# var_gene_names: "gene_symbol"
mitochondrial_gene_regex: "^[mM][tT]-"
# QC metrics calculation options
# var_qc_metrics: ["ercc", "highly_variable"]
top_n_vars: [50, 100, 200, 500]
# Nextflow input-output arguments
publish_dir: # please fill in - example: "output/"
# param_list: "my_params.yaml"
nextflow run openpipelines-bio/openpipeline \
-r 0.12.0 -latest \
-profile docker \
-main-script ./workflows/qc/qc/main.nf \
-params-file params.yaml
Note
Replace -profile docker
with -profile podman
or -profile singularity
depending on the desired backend.
Argument groups
Inputs
Name | Description | Attributes |
---|---|---|
--id |
ID of the sample. | string , required, example: "foo" |
--input |
Path to the sample. | file , required, example: "input.h5mu" |
--modality |
Which modality to process. | string , default: "rna" |
--layer |
Layer to calculate qc metrics for. | string , example: "raw_counts" |
Mitochondrial Gene Detection
Name | Description | Attributes |
---|---|---|
--var_name_mitochondrial_genes |
In which .var slot to store a boolean array corresponding the mitochondrial genes. | string |
--obs_name_mitochondrial_fraction |
.Obs slot to store the fraction of reads found to be mitochondrial. Defaults to ‘fraction_’ suffixed by the value of –var_name_mitochondrial_genes | string |
--var_gene_names |
.var column name to be used to detect mitochondrial genes instead of .var_names (default if not set). Gene names matching with the regex value from –mitochondrial_gene_regex will be identified as a mitochondrial gene. | string , example: "gene_symbol" |
--mitochondrial_gene_regex |
Regex string that identifies mitochondrial genes from –var_gene_names. By default will detect human and mouse mitochondrial genes from a gene symbol. | string , default: "^[mM][tT]-" |
QC metrics calculation options
Name | Description | Attributes |
---|---|---|
--var_qc_metrics |
Keys to select a boolean (containing only True or False) column from .var. For each cell, calculate the proportion of total values for genes which are labeled ‘True’, compared to the total sum of the values for all genes. Defaults to the value from –var_name_mitochondrial_genes. | List of string , example: "ercc,highly_variable" , multiple_sep: "," |
--top_n_vars |
Number of top vars to be used to calculate cumulative proportions. If not specified, proportions are not calculated. --top_n_vars 20,50 finds cumulative proportion to the 20th and 50th most expressed vars. |
List of integer , default: 50, 100, 200, 500 , multiple_sep: "," |
Outputs
Name | Description | Attributes |
---|---|---|
--output |
Destination path to the output. | file , required, example: "output.h5mu" |
Visualisation
flowchart LR v0(Input) v2(toSortedList) v4(flatMap) v7(filter) v13(grep_annotation_column) v15(join) v19(mix) v18(filter) v25(calculate_qc_metrics) v27(join) v35(publish) v37(join) v41(toSortedList) v43(Output) v18-->v19 v0-->v2 v2-->v4 v4-->v7 v4-->v18 v7-->v15 v7-->v13 v13-->v15 v15-->v19 v19-->v27 v19-->v25 v25-->v27 v27-->v37 v27-->v35 v35-->v37 v37-->v41 v41-->v43