Rna multisample

Processing unimodal multi-sample RNA transcriptomics data.

Info

ID: rna_multisample
Namespace: workflows/rna

Links

Source

Example commands

You can run the pipeline using nextflow run.

View help

You can use --help as a parameter to get an overview of the possible parameters.

nextflow run openpipelines-bio/openpipeline \
  -r 2.1.1 -latest \
  -main-script target/nextflow/workflows/rna/rna_multisample/main.nf \
  --help

Run command

Example of params.yaml

# Inputs
id: # please fill in - example: "concatenated"
input: # please fill in - example: "dataset.h5mu"
modality: "rna"
# layer: "foo"

# Output
# output: "$id.$key.output.h5mu"

# Filtering highly variable features
highly_variable_features_var_output: "filter_with_hvg"
highly_variable_features_obs_batch_key: "sample_id"
highly_variable_features_flavor: "seurat"
# highly_variable_features_n_top_features: 123

# QC metrics calculation options
var_qc_metrics: ["filter_with_hvg"]
top_n_vars: [50, 100, 200, 500]
output_obs_num_nonzero_vars: "num_nonzero_vars"
output_obs_total_counts_vars: "total_counts"
output_var_num_nonzero_obs: "num_nonzero_obs"
output_var_total_counts_obs: "total_counts"
output_var_obs_mean: "obs_mean"
output_var_pct_dropout: "pct_dropout"

# RNA Scaling options
enable_scaling: false
scaling_output_layer: "scaled"
# scaling_max_value: 123.0
scaling_zero_center: true

# Nextflow input-output arguments
publish_dir: # please fill in - example: "output/"
# param_list: "my_params.yaml"

# Arguments

nextflow run openpipelines-bio/openpipeline \
  -r 2.1.1 -latest \
  -profile docker \
  -main-script target/nextflow/workflows/rna/rna_multisample/main.nf \
  -params-file params.yaml

Note

Replace -profile docker with -profile podman or -profile singularity depending on the desired backend.

Argument groups

Inputs

Name	Description	Attributes
`--id`	ID of the concatenated file	`string`, required, example: `"concatenated"`
`--input`	Path to the samples.	`file`, required, example: `"dataset.h5mu"`
`--modality`	Modality to process.	`string`, default: `"rna"`
`--layer`	Input layer to use. If not specified, .X is used.	`string`

Output

Name	Description	Attributes
`--output`	Destination path to the output.	`file`, required, example: `"output.h5mu"`

Filtering highly variable features

Name	Description	Attributes
`--highly_variable_features_var_output`	In which .var slot to store a boolean array corresponding to the highly variable features.	`string`, default: `"filter_with_hvg"`
`--highly_variable_features_obs_batch_key`	If specified, highly-variable features are selected within each batch separately and merged. This simple process avoids the selection of batch-specific features and acts as a lightweight batch correction method. For all flavors, featues are first sorted by how many batches they are highly variable. For dispersion-based flavors ties are broken by normalized dispersion. If flavor = ‘seurat_v3’, ties are broken by the median (across batches) rank based on within-batch normalized variance.	`string`, default: `"sample_id"`
`--highly_variable_features_flavor`	Choose the flavor for identifying highly variable features. For the dispersion based methods in their default workflows, Seurat passes the cutoffs whereas Cell Ranger passes n_top_features.	`string`, default: `"seurat"`
`--highly_variable_features_n_top_features`	Number of highly-variable features to keep. Mandatory if filter_with_hvg_flavor is set to ‘seurat_v3’.	`integer`

QC metrics calculation options

Name	Description	Attributes
`--var_qc_metrics`	Keys to select a boolean (containing only True or False) column from .var. For each cell, calculate the proportion of total values for genes which are labeled ‘True’, compared to the total sum of the values for all genes.	List of `string`, default: `"filter_with_hvg"`, example: `"ercc,highly_variable"`, multiple_sep: `","`
`--top_n_vars`	Number of top vars to be used to calculate cumulative proportions. If not specified, proportions are not calculated. `--top_n_vars 20,50` finds cumulative proportion to the 20th and 50th most expressed vars.	List of `integer`, default: `50, 100, 200, 500`, multiple_sep: `","`
`--output_obs_num_nonzero_vars`	Name of column in .obs describing, for each observation, the number of stored values (including explicit zeroes). In other words, the name of the column that counts for each row the number of columns that contain data.	`string`, default: `"num_nonzero_vars"`
`--output_obs_total_counts_vars`	Name of the column for .obs describing, for each observation (row), the sum of the stored values in the columns.	`string`, default: `"total_counts"`
`--output_var_num_nonzero_obs`	Name of column describing, for each feature, the number of stored values (including explicit zeroes). In other words, the name of the column that counts for each column the number of rows that contain data.	`string`, default: `"num_nonzero_obs"`
`--output_var_total_counts_obs`	Name of the column in .var describing, for each feature (column), the sum of the stored values in the rows.	`string`, default: `"total_counts"`
`--output_var_obs_mean`	Name of the column in .obs providing the mean of the values in each row.	`string`, default: `"obs_mean"`
`--output_var_pct_dropout`	Name of the column in .obs providing for each feature the percentage of observations the feature does not appear on (i.e. is missing). Same as `--num_nonzero_obs` but percentage based.	`string`, default: `"pct_dropout"`

RNA Scaling options

Options for enabling scaling of the log-normalized data to unit variance and zero mean. The scaled data will be output a different layer and representation with reduced dimensions will be created and stored in addition to the non-scaled data.

Name	Description	Attributes
`--enable_scaling`	Enable scaling for the RNA modality.	`boolean_true`
`--scaling_output_layer`	Output layer where the scaled log-normalized data will be stored.	`string`, default: `"scaled"`
`--scaling_max_value`	Clip (truncate) data to this value after scaling. If not specified, do not clip.	`double`
`--scaling_zero_center`	If set, omit zero-centering variables, which allows to handle sparse input efficiently.”	`boolean_false`

Authors

Dries De Maeyer (author)
Robrecht Cannoodt (author, maintainer)
Dries Schaumont (author)

Visualisation

flowchart TB
    v0(Channel.fromList)
    v2(filter)
    v10(filter)
    v18(normalize_total)
    v25(cross)
    v35(cross)
    v41(filter)
    v49(log1p)
    v56(cross)
    v66(cross)
    v72(filter)
    v80(delete_layer)
    v87(cross)
    v97(cross)
    v106(branch)
    v133(concat)
    v111(scale)
    v118(cross)
    v128(cross)
    v134(filter)
    v142(highly_variable_features_scanpy)
    v149(cross)
    v159(cross)
    v165(filter)
    v288(concat)
    v177(branch)
    v204(concat)
    v189(cross)
    v199(cross)
    v208(branch)
    v235(concat)
    v220(cross)
    v230(cross)
    v236(filter)
    v266(concat)
    v251(cross)
    v261(cross)
    v273(cross)
    v283(cross)
    v295(cross)
    v302(cross)
    v314(cross)
    v321(cross)
    v325(Output)
    subgraph group_rna_qc [rna_qc]
        v182(grep_mitochondrial_genes)
        v213(grep_ribosomal_genes)
        v244(calculate_qc_metrics)
    end
    v106-->v133
    v133-->v134
    v177-->v204
    v208-->v235
    v235-->v236
    v0-->v2
    v2-->v10
    v10-->v18
    v18-->v25
    v10-->v25
    v10-->v35
    v41-->v49
    v49-->v56
    v41-->v56
    v41-->v66
    v72-->v80
    v80-->v87
    v72-->v87
    v72-->v97
    v106-->v111
    v111-->v118
    v106-->v118
    v106-->v128
    v128-->v133
    v134-->v142
    v142-->v149
    v134-->v149
    v134-->v159
    v177-->v182
    v182-->v189
    v177-->v189
    v177-->v199
    v199-->v204
    v208-->v213
    v213-->v220
    v208-->v220
    v208-->v230
    v230-->v235
    v236-->v244
    v244-->v251
    v236-->v251
    v236-->v261
    v261-->v266
    v266-->v273
    v165-->v273
    v165-->v283
    v283-->v288
    v288-->v295
    v2-->v295
    v295-->v302
    v2-->v302
    v2-->v314
    v314-->v321
    v2-->v321
    v321-->v325
    v35-->v41
    v18-->v35
    v66-->v72
    v49-->v66
    v80-->v97
    v97-->v106
    v111-->v128
    v159-->v165
    v142-->v159
    v165-->v177
    v182-->v199
    v204-->v208
    v213-->v230
    v244-->v261
    v266-->v283
    v288-->v314
    style group_rna_qc fill:#F0F0F0,stroke:#969696;
    style v0 fill:#e3dcea,stroke:#7a4baa;
    style v2 fill:#e3dcea,stroke:#7a4baa;
    style v10 fill:#e3dcea,stroke:#7a4baa;
    style v18 fill:#e3dcea,stroke:#7a4baa;
    style v25 fill:#e3dcea,stroke:#7a4baa;
    style v35 fill:#e3dcea,stroke:#7a4baa;
    style v41 fill:#e3dcea,stroke:#7a4baa;
    style v49 fill:#e3dcea,stroke:#7a4baa;
    style v56 fill:#e3dcea,stroke:#7a4baa;
    style v66 fill:#e3dcea,stroke:#7a4baa;
    style v72 fill:#e3dcea,stroke:#7a4baa;
    style v80 fill:#e3dcea,stroke:#7a4baa;
    style v87 fill:#e3dcea,stroke:#7a4baa;
    style v97 fill:#e3dcea,stroke:#7a4baa;
    style v106 fill:#e3dcea,stroke:#7a4baa;
    style v133 fill:#e3dcea,stroke:#7a4baa;
    style v111 fill:#e3dcea,stroke:#7a4baa;
    style v118 fill:#e3dcea,stroke:#7a4baa;
    style v128 fill:#e3dcea,stroke:#7a4baa;
    style v134 fill:#e3dcea,stroke:#7a4baa;
    style v142 fill:#e3dcea,stroke:#7a4baa;
    style v149 fill:#e3dcea,stroke:#7a4baa;
    style v159 fill:#e3dcea,stroke:#7a4baa;
    style v165 fill:#e3dcea,stroke:#7a4baa;
    style v288 fill:#e3dcea,stroke:#7a4baa;
    style v177 fill:#e3dcea,stroke:#7a4baa;
    style v204 fill:#e3dcea,stroke:#7a4baa;
    style v182 fill:#e3dcea,stroke:#7a4baa;
    style v189 fill:#e3dcea,stroke:#7a4baa;
    style v199 fill:#e3dcea,stroke:#7a4baa;
    style v208 fill:#e3dcea,stroke:#7a4baa;
    style v235 fill:#e3dcea,stroke:#7a4baa;
    style v213 fill:#e3dcea,stroke:#7a4baa;
    style v220 fill:#e3dcea,stroke:#7a4baa;
    style v230 fill:#e3dcea,stroke:#7a4baa;
    style v236 fill:#e3dcea,stroke:#7a4baa;
    style v266 fill:#e3dcea,stroke:#7a4baa;
    style v244 fill:#e3dcea,stroke:#7a4baa;
    style v251 fill:#e3dcea,stroke:#7a4baa;
    style v261 fill:#e3dcea,stroke:#7a4baa;
    style v273 fill:#e3dcea,stroke:#7a4baa;
    style v283 fill:#e3dcea,stroke:#7a4baa;
    style v295 fill:#e3dcea,stroke:#7a4baa;
    style v302 fill:#e3dcea,stroke:#7a4baa;
    style v314 fill:#e3dcea,stroke:#7a4baa;
    style v321 fill:#e3dcea,stroke:#7a4baa;
    style v325 fill:#e3dcea,stroke:#7a4baa;