flowchart TB v55(flatMap) v370(mix) v372(mix) v388(merge) v404(branch) v478(concat) v482(branch) v556(concat) v565(publish) v593(Output) subgraph group_split_modalities_workflow [split_modalities_workflow] v0(Channel.fromList) v11(filter) v27(split_modalities_component) end subgraph group_rna_multisample [rna_multisample] v78(normalize_total) v98(log1p) v118(delete_layer) v138(highly_variable_features_scanpy) v150(filter) v168(grep_annotation_column) v181(mix) v190(calculate_qc_metrics) v210(publish) end subgraph group_prot_multisample [prot_multisample] v263(clr) v275(filter) v293(grep_annotation_column) v306(mix) v315(calculate_qc_metrics) v335(publish) end subgraph group_dimensionality_reduction_rna [dimensionality_reduction_rna] v417(pca) v437(find_neighbors) v457(umap) end subgraph group_dimensionality_reduction_prot [dimensionality_reduction_prot] v495(pca) v515(find_neighbors) v535(umap) end v370-->v372 v404-->v478 v482-->v556 v0-->v11 v11-->v27 v27-->v55 v55-->v78 v78-->v98 v98-->v118 v118-->v138 v138-->v150 v150-->v168 v168-->v181 v150-->v181 v181-->v190 v190-->v210 v210-->v370 v55-->v263 v263-->v275 v275-->v293 v293-->v306 v275-->v306 v306-->v315 v315-->v335 v335-->v370 v55-->v372 v372-->v388 v388-->v404 v404-->v417 v417-->v437 v437-->v457 v457-->v478 v478-->v482 v482-->v495 v495-->v515 v515-->v535 v535-->v556 v556-->v565 v565-->v593 style group_split_modalities_workflow fill:#F0F0F0,stroke:#969696; style group_rna_multisample fill:#F0F0F0,stroke:#969696; style group_prot_multisample fill:#F0F0F0,stroke:#969696; style group_dimensionality_reduction_rna fill:#F0F0F0,stroke:#969696; style group_dimensionality_reduction_prot fill:#F0F0F0,stroke:#969696; style v0 fill:#e3dcea,stroke:#7a4baa; style v11 fill:#e3dcea,stroke:#7a4baa; style v27 fill:#e3dcea,stroke:#7a4baa; style v55 fill:#e3dcea,stroke:#7a4baa; style v78 fill:#e3dcea,stroke:#7a4baa; style v98 fill:#e3dcea,stroke:#7a4baa; style v118 fill:#e3dcea,stroke:#7a4baa; style v138 fill:#e3dcea,stroke:#7a4baa; style v150 fill:#e3dcea,stroke:#7a4baa; style v168 fill:#e3dcea,stroke:#7a4baa; style v181 fill:#e3dcea,stroke:#7a4baa; style v190 fill:#e3dcea,stroke:#7a4baa; style v210 fill:#e3dcea,stroke:#7a4baa; style v370 fill:#e3dcea,stroke:#7a4baa; style v263 fill:#e3dcea,stroke:#7a4baa; style v275 fill:#e3dcea,stroke:#7a4baa; style v293 fill:#e3dcea,stroke:#7a4baa; style v306 fill:#e3dcea,stroke:#7a4baa; style v315 fill:#e3dcea,stroke:#7a4baa; style v335 fill:#e3dcea,stroke:#7a4baa; style v372 fill:#e3dcea,stroke:#7a4baa; style v388 fill:#e3dcea,stroke:#7a4baa; style v404 fill:#e3dcea,stroke:#7a4baa; style v478 fill:#e3dcea,stroke:#7a4baa; style v417 fill:#e3dcea,stroke:#7a4baa; style v437 fill:#e3dcea,stroke:#7a4baa; style v457 fill:#e3dcea,stroke:#7a4baa; style v482 fill:#e3dcea,stroke:#7a4baa; style v556 fill:#e3dcea,stroke:#7a4baa; style v495 fill:#e3dcea,stroke:#7a4baa; style v515 fill:#e3dcea,stroke:#7a4baa; style v535 fill:#e3dcea,stroke:#7a4baa; style v565 fill:#e3dcea,stroke:#7a4baa; style v593 fill:#e3dcea,stroke:#7a4baa;
Process batches
This workflow serves as an entrypoint into the ‘full_pipeline’ in order to re-run the multisample processing and the integration setup.
Info
ID: process_batches
Namespace: workflows/multiomics
Links
An input .h5mu file will first be split in order to run the multisample processing per modality. Next, the modalities are merged again and the integration setup pipeline is executed. Please note that this workflow assumes that samples from multiple pipelines are already concatenated.
Example commands
You can run the pipeline using nextflow run
.
View help
You can use --help
as a parameter to get an overview of the possible parameters.
nextflow run openpipelines-bio/openpipeline \
-r 1.0.1 -latest \
-main-script target/nextflow/workflows/multiomics/process_batches/main.nf \
--help
Run command
Example of params.yaml
# Inputs
id: # please fill in - example: "foo"
input: # please fill in - example: ["input.h5mu"]
# rna_layer: "foo"
# prot_layer: "foo"
# Outputs
# output: "$id.$key.output.h5mu"
# Highly variable features detection
highly_variable_features_var_output: "filter_with_hvg"
highly_variable_features_obs_batch_key: "sample_id"
# QC metrics calculation options
var_qc_metrics: ["filter_with_hvg"]
top_n_vars: [50, 100, 200, 500]
# PCA options
pca_overwrite: false
# Nextflow input-output arguments
publish_dir: # please fill in - example: "output/"
# param_list: "my_params.yaml"
nextflow run openpipelines-bio/openpipeline \
-r 1.0.1 -latest \
-profile docker \
-main-script target/nextflow/workflows/multiomics/process_batches/main.nf \
-params-file params.yaml
Note
Replace -profile docker
with -profile podman
or -profile singularity
depending on the desired backend.
Argument groups
Inputs
Name | Description | Attributes |
---|---|---|
--id |
ID of the sample. | string , required, example: "foo" |
--input |
Path to the sample. | List of file , required, example: "input.h5mu" , multiple_sep: ";" |
--rna_layer |
Input layer for the gene expression modality. If not specified, .X is used. | string |
--prot_layer |
Input layer for the antibody capture modality. If not specified, .X is used. | string |
Outputs
Name | Description | Attributes |
---|---|---|
--output |
Destination path to the output. | file , required, example: "output.h5mu" |
Highly variable features detection
Name | Description | Attributes |
---|---|---|
--highly_variable_features_var_output |
In which .var slot to store a boolean array corresponding to the highly variable genes. | string , default: "filter_with_hvg" |
--highly_variable_features_obs_batch_key |
If specified, highly-variable genes are selected within each batch separately and merged. This simple process avoids the selection of batch-specific genes and acts as a lightweight batch correction method. | string , default: "sample_id" |
QC metrics calculation options
Name | Description | Attributes |
---|---|---|
--var_qc_metrics |
Keys to select a boolean (containing only True or False) column from .var. For each cell, calculate the proportion of total values for genes which are labeled ‘True’, compared to the total sum of the values for all genes. | List of string , default: "filter_with_hvg" , example: "ercc,highly_variable" , multiple_sep: ";" |
--top_n_vars |
Number of top vars to be used to calculate cumulative proportions. If not specified, proportions are not calculated. --top_n_vars 20,50 finds cumulative proportion to the 20th and 50th most expressed vars. |
List of integer , default: 50, 100, 200, 500 , multiple_sep: ";" |
PCA options
Name | Description | Attributes |
---|---|---|
--pca_overwrite |
Allow overwriting slots for PCA output. | boolean_true |