Process batches
This workflow serves as an entrypoint into the ‘full_pipeline’ in order to re-run the multisample processing and the integration setup.
Info
ID: process_batches
Namespace: workflows/multiomics
An input .h5mu file will first be split in order to run the multisample processing per modality. Next, the modalities are merged again and the integration setup pipeline is executed. Please note that this workflow assumes that samples from multiple pipelines are already concatenated.
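The split, process, merge and integration-setup order described above can be sketched conceptually as follows. This is only an illustration, not the workflow's implementation: plain dictionaries stand in for the .h5mu file, and `split_modalities`, `merge_modalities`, `process_batches` and `tag` are hypothetical names invented for this sketch.

```python
from typing import Callable, Dict, List

# Hypothetical stand-in for an .h5mu file: modality name -> list of per-cell records.
Cells = List[dict]
MultiModal = Dict[str, Cells]


def split_modalities(data: MultiModal) -> Dict[str, MultiModal]:
    """Split a multimodal container into one single-modality container per modality."""
    return {name: {name: cells} for name, cells in data.items()}


def merge_modalities(parts: Dict[str, MultiModal]) -> MultiModal:
    """Merge single-modality containers back into one multimodal container."""
    merged: MultiModal = {}
    for part in parts.values():
        merged.update(part)
    return merged


def process_batches(
    data: MultiModal,
    multisample_steps: Dict[str, Callable[[Cells], Cells]],
    integration_setup: Callable[[MultiModal], MultiModal],
) -> MultiModal:
    """Split, process each modality, merge, then run the integration setup."""
    processed = {}
    for name, part in split_modalities(data).items():
        # Modalities without a registered step pass through unchanged.
        step = multisample_steps.get(name, lambda cells: cells)
        processed[name] = {name: step(part[name])}
    return integration_setup(merge_modalities(processed))


def tag(label: str) -> Callable[[Cells], Cells]:
    """Placeholder step that only records which processing ran on each cell."""
    return lambda cells: [dict(cell, processed_by=label) for cell in cells]


example = {"rna": [{"cell": "c1"}], "prot": [{"cell": "c1"}]}
result = process_batches(
    example,
    multisample_steps={"rna": tag("rna_multisample"), "prot": tag("prot_multisample")},
    integration_setup=lambda merged: merged,  # placeholder: no-op
)
print(sorted(result))
```

The point of the sketch is the ordering: each modality is handled on its own, and the integration setup only sees the merged result.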
Example commands
You can run the pipeline using nextflow run.
View help
You can use --help as a parameter to get an overview of the possible parameters.
nextflow run openpipelines-bio/openpipeline \
-r 1.0.1 -latest \
-main-script target/nextflow/workflows/multiomics/process_batches/main.nf \
--help
Run command
Example of params.yaml
# Inputs
id: # please fill in - example: "foo"
input: # please fill in - example: ["input.h5mu"]
# rna_layer: "foo"
# prot_layer: "foo"
# Outputs
# output: "$id.$key.output.h5mu"
# Highly variable features detection
highly_variable_features_var_output: "filter_with_hvg"
highly_variable_features_obs_batch_key: "sample_id"
# QC metrics calculation options
var_qc_metrics: ["filter_with_hvg"]
top_n_vars: [50, 100, 200, 500]
# PCA options
pca_overwrite: false
# Nextflow input-output arguments
publish_dir: # please fill in - example: "output/"
# param_list: "my_params.yaml"
nextflow run openpipelines-bio/openpipeline \
-r 1.0.1 -latest \
-profile docker \
-main-script target/nextflow/workflows/multiomics/process_batches/main.nf \
-params-file params.yaml
Note
Replace -profile docker with -profile podman or -profile singularity depending on the desired backend.
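If the parameters are produced by a script rather than written by hand, the same settings can be generated programmatically. The sketch below writes them as JSON, which Nextflow's -params-file option accepts alongside YAML. The helper names `build_params` and `write_params` and the file name `params.json` are assumptions made for this example; the values "foo", "input.h5mu" and "output/" are the placeholders from the example above.

```python
import json
from pathlib import Path


def build_params(sample_id, inputs, publish_dir):
    """Collect the settings from the params.yaml example into a dictionary."""
    return {
        "id": sample_id,
        "input": list(inputs),
        "highly_variable_features_var_output": "filter_with_hvg",
        "highly_variable_features_obs_batch_key": "sample_id",
        "var_qc_metrics": ["filter_with_hvg"],
        "top_n_vars": [50, 100, 200, 500],
        "pca_overwrite": False,
        "publish_dir": publish_dir,
    }


def write_params(path, params):
    """Write the parameters as JSON and return the path that was written."""
    path = Path(path)
    path.write_text(json.dumps(params, indent=2) + "\n", encoding="utf-8")
    return path


# Placeholder values taken from the example; replace them with real ones.
params = build_params("foo", ["input.h5mu"], "output/")

if __name__ == "__main__":
    print(write_params("params.json", params))
```

The resulting file would then be passed as -params-file params.json in place of params.yaml.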
Argument groups
Inputs
Name | Description | Attributes |
---|---|---|
--id | ID of the sample. | string, required, example: "foo" |
--input | Path to the sample. | List of file, required, example: "input.h5mu", multiple_sep: ";" |
--rna_layer | Input layer for the gene expression modality. If not specified, .X is used. | string |
--prot_layer | Input layer for the antibody capture modality. If not specified, .X is used. | string |
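The multiple_sep: ";" attribute suggests that several values for one argument are joined with a semicolon when given on the command line. Under that assumption, the sketch below builds such an argument and quotes it, because an unquoted ; would end the command in most shells. The helper `multi_value_arg` and the file names are hypothetical.

```python
import shlex


def multi_value_arg(flag, values, sep=";"):
    """Join values with the separator and shell-quote the result for one flag."""
    joined = sep.join(str(value) for value in values)
    return f"{flag} {shlex.quote(joined)}"


# Hypothetical file names, used only for illustration.
arg = multi_value_arg("--input", ["a.h5mu", "b.h5mu"])
print(arg)
```

When the values are supplied through a params file instead, a regular list (as in the params.yaml example) avoids the quoting issue altogether.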
Outputs
Name | Description | Attributes |
---|---|---|
--output | Destination path to the output. | file, required, example: "output.h5mu" |
Highly variable features detection
Name | Description | Attributes |
---|---|---|
--highly_variable_features_var_output | In which .var slot to store a boolean array corresponding to the highly variable genes. | string, default: "filter_with_hvg" |
--highly_variable_features_obs_batch_key | If specified, highly-variable genes are selected within each batch separately and merged. This simple process avoids the selection of batch-specific genes and acts as a lightweight batch correction method. | string, default: "sample_id" |
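To illustrate why selecting within each batch avoids batch-specific genes, here is a deliberately simplified sketch. It is not the workflow's actual algorithm: plain variance stands in for whatever dispersion measure is really used, the merge step simply favours genes picked in the most batches, and the function `select_hvg_per_batch` and the toy data are invented for this example.

```python
from collections import Counter, defaultdict
from statistics import pvariance


def select_hvg_per_batch(matrix, genes, batches, n_top):
    """Return one boolean per gene: True if it is kept after the per-batch merge."""
    # Group the cell rows by their batch label.
    rows_by_batch = defaultdict(list)
    for row, batch in zip(matrix, batches):
        rows_by_batch[batch].append(row)

    # Within each batch, vote for the n_top genes with the highest variance.
    votes = Counter()
    for rows in rows_by_batch.values():
        variances = [pvariance([row[j] for row in rows]) for j in range(len(genes))]
        ranked = sorted(range(len(genes)), key=lambda j: variances[j], reverse=True)
        votes.update(genes[j] for j in ranked[:n_top])

    # Merge: genes chosen in more batches come first; ties keep the input order.
    order = {gene: index for index, gene in enumerate(genes)}
    merged = sorted(votes, key=lambda gene: (-votes[gene], order[gene]))[:n_top]
    selected = set(merged)
    return [gene in selected for gene in genes]


genes = ["within", "batch_shift", "flat"]
matrix = [
    [0, 0, 1], [5, 0, 1], [10, 0, 1],    # cells from batch s1
    [1, 20, 1], [6, 20, 1], [11, 20, 1],  # cells from batch s2
]
batches = ["s1", "s1", "s1", "s2", "s2", "s2"]

per_batch_mask = select_hvg_per_batch(matrix, genes, batches, n_top=1)
# Treating all cells as a single batch mimics selection without a batch key.
global_mask = select_hvg_per_batch(matrix, genes, ["all"] * len(matrix), n_top=1)
print(per_batch_mask, global_mask)
```

In this toy data the gene "batch_shift" is constant inside each batch and differs only between batches, so it wins when all cells are pooled but is never selected when the batches are evaluated separately.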
QC metrics calculation options
Name | Description | Attributes |
---|---|---|
--var_qc_metrics | Keys to select a boolean (containing only True or False) column from .var. For each cell, calculate the proportion of total values for genes which are labeled ‘True’, compared to the total sum of the values for all genes. | List of string, default: "filter_with_hvg", example: "ercc,highly_variable", multiple_sep: ";" |
--top_n_vars | Number of top vars to be used to calculate cumulative proportions. If not specified, proportions are not calculated. --top_n_vars 20,50 finds cumulative proportion to the 20th and 50th most expressed vars. | List of integer, default: 50, 100, 200, 500, multiple_sep: ";" |
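The two metrics described in this table can be illustrated for a single cell as follows. This is a minimal sketch of the arithmetic only: the function names are hypothetical, the results are plain fractions, and the scale and column names the workflow actually writes are not shown here.

```python
def proportion_in_flagged(cell_values, flags):
    """Share of a cell's total that comes from vars flagged True."""
    total = sum(cell_values)
    if total == 0:
        return 0.0
    flagged = sum(value for value, flag in zip(cell_values, flags) if flag)
    return flagged / total


def top_n_cumulative_proportion(cell_values, n):
    """Share of a cell's total that comes from its n most expressed vars."""
    total = sum(cell_values)
    if total == 0:
        return 0.0
    return sum(sorted(cell_values, reverse=True)[:n]) / total


# Toy values for one cell across six vars; they sum to 100.
cell = [10, 0, 5, 5, 30, 50]
# Toy boolean .var column, for example the one named by --var_qc_metrics.
flags = [True, False, False, True, False, False]

flagged_share = proportion_in_flagged(cell, flags)   # (10 + 5) / 100
top2_share = top_n_cumulative_proportion(cell, 2)    # (50 + 30) / 100
top3_share = top_n_cumulative_proportion(cell, 3)    # (50 + 30 + 10) / 100
print(flagged_share, top2_share, top3_share)
```

Each entry in --top_n_vars would correspond to one call of the second function per cell, and each key in --var_qc_metrics to one call of the first.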
PCA options
Name | Description | Attributes |
---|---|---|
--pca_overwrite | Allow overwriting slots for PCA output. | boolean_true |