Multisample

This workflow serves as an entrypoint into the ‘full_pipeline’ in order to re-run the multisample processing and the integration setup.

Info

ID: multisample
Namespace: multiomics

An input .h5mu file will first be split in order to run the multisample processing per modality. Next, the modalities are merged again and the integration setup pipeline is executed. Please note that this workflow assumes that samples from multiple pipelines are already concatenated.

Example commands

You can run the pipeline using nextflow run.

View help

You can use --help as a parameter to get an overview of the possible parameters.

nextflow run openpipelines-bio/openpipeline \
  -r 0.12.0 -latest \
  -main-script ./workflows/multiomics/multisample/main.nf \
  --help

Run command

Example of params.yaml
# Inputs
id: # please fill in - example: "foo"
input: # please fill in - example: "input.h5mu"

# Outputs
# output: "$id.$key.output.h5mu"

# Highly variable gene detection
filter_with_hvg_var_output: "filter_with_hvg"
filter_with_hvg_obs_batch_key: "sample_id"

# QC metrics calculation options
var_qc_metrics: ["filter_with_hvg"]
top_n_vars: [50, 100, 200, 500]

# PCA options
pca_overwrite: false

# Nextflow input-output arguments
publish_dir: # please fill in - example: "output/"
# param_list: "my_params.yaml"
nextflow run openpipelines-bio/openpipeline \
  -r 0.12.0 -latest \
  -profile docker \
  -main-script ./workflows/multiomics/multisample/main.nf \
  -params-file params.yaml
Note

Replace -profile docker with -profile podman or -profile singularity depending on the desired backend.

Argument groups

Inputs

Name Description Attributes
--id ID of the sample. string, required, example: "foo"
--input Path to the sample. file, required, example: "input.h5mu"

Outputs

Name Description Attributes
--output Destination path to the output. file, required, example: "output.h5mu"

Highly variable gene detection

Name Description Attributes
--filter_with_hvg_var_output In which .var slot to store a boolean array corresponding to the highly variable genes. string, default: "filter_with_hvg"
--filter_with_hvg_obs_batch_key If specified, highly-variable genes are selected within each batch separately and merged. This simple process avoids the selection of batch-specific genes and acts as a lightweight batch correction method. string, default: "sample_id"

QC metrics calculation options

Name Description Attributes
--var_qc_metrics Keys to select a boolean (containing only True or False) column from .var. For each cell, calculate the proportion of total values for genes which are labeled ‘True’, compared to the total sum of the values for all genes. List of string, default: "filter_with_hvg", example: "ercc,highly_variable", multiple_sep: ","
--top_n_vars Number of top vars to be used to calculate cumulative proportions. If not specified, proportions are not calculated. --top_n_vars 20,50 finds cumulative proportion to the 20th and 50th most expressed vars. List of integer, default: 50, 100, 200, 500, multiple_sep: ","

PCA options

Name Description Attributes
--pca_overwrite Allow overwriting slots for PCA output. boolean_true

Authors

  • Dries Schaumont (author, maintainer)

Visualisation

flowchart LR
    v0(Input)
    v2(toSortedList)
    v4(flatMap)
    v7(filter)
    v12(split_modalities)
    v14(join)
    v21(concat)
    v17(filter)
    v19(test_wf:run_wf:split_modalities_workflow:splitStub)
    v22(flatMap)
    v24(filter)
    v27(toSortedList)
    v29(flatMap)
    v31(toSortedList)
    v33(Output)
    v39(normalize_total)
    v41(join)
    v50(log1p)
    v52(join)
    v61(delete_layer)
    v63(join)
    v72(filter_with_hvg)
    v74(join)
    v83(rna_calculate_qc_metrics)
    v85(join)
    v149(concat)
    v90(filter)
    v93(toSortedList)
    v95(flatMap)
    v96(toSortedList)
    v98(Output)
    v104(clr)
    v106(join)
    v112(filter)
    v118(grep_annotation_column)
    v120(join)
    v124(mix)
    v123(filter)
    v130(calculate_qc_metrics)
    v132(join)
    v140(publish)
    v142(join)
    v146(filter)
    v150(groupTuple)
    v156(merge)
    v158(join)
    v162(filter)
    v166(toSortedList)
    v168(flatMap)
    v175(pca)
    v177(join)
    v186(find_neighbors)
    v188(join)
    v197(umap)
    v199(join)
    v205(concat)
    v204(filter)
    v206(filter)
    v210(toSortedList)
    v212(flatMap)
    v219(pca)
    v221(join)
    v230(find_neighbors)
    v232(join)
    v241(test_wf:run_wf:integration_setup_workflow:initialize_integration_prot:umap:umap_process1)
    v243(join)
    v249(concat)
    v248(filter)
    v257(test_wf:run_wf:publish:publish_process1)
    v259(join)
    v264(toSortedList)
    v266(Output)
    v21-->v22
    v95-->v96
    v123-->v124
    v149-->v150
    v204-->v205
    v205-->v206
    v205-->v248
    v248-->v249
    v0-->v2
    v2-->v4
    v4-->v7
    v4-->v17
    v7-->v14
    v7-->v12
    v12-->v14
    v14-->v21
    v17-->v19
    v19-->v21
    v22-->v24
    v22-->v90
    v22-->v146
    v24-->v27
    v27-->v29
    v29-->v31
    v31-->v33
    v29-->v41
    v29-->v39
    v39-->v41
    v41-->v52
    v41-->v50
    v50-->v52
    v52-->v63
    v52-->v61
    v61-->v63
    v63-->v74
    v63-->v72
    v72-->v74
    v74-->v85
    v74-->v83
    v83-->v85
    v85-->v149
    v90-->v93
    v93-->v95
    v96-->v98
    v95-->v106
    v95-->v104
    v104-->v106
    v106-->v112
    v106-->v123
    v112-->v120
    v112-->v118
    v118-->v120
    v120-->v124
    v124-->v132
    v124-->v130
    v130-->v132
    v132-->v142
    v132-->v140
    v140-->v142
    v142-->v149
    v146-->v149
    v150-->v158
    v150-->v156
    v156-->v158
    v158-->v162
    v158-->v204
    v162-->v166
    v166-->v168
    v168-->v177
    v168-->v175
    v175-->v177
    v177-->v188
    v177-->v186
    v186-->v188
    v188-->v199
    v188-->v197
    v197-->v199
    v199-->v205
    v206-->v210
    v210-->v212
    v212-->v221
    v212-->v219
    v219-->v221
    v221-->v232
    v221-->v230
    v230-->v232
    v232-->v243
    v232-->v241
    v241-->v243
    v243-->v249
    v249-->v259
    v249-->v257
    v257-->v259
    v259-->v264
    v264-->v266