Initialize integration
Run the calculations whose outputs are required by most integration methods: PCA, nearest-neighbour search and UMAP.
Info
ID: initialize_integration
Namespace: integration/initialize_integration
Example commands
You can run the pipeline using nextflow run.
View help
You can use --help as a parameter to get an overview of the possible parameters.
nextflow run openpipelines-bio/openpipeline \
-r 0.9.0 -latest \
-main-script workflows/multiomics/integration/initialize_integration/main.nf \
--help
Run command
Example of params.yaml
# Inputs
id: # please fill in - example: "foo"
input: # please fill in - example: "dataset.h5mu"
layer: "log_normalized"
modality: "rna"
# Outputs
# output: "$id.$key.output.h5mu"
# PCA options
obsm_pca: "X_pca"
# var_pca_feature_selection: "foo"
# Neighbour calculation
uns_neighbors: "neighbors"
obsp_neighbor_distances: "distances"
obsp_neighbor_connectivities: "connectivities"
# Umap options
obsm_umap: "X_umap"
# Nextflow input-output arguments
publish_dir: # please fill in - example: "output/"
# param_list: "my_params.yaml"
nextflow run openpipelines-bio/openpipeline \
-r 0.9.0 -latest \
-profile docker \
-main-script workflows/multiomics/integration/initialize_integration/main.nf \
-params-file params.yaml
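Every entry in the example above other than the ones marked "please fill in" is either commented out or set to its documented default, so a params file can probably be reduced to those three entries. A minimal sketch, assuming the pipeline falls back to the documented defaults for anything omitted, and reusing the placeholder values from the example:

```shell
#!/bin/sh
set -eu

# Write a minimal params.yaml containing only the "please fill in" entries.
# The values are the placeholder examples from this page, not real data.
cat > params.yaml <<'EOF'
id: "foo"
input: "dataset.h5mu"
publish_dir: "output/"
EOF

# Show the result for review before passing it to -params-file.
cat params.yaml
```

Keeping the file short makes it easier to see which values were chosen deliberately and which are defaults.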
Note
Replace -profile docker with -profile podman or -profile singularity depending on the desired backend.
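One way to avoid editing the command by hand when switching backends is to read the profile from a variable. The sketch below is only an illustration: `BACKEND` and `build_command` are hypothetical names introduced here, and the command is printed rather than executed so that it can be reviewed first.

```shell
#!/bin/sh
set -eu

# build_command is a hypothetical helper: it prints the run command
# for one of the three backends named in the note above.
build_command() {
  backend="$1"
  case "$backend" in
    docker|podman|singularity) ;;
    *) echo "unsupported backend: $backend" >&2; return 1 ;;
  esac
  printf 'nextflow run openpipelines-bio/openpipeline -r 0.9.0 -latest -profile %s -main-script %s -params-file params.yaml\n' \
    "$backend" \
    "workflows/multiomics/integration/initialize_integration/main.nf"
}

# BACKEND is an assumed environment variable; docker is used when it is unset.
build_command "${BACKEND:-docker}"
```

For example, `BACKEND=singularity sh run.sh | sh` would print the command and then run it, where `run.sh` is whatever name the script was saved under.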
Argument groups
Inputs
| Name | Description | Attributes |
|---|---|---|
| --id | ID of the sample. | string, required, example: "foo" |
| --input | Path to the sample. | file, required, example: "dataset.h5mu" |
| --layer | Use the specified layer for expression values instead of the .X object from the modality. | string, default: "log_normalized" |
| --modality | Which modality to process. | string, default: "rna" |
Outputs
| Name | Description | Attributes |
|---|---|---|
| --output | Destination path to the output. | file, required, example: "output.h5mu" |
PCA options
| Name | Description | Attributes |
|---|---|---|
| --obsm_pca | In which .obsm slot to store the resulting PCA embedding. | string, default: "X_pca" |
| --var_pca_feature_selection | Column name in .var matrix that will be used to select which genes to run the PCA on. | string |
Neighbour calculation
| Name | Description | Attributes |
|---|---|---|
| --uns_neighbors | In which .uns slot to store various neighbor output objects. | string, default: "neighbors" |
| --obsp_neighbor_distances | In which .obsp slot to store the distance matrix between the resulting neighbors. | string, default: "distances" |
| --obsp_neighbor_connectivities | In which .obsp slot to store the connectivities matrix between the resulting neighbors. | string, default: "connectivities" |
Umap options
| Name | Description | Attributes |
|---|---|---|
| --obsm_umap | In which .obsm slot to store the resulting UMAP embedding. | string, default: "X_umap" |
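The argument names in these tables are written with a double dash, which is how Nextflow accepts pipeline parameters on the command line; single-dash options such as `-profile` belong to Nextflow itself. The sketch below assumes that parameters passed this way behave the same as entries in `params.yaml`. The slot names `X_pca_init` and `X_umap_init` are made up for illustration; the other values are the placeholders used earlier on this page.

```shell
#!/bin/sh
set -eu

# Assemble the command as positional parameters so it can be inspected
# before running. X_pca_init and X_umap_init are hypothetical slot names.
set -- nextflow run openpipelines-bio/openpipeline \
  -r 0.9.0 -latest \
  -profile docker \
  -main-script workflows/multiomics/integration/initialize_integration/main.nf \
  --id foo \
  --input dataset.h5mu \
  --obsm_pca X_pca_init \
  --obsm_umap X_umap_init \
  --publish_dir output/

# Print the assembled command; replace this line with "$@" to execute it.
echo "$@"
```

Writing the embeddings to non-default slots like this may be useful when an input file already contains `X_pca` or `X_umap` entries that should be kept.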
Visualisation
flowchart LR
p0(Input)
p2(toSortedList)
p4(flatMap)
p12(pca)
p22(find_neighbors)
p32(umap)
p40(Output)
p0-->p2
p2-->p4
p4-->p12
p12-->p22
p22-->p32
p32-->p40