Initialize integration
Run calculations that output information required for most integration methods: PCA, nearest neighbour and UMAP.
Info
ID: initialize_integration
Namespace: multiomics/integration
Links
Example commands
You can run the pipeline using nextflow run
.
View help
You can use --help
as a parameter to get an overview of the possible parameters.
nextflow run openpipelines-bio/openpipeline \
-r 0.12.0 -latest \
-main-script ./workflows/multiomics/integration/initialize_integration/main.nf \
--help
Run command
Example of params.yaml
# Inputs
id: # please fill in - example: "foo"
input: # please fill in - example: "dataset.h5mu"
layer: "log_normalized"
modality: "rna"
# Outputs
# output: "$id.$key.output.h5mu"
# PCA options
obsm_pca: "X_pca"
# var_pca_feature_selection: "foo"
pca_overwrite: false
# Neighbour calculation
uns_neighbors: "neighbors"
obsp_neighbor_distances: "distances"
obsp_neighbor_connectivities: "connectivities"
# Umap options
obsm_umap: "X_umap"
# Nextflow input-output arguments
publish_dir: # please fill in - example: "output/"
# param_list: "my_params.yaml"
nextflow run openpipelines-bio/openpipeline \
-r 0.12.0 -latest \
-profile docker \
-main-script ./workflows/multiomics/integration/initialize_integration/main.nf \
-params-file params.yaml
Note
Replace -profile docker
with -profile podman
or -profile singularity
depending on the desired backend.
Argument groups
Inputs
Name | Description | Attributes |
---|---|---|
--id |
ID of the sample. | string , required, example: "foo" |
--input |
Path to the sample. | file , required, example: "dataset.h5mu" |
--layer |
use specified layer for expression values instead of the .X object from the modality. | string , default: "log_normalized" |
--modality |
Which modality to process. | string , default: "rna" |
Outputs
Name | Description | Attributes |
---|---|---|
--output |
Destination path to the output. | file , required, example: "output.h5mu" |
PCA options
Name | Description | Attributes |
---|---|---|
--obsm_pca |
In which .obsm slot to store the resulting PCA embedding. | string , default: "X_pca" |
--var_pca_feature_selection |
Column name in .var matrix that will be used to select which genes to run the PCA on. | string |
--pca_overwrite |
Allow overwriting slots for PCA output. | boolean_true |
Neighbour calculation
Name | Description | Attributes |
---|---|---|
--uns_neighbors |
In which .uns slot to store various neighbor output objects. | string , default: "neighbors" |
--obsp_neighbor_distances |
In which .obsp slot to store the distance matrix between the resulting neighbors. | string , default: "distances" |
--obsp_neighbor_connectivities |
In which .obsp slot to store the connectivities matrix between the resulting neighbors. | string , default: "connectivities" |
Umap options
Name | Description | Attributes |
---|---|---|
--obsm_umap |
In which .obsm slot to store the resulting UMAP embedding. | string , default: "X_umap" |
Visualisation
flowchart LR v0(Input) v2(toSortedList) v4(flatMap) v11(pca) v13(join) v22(find_neighbors) v24(join) v33(umap) v35(join) v40(toSortedList) v42(Output) v0-->v2 v2-->v4 v4-->v13 v4-->v11 v11-->v13 v13-->v24 v13-->v22 v22-->v24 v24-->v35 v24-->v33 v33-->v35 v35-->v40 v40-->v42