flowchart TB
v0(Channel.fromList)
v2(filter)
v10(filter)
v18(scvi)
v25(cross)
v35(cross)
v41(filter)
v195(concat)
v50(filter)
v65(cross)
v75(cross)
v84(branch)
v111(concat)
v96(cross)
v106(cross)
v115(branch)
v142(concat)
v127(cross)
v137(cross)
v146(branch)
v173(concat)
v158(cross)
v168(cross)
v180(cross)
v190(cross)
v202(cross)
v209(cross)
v221(cross)
v228(cross)
v232(Output)
subgraph group_neighbors_leiden_umap [neighbors_leiden_umap]
v58(find_neighbors)
v89(leiden)
v120(move_obsm_to_obs)
v151(umap)
end
v84-->v111
v115-->v142
v146-->v173
v0-->v2
v2-->v10
v10-->v18
v18-->v25
v10-->v25
v10-->v35
v41-->v50
v50-->v58
v58-->v65
v50-->v65
v50-->v75
v84-->v89
v89-->v96
v84-->v96
v84-->v106
v106-->v111
v115-->v120
v120-->v127
v115-->v127
v115-->v137
v137-->v142
v146-->v151
v151-->v158
v146-->v158
v146-->v168
v168-->v173
v173-->v180
v41-->v180
v41-->v190
v190-->v195
v195-->v202
v2-->v202
v202-->v209
v2-->v209
v2-->v221
v221-->v228
v2-->v228
v228-->v232
v35-->v41
v18-->v35
v58-->v75
v75-->v84
v89-->v106
v111-->v115
v120-->v137
v142-->v146
v151-->v168
v173-->v190
v195-->v221
style group_neighbors_leiden_umap fill:#F0F0F0,stroke:#969696;
style v0 fill:#e3dcea,stroke:#7a4baa;
style v2 fill:#e3dcea,stroke:#7a4baa;
style v10 fill:#e3dcea,stroke:#7a4baa;
style v18 fill:#e3dcea,stroke:#7a4baa;
style v25 fill:#e3dcea,stroke:#7a4baa;
style v35 fill:#e3dcea,stroke:#7a4baa;
style v41 fill:#e3dcea,stroke:#7a4baa;
style v195 fill:#e3dcea,stroke:#7a4baa;
style v50 fill:#e3dcea,stroke:#7a4baa;
style v58 fill:#e3dcea,stroke:#7a4baa;
style v65 fill:#e3dcea,stroke:#7a4baa;
style v75 fill:#e3dcea,stroke:#7a4baa;
style v84 fill:#e3dcea,stroke:#7a4baa;
style v111 fill:#e3dcea,stroke:#7a4baa;
style v89 fill:#e3dcea,stroke:#7a4baa;
style v96 fill:#e3dcea,stroke:#7a4baa;
style v106 fill:#e3dcea,stroke:#7a4baa;
style v115 fill:#e3dcea,stroke:#7a4baa;
style v142 fill:#e3dcea,stroke:#7a4baa;
style v120 fill:#e3dcea,stroke:#7a4baa;
style v127 fill:#e3dcea,stroke:#7a4baa;
style v137 fill:#e3dcea,stroke:#7a4baa;
style v146 fill:#e3dcea,stroke:#7a4baa;
style v173 fill:#e3dcea,stroke:#7a4baa;
style v151 fill:#e3dcea,stroke:#7a4baa;
style v158 fill:#e3dcea,stroke:#7a4baa;
style v168 fill:#e3dcea,stroke:#7a4baa;
style v180 fill:#e3dcea,stroke:#7a4baa;
style v190 fill:#e3dcea,stroke:#7a4baa;
style v202 fill:#e3dcea,stroke:#7a4baa;
style v209 fill:#e3dcea,stroke:#7a4baa;
style v221 fill:#e3dcea,stroke:#7a4baa;
style v228 fill:#e3dcea,stroke:#7a4baa;
style v232 fill:#e3dcea,stroke:#7a4baa;
Scvi leiden
Run scvi integration followed by neighbour calculations, leiden clustering and run umap on the result.
Info
ID: scvi_leiden
Namespace: workflows/integration
Links
Example commands
You can run the pipeline using nextflow run.
View help
You can use --help as a parameter to get an overview of the possible parameters.
nextflow run openpipelines-bio/openpipeline \
-r 2.1.1 -latest \
-main-script target/nextflow/workflows/integration/scvi_leiden/main.nf \
--helpRun command
Example of params.yaml
# Inputs
id: # please fill in - example: "foo"
input: # please fill in - example: "dataset.h5mu"
# layer: "foo"
modality: "rna"
# Outputs
# output: "$id.$key.output.h5mu"
# output_model: "$id.$key.output_model"
# Neighbour calculation
uns_neighbors: "scvi_integration_neighbors"
obsp_neighbor_distances: "scvi_integration_distances"
obsp_neighbor_connectivities: "scvi_integration_connectivities"
# Scvi integration options
obs_batch: # please fill in - example: "foo"
obsm_output: "X_scvi_integrated"
# var_input: "foo"
# early_stopping: true
early_stopping_monitor: "elbo_validation"
early_stopping_patience: 45
early_stopping_min_delta: 0.0
# max_epochs: 123
reduce_lr_on_plateau: true
lr_factor: 0.6
lr_patience: 30.0
# Clustering options
obs_cluster: "scvi_integration_leiden"
leiden_resolution: [1.0]
# Umap options
obsm_umap: "X_scvi_umap"
# Nextflow input-output arguments
publish_dir: # please fill in - example: "output/"
# param_list: "my_params.yaml"
# Argumentsnextflow run openpipelines-bio/openpipeline \
-r 2.1.1 -latest \
-profile docker \
-main-script target/nextflow/workflows/integration/scvi_leiden/main.nf \
-params-file params.yaml
Note
Replace -profile docker with -profile podman or -profile singularity depending on the desired backend.
Argument groups
Inputs
| Name | Description | Attributes |
|---|---|---|
--id |
ID of the sample. | string, required, example: "foo" |
--input |
Path to the sample. | file, required, example: "dataset.h5mu" |
--layer |
use specified layer for expression values instead of the .X object from the modality. | string |
--modality |
Which modality to process. | string, default: "rna" |
Outputs
| Name | Description | Attributes |
|---|---|---|
--output |
Destination path to the output. | file, required, example: "output.h5mu" |
--output_model |
Folder where the state of the trained model will be saved to. | file, required, example: "output_dir" |
Neighbour calculation
| Name | Description | Attributes |
|---|---|---|
--uns_neighbors |
In which .uns slot to store various neighbor output objects. | string, default: "scvi_integration_neighbors" |
--obsp_neighbor_distances |
In which .obsp slot to store the distance matrix between the resulting neighbors. | string, default: "scvi_integration_distances" |
--obsp_neighbor_connectivities |
In which .obsp slot to store the connectivities matrix between the resulting neighbors. | string, default: "scvi_integration_connectivities" |
Scvi integration options
| Name | Description | Attributes |
|---|---|---|
--obs_batch |
Column name discriminating between your batches. | string, required |
--obsm_output |
In which .obsm slot to store the resulting integrated embedding. | string, default: "X_scvi_integrated" |
--var_input |
.var column containing highly variable genes. By default, do not subset genes. | string |
--early_stopping |
Whether to perform early stopping with respect to the validation set. | boolean |
--early_stopping_monitor |
Metric logged during validation set epoch. | string, default: "elbo_validation" |
--early_stopping_patience |
Number of validation epochs with no improvement after which training will be stopped. | integer, default: 45 |
--early_stopping_min_delta |
Minimum change in the monitored quantity to qualify as an improvement, i.e. an absolute change of less than min_delta, will count as no improvement. | double, default: 0 |
--max_epochs |
Number of passes through the dataset, defaults to (20000 / number of cells) * 400 or 400; whichever is smallest. | integer |
--reduce_lr_on_plateau |
Whether to monitor validation loss and reduce learning rate when validation set lr_scheduler_metric plateaus. |
boolean, default: TRUE |
--lr_factor |
Factor to reduce learning rate. | double, default: 0.6 |
--lr_patience |
Number of epochs with no improvement after which learning rate will be reduced. | double, default: 30 |
Clustering options
| Name | Description | Attributes |
|---|---|---|
--obs_cluster |
Prefix for the .obs keys under which to add the cluster labels. Newly created columns in .obs will be created from the specified value for ‘–obs_cluster’ suffixed with an underscore and one of the resolutions resolutions specified in ‘–leiden_resolution’. | string, default: "scvi_integration_leiden" |
--leiden_resolution |
Control the coarseness of the clustering. Higher values lead to more clusters. | List of double, default: 1, multiple_sep: ";" |
Umap options
| Name | Description | Attributes |
|---|---|---|
--obsm_umap |
In which .obsm slot to store the resulting UMAP embedding. | string, default: "X_scvi_umap" |