Scanorama leiden

Run scanorama integration followed by neighbour calculations, leiden clustering and run umap on the result.

Info

ID: scanorama_leiden
Namespace: workflows/integration

Links

Source

Example commands

You can run the pipeline using nextflow run.

View help

You can use --help as a parameter to get an overview of the possible parameters.

nextflow run openpipelines-bio/openpipeline \
  -r 2.1.1 -latest \
  -main-script target/nextflow/workflows/integration/scanorama_leiden/main.nf \
  --help

Run command

Example of params.yaml

# Inputs
id: # please fill in - example: "foo"
input: # please fill in - example: "dataset.h5mu"
layer: "log_normalized"
modality: "rna"

# Outputs
# output: "$id.$key.output.h5mu"

# Neighbour calculation
uns_neighbors: "scanorama_integration_neighbors"
obsp_neighbor_distances: "scanorama_integration_distances"
obsp_neighbor_connectivities: "scanorama_integration_connectivities"

# Scanorama integration options
obs_batch: "sample_id"
obsm_input: "X_pca"
obsm_output: "X_scanorama"
knn: 20
batch_size: 5000
sigma: 15.0
approx: true
alpha: 0.1

# Clustering options
obs_cluster: "scanorama_integration_leiden"
leiden_resolution: [1.0]

# Umap options
obsm_umap: "X_leiden_scanorama_umap"

# Nextflow input-output arguments
publish_dir: # please fill in - example: "output/"
# param_list: "my_params.yaml"

# Arguments

nextflow run openpipelines-bio/openpipeline \
  -r 2.1.1 -latest \
  -profile docker \
  -main-script target/nextflow/workflows/integration/scanorama_leiden/main.nf \
  -params-file params.yaml

Note

Replace -profile docker with -profile podman or -profile singularity depending on the desired backend.

Argument groups

Inputs

Name	Description	Attributes
`--id`	ID of the sample.	`string`, required, example: `"foo"`
`--input`	Path to the sample.	`file`, required, example: `"dataset.h5mu"`
`--layer`	use specified layer for expression values instead of the .X object from the modality.	`string`, default: `"log_normalized"`
`--modality`	Which modality to process.	`string`, default: `"rna"`

Outputs

Name	Description	Attributes
`--output`	Destination path to the output.	`file`, required, example: `"output.h5mu"`

Neighbour calculation

Name	Description	Attributes
`--uns_neighbors`	In which .uns slot to store various neighbor output objects.	`string`, default: `"scanorama_integration_neighbors"`
`--obsp_neighbor_distances`	In which .obsp slot to store the distance matrix between the resulting neighbors.	`string`, default: `"scanorama_integration_distances"`
`--obsp_neighbor_connectivities`	In which .obsp slot to store the connectivities matrix between the resulting neighbors.	`string`, default: `"scanorama_integration_connectivities"`

Scanorama integration options

Name	Description	Attributes
`--obs_batch`	Column name discriminating between your batches.	`string`, default: `"sample_id"`
`--obsm_input`	.obsm slot that points to embedding to run scanorama on.	`string`, default: `"X_pca"`
`--obsm_output`	The name of the field in adata.obsm where the integrated embeddings will be stored after running this function. Defaults to X_scanorama.	`string`, default: `"X_scanorama"`
`--knn`	Number of nearest neighbors to use for matching.	`integer`, default: `20`
`--batch_size`	The batch size used in the alignment vector computation. Useful when integrating very large (>100k samples) datasets. Set to large value that runs within available memory.	`integer`, default: `5000`
`--sigma`	Correction smoothing parameter on Gaussian kernel.	`double`, default: `15`
`--approx`	Use approximate nearest neighbors with Python annoy; greatly speeds up matching runtime.	`boolean`, default: `TRUE`
`--alpha`	Alignment score minimum cutoff	`double`, default: `0.1`

Clustering options

Name	Description	Attributes
`--obs_cluster`	Prefix for the .obs keys under which to add the cluster labels. Newly created columns in .obs will be created from the specified value for ‘–obs_cluster’ suffixed with an underscore and one of the resolutions specified in ‘–leiden_resolution’.	`string`, default: `"scanorama_integration_leiden"`
`--leiden_resolution`	Control the coarseness of the clustering. Higher values lead to more clusters.	List of `double`, default: `1`, multiple_sep: `";"`

Umap options

Name	Description	Attributes
`--obsm_umap`	In which .obsm slot to store the resulting UMAP embedding.	`string`, default: `"X_leiden_scanorama_umap"`

Authors

Mauro Saporita (author)
Povilas Gibas (author)

Visualisation

flowchart TB
    v0(Channel.fromList)
    v2(filter)
    v10(filter)
    v18(scanorama)
    v25(cross)
    v35(cross)
    v41(filter)
    v195(concat)
    v50(filter)
    v65(cross)
    v75(cross)
    v84(branch)
    v111(concat)
    v96(cross)
    v106(cross)
    v115(branch)
    v142(concat)
    v127(cross)
    v137(cross)
    v146(branch)
    v173(concat)
    v158(cross)
    v168(cross)
    v180(cross)
    v190(cross)
    v202(cross)
    v209(cross)
    v221(cross)
    v228(cross)
    v232(Output)
    subgraph group_neighbors_leiden_umap [neighbors_leiden_umap]
        v58(find_neighbors)
        v89(leiden)
        v120(move_obsm_to_obs)
        v151(umap)
    end
    v84-->v111
    v115-->v142
    v146-->v173
    v0-->v2
    v2-->v10
    v10-->v18
    v18-->v25
    v10-->v25
    v10-->v35
    v41-->v50
    v50-->v58
    v58-->v65
    v50-->v65
    v50-->v75
    v84-->v89
    v89-->v96
    v84-->v96
    v84-->v106
    v106-->v111
    v115-->v120
    v120-->v127
    v115-->v127
    v115-->v137
    v137-->v142
    v146-->v151
    v151-->v158
    v146-->v158
    v146-->v168
    v168-->v173
    v173-->v180
    v41-->v180
    v41-->v190
    v190-->v195
    v195-->v202
    v2-->v202
    v202-->v209
    v2-->v209
    v2-->v221
    v221-->v228
    v2-->v228
    v228-->v232
    v35-->v41
    v18-->v35
    v58-->v75
    v75-->v84
    v89-->v106
    v111-->v115
    v120-->v137
    v142-->v146
    v151-->v168
    v173-->v190
    v195-->v221
    style group_neighbors_leiden_umap fill:#F0F0F0,stroke:#969696;
    style v0 fill:#e3dcea,stroke:#7a4baa;
    style v2 fill:#e3dcea,stroke:#7a4baa;
    style v10 fill:#e3dcea,stroke:#7a4baa;
    style v18 fill:#e3dcea,stroke:#7a4baa;
    style v25 fill:#e3dcea,stroke:#7a4baa;
    style v35 fill:#e3dcea,stroke:#7a4baa;
    style v41 fill:#e3dcea,stroke:#7a4baa;
    style v195 fill:#e3dcea,stroke:#7a4baa;
    style v50 fill:#e3dcea,stroke:#7a4baa;
    style v58 fill:#e3dcea,stroke:#7a4baa;
    style v65 fill:#e3dcea,stroke:#7a4baa;
    style v75 fill:#e3dcea,stroke:#7a4baa;
    style v84 fill:#e3dcea,stroke:#7a4baa;
    style v111 fill:#e3dcea,stroke:#7a4baa;
    style v89 fill:#e3dcea,stroke:#7a4baa;
    style v96 fill:#e3dcea,stroke:#7a4baa;
    style v106 fill:#e3dcea,stroke:#7a4baa;
    style v115 fill:#e3dcea,stroke:#7a4baa;
    style v142 fill:#e3dcea,stroke:#7a4baa;
    style v120 fill:#e3dcea,stroke:#7a4baa;
    style v127 fill:#e3dcea,stroke:#7a4baa;
    style v137 fill:#e3dcea,stroke:#7a4baa;
    style v146 fill:#e3dcea,stroke:#7a4baa;
    style v173 fill:#e3dcea,stroke:#7a4baa;
    style v151 fill:#e3dcea,stroke:#7a4baa;
    style v158 fill:#e3dcea,stroke:#7a4baa;
    style v168 fill:#e3dcea,stroke:#7a4baa;
    style v180 fill:#e3dcea,stroke:#7a4baa;
    style v190 fill:#e3dcea,stroke:#7a4baa;
    style v202 fill:#e3dcea,stroke:#7a4baa;
    style v209 fill:#e3dcea,stroke:#7a4baa;
    style v221 fill:#e3dcea,stroke:#7a4baa;
    style v228 fill:#e3dcea,stroke:#7a4baa;
    style v232 fill:#e3dcea,stroke:#7a4baa;