Scanorama leiden
Runs Scanorama integration, followed by neighbour calculation, Leiden clustering and UMAP on the integrated result.
Info
ID: scanorama_leiden
Namespace: workflows/integration
Example commands
You can run the pipeline using nextflow run.
View help
You can use --help as a parameter to get an overview of the possible parameters.
nextflow run openpipelines-bio/openpipeline \
-r 1.0.1 -latest \
-main-script target/nextflow/workflows/integration/scanorama_leiden/main.nf \
--help
Run command
Example of params.yaml
# Inputs
id: # please fill in - example: "foo"
input: # please fill in - example: "dataset.h5mu"
layer: "log_normalized"
modality: "rna"
# Outputs
# output: "$id.$key.output.h5mu"
# Neighbour calculation
uns_neighbors: "scanorama_integration_neighbors"
obsp_neighbor_distances: "scanorama_integration_distances"
obsp_neighbor_connectivities: "scanorama_integration_connectivities"
# Scanorama integration options
obs_batch: "sample_id"
obsm_input: "X_pca"
obsm_output: "X_scanorama"
knn: 20
batch_size: 5000
sigma: 15
approx: true
alpha: 0.1
# Clustering options
obs_cluster: "scanorama_integration_leiden"
leiden_resolution: [1]
# Umap options
obsm_umap: "X_leiden_scanorama_umap"
# Nextflow input-output arguments
publish_dir: # please fill in - example: "output/"
# param_list: "my_params.yaml"
nextflow run openpipelines-bio/openpipeline \
-r 1.0.1 -latest \
-profile docker \
-main-script target/nextflow/workflows/integration/scanorama_leiden/main.nf \
-params-file params.yaml
Note
Replace -profile docker with -profile podman or -profile singularity depending on the desired backend.
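When the workflow is launched from a script rather than by hand, the parameter file and the command line can be assembled programmatically. The following is a minimal sketch, not part of the pipeline itself: the helper build_command and the file name params.json are hypothetical, the parameter values are the placeholders from the example above, and it relies on Nextflow's -params-file option accepting JSON as well as YAML. It only prints the command and does not execute it.

```python
import json
import shlex
import tempfile
from pathlib import Path

# Values copied from the example params.yaml; id, input and publish_dir
# are the placeholder examples and must be replaced with real values.
params = {
    "id": "foo",
    "input": "dataset.h5mu",
    "layer": "log_normalized",
    "modality": "rna",
    "obs_batch": "sample_id",
    "leiden_resolution": [1],
    "publish_dir": "output/",
}


def build_command(params_file: Path, profile: str = "docker") -> list[str]:
    """Assemble the nextflow invocation shown above as an argument list.

    Hypothetical helper for illustration; it does not run anything.
    """
    return [
        "nextflow", "run", "openpipelines-bio/openpipeline",
        "-r", "1.0.1", "-latest",
        "-profile", profile,
        "-main-script",
        "target/nextflow/workflows/integration/scanorama_leiden/main.nf",
        "-params-file", str(params_file),
    ]


# Write the parameters as JSON into a scratch directory.
params_file = Path(tempfile.mkdtemp()) / "params.json"
params_file.write_text(json.dumps(params, indent=2))

# Build the command for the singularity backend and print it for inspection.
command = build_command(params_file, profile="singularity")
print(shlex.join(command))
```

The printed line can be pasted into a shell, or the argument list can be passed to subprocess.run once the placeholders have been filled in.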
Argument groups
Inputs
Name | Description | Attributes |
---|---|---|
--id | ID of the sample. | string, required, example: "foo" |
--input | Path to the sample. | file, required, example: "dataset.h5mu" |
--layer | Use the specified layer for expression values instead of the .X object from the modality. | string, default: "log_normalized" |
--modality | Which modality to process. | string, default: "rna" |
Outputs
Name | Description | Attributes |
---|---|---|
--output | Destination path to the output. | file, required, example: "output.h5mu" |
Neighbour calculation
Name | Description | Attributes |
---|---|---|
--uns_neighbors | In which .uns slot to store various neighbor output objects. | string, default: "scanorama_integration_neighbors" |
--obsp_neighbor_distances | In which .obsp slot to store the distance matrix between the resulting neighbors. | string, default: "scanorama_integration_distances" |
--obsp_neighbor_connectivities | In which .obsp slot to store the connectivities matrix between the resulting neighbors. | string, default: "scanorama_integration_connectivities" |
Scanorama integration options
Name | Description | Attributes |
---|---|---|
--obs_batch | Name of the .obs column that discriminates between your batches. | string, default: "sample_id" |
--obsm_input | .obsm slot that holds the embedding to run Scanorama on. | string, default: "X_pca" |
--obsm_output | The name of the field in adata.obsm where the integrated embeddings will be stored after running this function. Defaults to X_scanorama. | string, default: "X_scanorama" |
--knn | Number of nearest neighbors to use for matching. | integer, default: 20 |
--batch_size | The batch size used in the alignment vector computation. Useful when integrating very large (>100k samples) datasets. Set to a large value that runs within available memory. | integer, default: 5000 |
--sigma | Correction smoothing parameter on Gaussian kernel. | double, default: 15 |
--approx | Use approximate nearest neighbors with Python annoy; greatly speeds up matching runtime. | boolean, default: TRUE |
--alpha | Alignment score minimum cutoff. | double, default: 0.1 |
Clustering options
Name | Description | Attributes |
---|---|---|
--obs_cluster | Prefix for the .obs keys under which to add the cluster labels. New columns in .obs are named using the value of '--obs_cluster', suffixed with an underscore and one of the resolutions specified in '--leiden_resolution'. | string, default: "scanorama_integration_leiden" |
--leiden_resolution | Control the coarseness of the clustering. Higher values lead to more clusters. | List of double, default: 1, multiple_sep: ";" |
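The naming rule for the cluster columns can be illustrated with a short sketch. It splits a ";"-separated resolution string (the multiple_sep listed for --leiden_resolution) and builds one column name per resolution as prefix, underscore, resolution. The helpers parse_resolutions and cluster_columns are hypothetical and not part of the workflow, and the way each number is rendered in the column name is an assumption: the workflow may format resolutions differently (for example 1 rather than 1.0), so check the .obs columns of an actual output file.

```python
def parse_resolutions(raw: str, sep: str = ";") -> list[float]:
    """Split a separator-delimited string such as "0.5;1;2" into floats."""
    return [float(part) for part in raw.split(sep) if part.strip()]


def cluster_columns(prefix: str, resolutions: list[float]) -> list[str]:
    """Build '<prefix>_<resolution>' names as described for --obs_cluster.

    Assumption: each resolution is rendered with Python's default str();
    the workflow's own formatting may differ.
    """
    return [f"{prefix}_{res}" for res in resolutions]


# Three resolutions passed the way they would appear on the command line.
resolutions = parse_resolutions("0.5;1;2")
columns = cluster_columns("scanorama_integration_leiden", resolutions)
print(columns)
```

Requesting several resolutions in a single run therefore yields one .obs column per resolution, all sharing the same prefix.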
Umap options
Name | Description | Attributes |
---|---|---|
--obsm_umap | In which .obsm slot to store the resulting UMAP embedding. | string, default: "X_leiden_scanorama_umap" |