Scanorama

Use Scanorama to integrate different experiments.

Info

ID: scanorama
Namespace: integrate

Example commands

You can run the pipeline using nextflow run.

View help

Pass --help to get an overview of the available parameters.

nextflow run openpipelines-bio/openpipeline \
  -r 1.0.1 -latest \
  -main-script target/nextflow/integrate/scanorama/main.nf \
  --help

Run command

Example of params.yaml

# Arguments
input: # please fill in - example: "path/to/file"
modality: "rna"
# output: "$id.$key.output.h5ad"
# output_compression: "gzip"
obs_batch: "batch"
obsm_input: "X_pca"
obsm_output: "X_scanorama"
knn: 20
batch_size: 5000
sigma: 15
approx: true
alpha: 0.1

# Nextflow input-output arguments
publish_dir: # please fill in - example: "output/"
# param_list: "my_params.yaml"

nextflow run openpipelines-bio/openpipeline \
  -r 1.0.1 -latest \
  -profile docker \
  -main-script target/nextflow/integrate/scanorama/main.nf \
  -params-file params.yaml

Note

Replace -profile docker with -profile podman or -profile singularity depending on the desired backend.
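
The run command and parameter file above can also be assembled from a script. The following is a minimal sketch, not part of the pipeline: the helper names to_yaml and build_command are hypothetical, the two paths are the placeholder values from the example, and the set of accepted profiles is limited to the three backends named in the note. It uses only the Python standard library and relies on the fact that JSON scalars are also valid YAML scalars.

```python
import json
import shlex

# Values copied from the example params.yaml; the two paths are placeholders.
params = {
    "input": "path/to/file",
    "modality": "rna",
    "obs_batch": "batch",
    "obsm_input": "X_pca",
    "obsm_output": "X_scanorama",
    "knn": 20,
    "batch_size": 5000,
    "sigma": 15,
    "approx": True,
    "alpha": 0.1,
    "publish_dir": "output/",
}


def to_yaml(mapping):
    """Serialise a flat dict as YAML (hypothetical helper).

    json.dumps is used for the values because JSON scalars are valid YAML.
    """
    return "".join(f"{key}: {json.dumps(value)}\n" for key, value in mapping.items())


def build_command(profile="docker", params_file="params.yaml"):
    """Return the nextflow invocation as an argument list (hypothetical helper)."""
    # Only the backends named in the note are accepted in this sketch.
    if profile not in {"docker", "podman", "singularity"}:
        raise ValueError(f"profile not covered by this sketch: {profile}")
    return [
        "nextflow", "run", "openpipelines-bio/openpipeline",
        "-r", "1.0.1", "-latest",
        "-profile", profile,
        "-main-script", "target/nextflow/integrate/scanorama/main.nf",
        "-params-file", params_file,
    ]


yaml_text = to_yaml(params)
command = build_command(profile="singularity")
print(yaml_text)
print(shlex.join(command))
```

Writing yaml_text to params.yaml and passing command to subprocess.run would launch the workflow; the sketch only prints both so the result can be inspected first.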

Argument group

Arguments

| Name | Description | Attributes |
| --- | --- | --- |
| --input | Input h5mu file. | file, required |
| --modality | | string, default: "rna" |
| --output | Output .h5mu file. | file, required, default: "output.h5ad" |
| --output_compression | The compression format to be used on the output h5mu object. | string, example: "gzip" |
| --obs_batch | Column name discriminating between your batches. | string, default: "batch" |
| --obsm_input | Basis obsm slot to run Scanorama on. | string, default: "X_pca" |
| --obsm_output | The name of the field in adata.obsm where the integrated embeddings will be stored after running this function. Defaults to X_scanorama. | string, default: "X_scanorama" |
| --knn | Number of nearest neighbors to use for matching. | integer, default: 20 |
| --batch_size | The batch size used in the alignment vector computation. Useful when integrating very large (>100k samples) datasets. Set to a large value that runs within available memory. | integer, default: 5000 |
| --sigma | Correction smoothing parameter on Gaussian kernel. | double, default: 15 |
| --approx | Use approximate nearest neighbors with Python annoy; greatly speeds up matching runtime. | boolean, default: TRUE |
| --alpha | Alignment score minimum cutoff. | double, default: 0.1 |
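
The argument names and defaults above line up with the keyword arguments of scanpy.external.pp.scanorama_integrate (key, basis, adjusted_basis, knn, sigma, approx, alpha, batch_size). This page does not say how the component is implemented, so the sketch below is only an assumption about how the arguments might translate into that call; RENAMES, scanorama_kwargs and integrate are hypothetical names introduced for illustration.

```python
# Defaults copied from the argument table.
DEFAULTS = {
    "obs_batch": "batch",
    "obsm_input": "X_pca",
    "obsm_output": "X_scanorama",
    "knn": 20,
    "batch_size": 5000,
    "sigma": 15.0,
    "approx": True,
    "alpha": 0.1,
}

# Assumed correspondence between component arguments and scanpy keywords.
RENAMES = {
    "obs_batch": "key",
    "obsm_input": "basis",
    "obsm_output": "adjusted_basis",
}


def scanorama_kwargs(**overrides):
    """Merge overrides into the defaults and rename to scanpy's keywords."""
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        raise KeyError(f"unknown argument(s): {sorted(unknown)}")
    merged = {**DEFAULTS, **overrides}
    return {RENAMES.get(name, name): value for name, value in merged.items()}


def integrate(adata, **overrides):
    """Run the integration on an AnnData object (needs scanpy and scanorama)."""
    # Imported lazily so the mapping above can be used without scanpy installed.
    import scanpy.external as sce

    sce.pp.scanorama_integrate(adata, **scanorama_kwargs(**overrides))
    return adata


kwargs = scanorama_kwargs(knn=30)
print(kwargs)
```

With an AnnData object that already holds a PCA in obsm["X_pca"] and a batch column in obs, integrate(adata, knn=30) would store the corrected embedding under obsm["X_scanorama"], assuming the mapping holds.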

Authors

  • Dries De Maeyer (author)

  • Dries Schaumont (maintainer)