Reference

An overview of the workflows and modules in OpenPipelines

Workflows

Name Namespace Description
BD Rhapsody Workflows/ingestion BD Rhapsody Sequence Analysis CWL pipeline v2.2.1
Bbknn leiden Workflows/integration Run bbknn followed by leiden clustering and run umap on the result.
Cell Ranger mapping Workflows/ingestion A pipeline for running Cell Ranger mapping.
Cell Ranger multi Workflows/ingestion A pipeline for running Cell Ranger multi.
Cell Ranger post-processing Workflows/ingestion Post-processing Cell Ranger datasets.
Convert to MuData Workflows/ingestion A pipeline to convert different file formats to .h5mu.
Demux Workflows/ingestion A generic pipeline for running bcl2fastq, bcl-convert or Cell Ranger mkfastq.
Dimensionality reduction Workflows/multiomics Run calculations that output information required for most integration methods: PCA, nearest neighbour and UMAP.
GDO Singlesample Workflows/gdo Processing unimodal single-sample guide-derived oligonucleotide (GDO) data.
Harmony integration followed by KNN label transfer Workflows/annotation Cell type annotation workflow by performing harmony integration of reference and query dataset followed by KNN label transfer.
Harmony leiden Workflows/integration Run harmony integration followed by neighbour calculations, leiden clustering and run umap on the result.
Make reference Workflows/ingestion Build a transcriptomics reference into one of many formats
Neighbors leiden umap Workflows/multiomics Performs neighborhood search, leiden clustering and umap on an integrated embedding.
Process batches Workflows/multiomics This workflow serves as an entrypoint into the ‘full_pipeline’ in order to re-run the multisample processing and the integration setup.
Process samples Workflows/multiomics A pipeline to analyse multiple multiomics samples.
Prot multisample Workflows/prot Processing unimodal multi-sample ADT data.
Prot singlesample Workflows/prot Processing unimodal single-sample CITE-seq data.
Qc Workflows/qc A pipeline to add basic QC statistics to a MuData file.
Rna multisample Workflows/rna Processing unimodal multi-sample RNA transcriptomics data.
Rna singlesample Workflows/rna Processing unimodal single-sample RNA transcriptomics data.
Scanorama leiden Workflows/integration Run scanorama integration followed by neighbour calculations, leiden clustering and run umap on the result.
Scgpt leiden Workflows/integration Run scGPT integration (cell embedding generation) followed by neighbour calculations, leiden clustering and run umap on the result.
Scvi leiden Workflows/integration Run scvi integration followed by neighbour calculations, leiden clustering and run umap on the result.
Split h5mu Workflows/multiomics Split the samples of a single modality from a .h5mu (multimodal) sample into separate .h5mu files based on the values of an .obs column of this modality.
Split modalities Workflows/multiomics A pipeline to split a multimodal MuData file into several unimodal MuData files.
Totalvi leiden Workflows/integration Run totalVI integration followed by neighbour calculations, leiden clustering and run umap on the result.
scANVI - scArches workflow Workflows/annotation Cell type annotation workflow using ScanVI with scArches for reference mapping.
scGPT Annotation Workflows/annotation Cell type annotation workflow using scGPT.
scVI Annotation Workflows/annotation Cell type annotation workflow that performs scVI integration of reference and query dataset followed by KNN label transfer.

Modules

Name Namespace Description
Add id Metadata Add id of .obs.
Align query reference Feature annotation Alignment of a query and reference dataset by: alignment of layers; harmonization of .obs field names for batch and cell type labels; harmonization of the .var field name for gene names; sanitation of gene names; cross-checking of genes; assignment of an id to the query and reference datasets.
Bbknn Neighbors BBKNN network generation
Bcftools Genetic demux Filter the variants called by freebayes or cellSNP
Bcl convert Demux Convert bcl files to fastq files using bcl-convert.
Bcl2fastq Demux Convert bcl files to fastq files using bcl2fastq
Bd rhapsody Mapping BD Rhapsody Sequence Analysis CWL pipeline v2.2.1. This pipeline performs analysis of single-cell multiomic sequence read (FASTQ) data.
Binning Scgpt Conversion of (pre-processed) expression count data into relative values (bins) to address scale differences across sequencing batches
Bpcells regress out Transform Regress out the effects of confounding variables using a linear least squares regression model with BPCells
Build bdrhap reference Reference The Reference Files Generator creates an archive containing Genome Index and Transcriptome annotation files needed for the BD Rhapsody Sequencing Analysis Pipeline.
Build cellranger arc reference Reference Build a Cell Ranger-arc and -atac compatible reference folder from user-supplied genome FASTA and gene GTF files.
Build cellranger reference Reference Build a Cell Ranger-compatible reference folder from user-supplied genome FASTA and gene GTF files.
Build star reference Reference Create a reference for STAR from a set of fasta files.
Calculate atac qc metrics Qc Add basic ATAC quality control metrics to an .h5mu file.
Calculate qc metrics Qc Add basic quality control metrics to an .h5mu file.
Cell type annotation Scgpt Annotate gene expression data with cell type classes through the scGPT model
Cellbender remove background Correction Eliminating technical artifacts from high-throughput single-cell RNA sequencing data.
Cellbender remove background v0 2 Correction Eliminating technical artifacts from high-throughput single-cell RNA sequencing data.
Cellranger atac count Mapping Align fastq files using Cell Ranger ATAC count.
Cellranger atac mkfastq Demux Demultiplex raw sequencing data for ATAC experiments
Cellranger count Mapping Align fastq files using Cell Ranger count.
Cellranger count split Mapping Split 10x Cell Ranger output directory into separate output fields.
Cellranger mkfastq Demux Demultiplex raw sequencing data
Cellranger mkgtf Reference Make a GTF file - filter by a specific attribute.
Cellranger multi Mapping Align fastq files using Cell Ranger multi.
Cellsnp Genetic demux cellSNP aims to pileup the expressed alleles in single-cell or bulk RNA-seq data.
Celltypist Annotate Automated cell type annotation tool for scRNA-seq datasets on the basis of logistic regression classifiers optimised by the stochastic gradient descent algorithm.
Cellxgene census Query Query cells from a CellxGene Census or custom TileDBSoma object.
Clr Transform Perform CLR normalization on CITE-seq data (Stoeckius et al., 2017)
Compress h5mu Compression Compress a MuData file.
Concatenate h5mu Dataflow Concatenate observations from samples in several (uni- and/or multi-modal) MuData files into a single file
Cross check genes Scgpt Cross-check genes with pre-trained scGPT model
Delete layer Transform Delete an anndata layer from one or more modalities
Delimit fraction Filter Turns a column containing values between 0 and 1 into a boolean column based on thresholds
Demuxlet Genetic demux Demuxlet is a software tool to deconvolute sample identity and identify multiplets when multiple samples are pooled by barcoded single cell sequencing.
Densmap Dimred A modification of UMAP that adds an extra cost term in order to preserve information about the relative local density of the data.
Do filter Filter Remove observations and variables based on specified .obs and .var columns
Download file Download Download a file
Dsc pileup Genetic demux Dsc-pileup is a software tool to pileup reads and corresponding base quality for each overlapping SNP and each barcode.
Embedding Scgpt Generation of cell embeddings for the integration of single cell transcriptomic count data using scGPT
Fastqc Qc Fastqc component, please see https://www.bioinformatics.babraham.ac.uk/projects/fastqc/.
Filter 10xh5 Process 10xh5 Filter a 10x h5 dataset
Filter with counts Filter Filter scRNA-seq data based on the primary QC metrics.
Filter with scrublet Filter Doublet detection using the Scrublet method (Wolock, Lopez and Klein, 2019).
Find neighbors Neighbors Compute a neighborhood graph of observations [McInnes18].
Freebayes Genetic demux Freebayes is a Bayesian genetic variant detector designed to find small polymorphisms, specifically SNPs
Freemuxlet Genetic demux Freemuxlet is a software tool to deconvolute sample identity and identify multiplets when multiple samples are pooled by barcoded single cell sequencing.
From 10xh5 to h5mu Convert Converts a 10x h5 into an h5mu file
From 10xmtx to h5mu Convert Converts a 10x mtx into an h5mu file
From bd to 10x molecular barcode tags Convert Convert the molecular barcode sequence SAM tag from BD format (MA) to 10X format (UB)
From bdrhap to h5mu Convert Convert the output of a BD Rhapsody pipeline v2.x to a MuData h5 file
From cellranger multi to h5mu Convert Converts the output from cellranger multi to a single .h5mu file.
From h5ad to h5mu Convert Converts a single layer h5ad file into a single MuData object
From h5ad to seurat Convert Converts an h5ad file into a Seurat file
From h5mu to h5ad Convert Converts an h5mu file into an h5ad file
From h5mu to seurat Convert Converts an h5mu file into a Seurat file.
Grep annotation column Metadata Perform a regex lookup on a column from the annotation matrices .obs or .var.
Harmonypy Integrate Performs Harmony integration as described in https://github.com/immunogenomics/harmony.
Highly variable features scanpy Feature annotation Annotate highly variable features [Satija15] [Zheng17] [Stuart19].
Htseq count Mapping Quantify gene expression for subsequent testing for differential expression.
Htseq count to h5mu Mapping Convert the htseq table to a h5mu
Intersect obs Filter Create an intersection between two or more modalities.
Join csv Metadata Join a csv containing metadata to the .obs or .var field of a mudata file.
Join uns to obs Metadata Join a data frame of length 1 (1 row index value) in .uns containing metadata to the .obs of a mudata file.
Knn Labels transfer This component performs label transfer from reference to query using a K-Nearest Neighbors classifier
Leiden Cluster Cluster cells using the Leiden algorithm [Traag18] implemented in the Scanpy framework [Wolf18].
Lianapy Interpret Performs LIANA integration as described in https://github.com/saezlab/liana-py
Log1p Transform Logarithmize the data matrix.
Lsi Dimred Runs Latent Semantic Indexing.
Make params Files Looks for files in a directory and turns them into a params file.
Make reference Reference Preprocess and build a transcriptome reference.
Merge Dataflow Combine one or more single-modality .h5mu files together into one .h5mu file
Mermaid Report Generates a network from mermaid code
Move layer Transform Move a data matrix stored at the .layers or .X attributes in a MuData object to another layer.
Move obsm to obs Metadata Move a matrix from .obsm to .obs.
Multi star Mapping Align fastq files using STAR.
Multi star to h5mu Mapping Convert the output of multi_star to a h5mu
Multiqc Qc MultiQC aggregates results from bioinformatics analyses across many samples into a single report.
Normalize total Transform Normalize counts per cell.
Onclass Annotate OnClass is a python package for single-cell cell type annotation.
Pad tokenize Scgpt Tokenize and pad a batch of data for scGPT integration zero-shot inference or fine-tuning
Pca Dimred Computes PCA coordinates, loadings and variance decomposition.
Popv Annotate Performs popular major vote cell typing on single cell sequence data using multiple algorithms.
Publish Transfer Publish an artifact and optionally rename with parameters
Random forest annotation Annotate Automated cell type annotation tool for scRNA-seq datasets on the basis of random forest.
Regress out Transform Regress out (mostly) unwanted sources of variation.
Remove modality Filter Remove a modality from a .h5mu file
Samtools Genetic demux Filter the BAM according to the instruction of scSplit via Samtools.
Samtools sort Mapping Sort and (optionally) index alignments.
Scale Transform Scale data to unit variance and zero mean
Scanorama Integrate Use Scanorama to integrate different experiments
Scanvi Annotate scANVI is a semi-supervised model for single-cell transcriptomics data.
Scarches Integrate Performs reference mapping with scArches
Score genes cell cycle scanpy Feature annotation Calculates the score associated to S phase and G2M phase and annotates the cell cycle phase for each cell, as implemented by scanpy.
Scsplit Genetic demux scsplit is a genotype-free demultiplexing method of pooled single-cell RNA-seq, using a hidden state model for identifying genetically distinct samples within a mixed population.
Scvelo Velocity ID: scvelo; Namespace: velocity
Scvi Integrate Performs scvi integration as done in the human lung cell atlas https://github.com/LungCellAtlas/HLCA
Souporcell Genetic demux souporcell is a method for clustering mixed-genotype scRNAseq experiments by individual.
Split h5mu Dataflow Split the samples of a single modality from a .h5mu (multimodal) sample into separate .h5mu files based on the values of an .obs column of this modality.
Split h5mu train test Dataflow Split mudata object into training and testing (and validation) datasets based on observations into separate mudata objects.
Split modalities Dataflow Split the modalities from a single .h5mu multimodal sample into separate .h5mu files.
Star align Mapping Align fastq files using STAR.
Star align v273a Mapping Align fastq files using STAR.
Subset h5mu Filter Create a subset of a mudata file by selecting a given number of observations from the start of the file
Subset obsp Filter Create a subset of an .obsp field in a mudata file, by filtering the columns based on the values of an .obs column.
Svm annotation Annotate Automated cell type annotation tool for scRNA-seq datasets on the basis of SVMs.
Sync test resources Download Sync test resources to the local filesystem
Tar extract Compression Extract files from a tar archive
Tfidf Transform Perform TF-IDF normalization of the data (typically, ATAC).
Totalvi Integrate Performs mapping to the reference by totalvi model: https://docs.scvi-tools.org/en/stable/tutorials/notebooks/scarches_scvi_tools.html#Reference-mapping-with-TOTALVI
Tsne Dimred t-SNE (t-Distributed Stochastic Neighbor Embedding) is a dimensionality reduction technique used to visualize high-dimensional data in a low-dimensional space, revealing patterns and clusters by preserving local data similarities
Umap Dimred UMAP (Uniform Manifold Approximation and Projection) is a manifold learning technique suitable for visualizing high-dimensional data.
Velocyto Velocity Runs the velocity analysis on a BAM file, outputting a loom file.
Velocyto to h5mu Convert Convert a velocyto loom file to a h5mu file.
Vireo Genetic demux Vireo is primarily designed for demultiplexing cells into donors by modelling of expressed alleles.
Xgboost Labels transfer Performs label transfer from reference to query using XGBoost classifier
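
Several of the filter modules listed above are described only in a single line. As an illustration of the idea behind "Delimit fraction" (turning a column of values between 0 and 1 into a boolean column based on thresholds), here is a minimal sketch in plain Python. It is not the OpenPipelines implementation: the function name, the parameter names `min_fraction` and `max_fraction`, and the inclusive bounds are all assumptions made for this example.

```python
from typing import List


def delimit_fraction(
    values: List[float],
    min_fraction: float = 0.0,  # hypothetical parameter name
    max_fraction: float = 1.0,  # hypothetical parameter name
) -> List[bool]:
    """Return True for each value that lies within [min_fraction, max_fraction].

    Assumption: both bounds are inclusive; the real module may differ.
    """
    if not 0.0 <= min_fraction <= max_fraction <= 1.0:
        raise ValueError("thresholds must satisfy 0 <= min <= max <= 1")
    result = []
    for value in values:
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"value {value} is not a fraction between 0 and 1")
        result.append(min_fraction <= value <= max_fraction)
    return result


# Example: flag cells whose mitochondrial fraction is at most 0.2.
fraction_mito = [0.02, 0.15, 0.40, 0.08]
keep = delimit_fraction(fraction_mito, max_fraction=0.2)
print(keep)
```

A boolean column produced this way could then be passed to a filtering step such as "Do filter", which removes observations based on specified .obs columns.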