Cell Ranger multi

A pipeline for running Cell Ranger multi.

Info

ID: cellranger_multi
Namespace: workflows/ingestion

Example commands

You can run the pipeline using nextflow run.

View help

You can use --help as a parameter to get an overview of the possible parameters.

nextflow run openpipelines-bio/openpipeline \
  -r 1.0.1 -latest \
  -main-script target/nextflow/workflows/ingestion/cellranger_multi/main.nf \
  --help

Run command

Example of params.yaml
# Inputs
id: # please fill in - example: "foo"
# input: ["sample_S1_L001_R1_001.fastq.gz", "sample_S1_L001_R2_001.fastq.gz"]
gex_reference: # please fill in - example: "reference_genome.tar.gz"
# vdj_reference: "reference_vdj.tar.gz"
# feature_reference: "feature_reference.csv"
# vdj_inner_enrichment_primers: "enrichment_primers.txt"

# Outputs
# output_raw: "$id.$key.output_raw.output_raw"
# output_h5mu: "$id.$key.output_h5mu.h5mu"
uns_metrics: "metrics_cellranger"

# Feature type-specific input files
# gex_input: ["mysample_S1_L001_R1_001.fastq.gz", "mysample_S1_L001_R2_001.fastq.gz"]
# abc_input: ["mysample_S1_L001_R1_001.fastq.gz", "mysample_S1_L001_R2_001.fastq.gz"]
# cgc_input: ["mysample_S1_L001_R1_001.fastq.gz", "mysample_S1_L001_R2_001.fastq.gz"]
# mux_input: ["mysample_S1_L001_R1_001.fastq.gz", "mysample_S1_L001_R2_001.fastq.gz"]
# vdj_input: ["mysample_S1_L001_R1_001.fastq.gz", "mysample_S1_L001_R2_001.fastq.gz"]
# vdj_t_input: ["mysample_S1_L001_R1_001.fastq.gz", "mysample_S1_L001_R2_001.fastq.gz"]
# vdj_t_gd_input: ["mysample_S1_L001_R1_001.fastq.gz", "mysample_S1_L001_R2_001.fastq.gz"]
# vdj_b_input: ["mysample_S1_L001_R1_001.fastq.gz", "mysample_S1_L001_R2_001.fastq.gz"]
# agc_input: ["mysample_S1_L001_R1_001.fastq.gz", "mysample_S1_L001_R2_001.fastq.gz"]

# Cell multiplexing parameters
# cell_multiplex_sample_id: "foo"
# cell_multiplex_oligo_ids: "foo"
# cell_multiplex_description: "foo"

# Gene expression arguments
# gex_expect_cells: 3000
gex_chemistry: "auto"
gex_secondary_analysis: false
gex_generate_bam: true
gex_include_introns: true

# Library arguments
# library_id: ["mysample1"]
# library_type: ["Gene Expression"]
# library_subsample: ["0.5"]
# library_lanes: ["1-4"]

# Nextflow input-output arguments
publish_dir: # please fill in - example: "output/"
# param_list: "my_params.yaml"
nextflow run openpipelines-bio/openpipeline \
  -r 1.0.1 -latest \
  -profile docker \
  -main-script target/nextflow/workflows/ingestion/cellranger_multi/main.nf \
  -params-file params.yaml
Note

Replace -profile docker with -profile podman or -profile singularity depending on the desired backend.

Argument groups

Inputs

Name Description Attributes
--id ID of the sample. string, required, example: "foo"
--input The fastq.gz files to align. Can also be a single directory containing fastq.gz files. List of file, example: "sample_S1_L001_R1_001.fastq.gz", "sample_S1_L001_R2_001.fastq.gz", multiple_sep: ";"
--gex_reference Genome refence index built by Cell Ranger mkref. file, required, example: "reference_genome.tar.gz"
--vdj_reference VDJ refence index built by Cell Ranger mkref. file, example: "reference_vdj.tar.gz"
--feature_reference Path to the Feature reference CSV file, declaring Feature Barcode constructs and associated barcodes. Required only for Antibody Capture or CRISPR Guide Capture libraries. See https://support.10xgenomics.com/single-cell-gene-expression/software/pipelines/latest/using/feature-bc-analysis#feature-ref for more information. file, example: "feature_reference.csv"
--vdj_inner_enrichment_primers V(D)J Immune Profiling libraries: if inner enrichment primers other than those provided in the 10x Genomics kits are used, they need to be specified here as a text file with one primer per line. file, example: "enrichment_primers.txt"

Feature type-specific input files

Helper functionality to allow feature type-specific input files, without the need to specify library_type or library_id. The library_id will be inferred from the input paths.

Name Description Attributes
--gex_input The FASTQ files to be analyzed for Gene Expression. FASTQ files should conform to the naming conventions of bcl2fastq and mkfastq: [Sample Name]_S[Sample Index]_L00[Lane Number]_[Read Type]_001.fastq.gz List of file, example: "mysample_S1_L001_R1_001.fastq.gz", "mysample_S1_L001_R2_001.fastq.gz", multiple_sep: ";"
--abc_input The FASTQ files to be analyzed for Antibody Capture. FASTQ files should conform to the naming conventions of bcl2fastq and mkfastq: [Sample Name]_S[Sample Index]_L00[Lane Number]_[Read Type]_001.fastq.gz List of file, example: "mysample_S1_L001_R1_001.fastq.gz", "mysample_S1_L001_R2_001.fastq.gz", multiple_sep: ";"
--cgc_input The FASTQ files to be analyzed for CRISPR Guide Capture. FASTQ files should conform to the naming conventions of bcl2fastq and mkfastq: [Sample Name]_S[Sample Index]_L00[Lane Number]_[Read Type]_001.fastq.gz List of file, example: "mysample_S1_L001_R1_001.fastq.gz", "mysample_S1_L001_R2_001.fastq.gz", multiple_sep: ";"
--mux_input The FASTQ files to be analyzed for Multiplexing Capture. FASTQ files should conform to the naming conventions of bcl2fastq and mkfastq: [Sample Name]_S[Sample Index]_L00[Lane Number]_[Read Type]_001.fastq.gz List of file, example: "mysample_S1_L001_R1_001.fastq.gz", "mysample_S1_L001_R2_001.fastq.gz", multiple_sep: ";"
--vdj_input The FASTQ files to be analyzed for VDJ. FASTQ files should conform to the naming conventions of bcl2fastq and mkfastq: [Sample Name]_S[Sample Index]_L00[Lane Number]_[Read Type]_001.fastq.gz List of file, example: "mysample_S1_L001_R1_001.fastq.gz", "mysample_S1_L001_R2_001.fastq.gz", multiple_sep: ";"
--vdj_t_input The FASTQ files to be analyzed for VDJ-T. FASTQ files should conform to the naming conventions of bcl2fastq and mkfastq: [Sample Name]_S[Sample Index]_L00[Lane Number]_[Read Type]_001.fastq.gz List of file, example: "mysample_S1_L001_R1_001.fastq.gz", "mysample_S1_L001_R2_001.fastq.gz", multiple_sep: ";"
--vdj_t_gd_input The FASTQ files to be analyzed for VDJ-T-GD. FASTQ files should conform to the naming conventions of bcl2fastq and mkfastq: [Sample Name]_S[Sample Index]_L00[Lane Number]_[Read Type]_001.fastq.gz List of file, example: "mysample_S1_L001_R1_001.fastq.gz", "mysample_S1_L001_R2_001.fastq.gz", multiple_sep: ";"
--vdj_b_input The FASTQ files to be analyzed for VDJ-B. FASTQ files should conform to the naming conventions of bcl2fastq and mkfastq: [Sample Name]_S[Sample Index]_L00[Lane Number]_[Read Type]_001.fastq.gz List of file, example: "mysample_S1_L001_R1_001.fastq.gz", "mysample_S1_L001_R2_001.fastq.gz", multiple_sep: ";"
--agc_input The FASTQ files to be analyzed for Antigen Capture. FASTQ files should conform to the naming conventions of bcl2fastq and mkfastq: [Sample Name]_S[Sample Index]_L00[Lane Number]_[Read Type]_001.fastq.gz List of file, example: "mysample_S1_L001_R1_001.fastq.gz", "mysample_S1_L001_R2_001.fastq.gz", multiple_sep: ";"

Outputs

Name Description Attributes
--output_raw The raw output folder. file, required, example: "output_dir"
--output_h5mu The converted h5mu file. file, required, example: "output.h5mu"
--uns_metrics Name of the .uns slot under which to QC metrics (if any). string, default: "metrics_cellranger"

Cell multiplexing parameters

Arguments related to cell multiplexing.

Name Description Attributes
--cell_multiplex_sample_id A name to identify a multiplexed sample. Must be alphanumeric with hyphens and/or underscores, and less than 64 characters. Required for Cell Multiplexing libraries. string
--cell_multiplex_oligo_ids The Cell Multiplexing oligo IDs used to multiplex this sample. If multiple CMOs were used for a sample, separate IDs with a pipe (e.g., CMO301|CMO302). Required for Cell Multiplexing libraries. string
--cell_multiplex_description A description for the sample. string

Gene expression arguments

Arguments relevant to the analysis of gene expression data.

Name Description Attributes
--gex_expect_cells Expected number of recovered cells, used as input to cell calling algorithm. integer, example: 3000
--gex_chemistry Assay configuration. - auto: autodetect mode - threeprime: Single Cell 3’ - fiveprime: Single Cell 5’ - SC3Pv1: Single Cell 3’ v1 - SC3Pv2: Single Cell 3’ v2 - SC3Pv3: Single Cell 3’ v3 - SC3Pv3LT: Single Cell 3’ v3 LT - SC3Pv3HT: Single Cell 3’ v3 HT - SC5P-PE: Single Cell 5’ paired-end - SC5P-R2: Single Cell 5’ R2-only - SC-FB: Single Cell Antibody-only 3’ v2 or 5’ See https://kb.10xgenomics.com/hc/en-us/articles/115003764132-How-does-Cell-Ranger-auto-detect-chemistry- for more information. string, default: "auto"
--gex_secondary_analysis Whether or not to run the secondary analysis e.g. clustering. boolean, default: FALSE
--gex_generate_bam Whether to generate a BAM file. boolean, default: TRUE
--gex_include_introns Include intronic reads in count (default=true unless –target-panel is specified in which case default=false) boolean, default: TRUE

Library arguments

Name Description Attributes
--library_id The Illumina sample name to analyze. This must exactly match the ‘Sample Name’ part of the FASTQ files specified in the --input argument. List of string, example: "mysample1", multiple_sep: ";"
--library_type The underlying feature type of the library. Possible values: “Gene Expression”, “VDJ”, “VDJ-T”, “VDJ-B”, “Antibody Capture”, “CRISPR Guide Capture”, “Multiplexing Capture” List of string, example: "Gene Expression", multiple_sep: ";"
--library_subsample Optional. The rate at which reads from the provided FASTQ files are sampled. Must be strictly greater than 0 and less than or equal to 1. List of string, example: "0.5", multiple_sep: ";"
--library_lanes Lanes associated with this sample. Defaults to using all lanes. List of string, example: "1-4", multiple_sep: ";"

Authors

  • Dries Schaumont (author)

Visualisation

flowchart TB
    v0(Channel.fromList)
    v17(cellranger_multi_component)
    v37(from_cellranger_multi_to_h5mu)
    v64(Output)
    v0-->v17
    v17-->v37
    v37-->v64
    style v0 fill:#e3dcea,stroke:#7a4baa;
    style v17 fill:#e3dcea,stroke:#7a4baa;
    style v37 fill:#e3dcea,stroke:#7a4baa;
    style v64 fill:#e3dcea,stroke:#7a4baa;