Build cellranger arc reference

Build a reference folder compatible with Cell Ranger ARC and Cell Ranger ATAC from a user-supplied genome FASTA file and gene annotation GTF file.

Info

ID: build_cellranger_arc_reference
Namespace: reference

Creates a new folder named after the genome.

Example commands

You can run the pipeline using nextflow run.

View help

You can use --help as a parameter to get an overview of the possible parameters.

nextflow run openpipelines-bio/openpipeline \
  -r 2.1.0 -latest \
  -main-script target/nextflow/reference/build_cellranger_arc_reference/main.nf \
  --help

Run command

Example of params.yaml

# Arguments
genome_fasta: # please fill in - example: "genome_sequence.fa.gz"
annotation_gtf: # please fill in - example: "annotation.gtf.gz"
# motifs_file: "JASPAR2024_CORE_non-redundant_pfms_jaspar.txt.modified"
non_nuclear_contigs: ["chrM"]
# output: "$id.$key.output"
genome: # please fill in - example: "output"
# organism: "foo"
# subset_regex: "(ERCC-00002|chr1)"

# Nextflow input-output arguments
publish_dir: # please fill in - example: "output/"
# param_list: "my_params.yaml"

nextflow run openpipelines-bio/openpipeline \
  -r 2.1.0 -latest \
  -profile docker \
  -main-script target/nextflow/reference/build_cellranger_arc_reference/main.nf \
  -params-file params.yaml

Note

Replace -profile docker with -profile podman or -profile singularity depending on the desired backend.
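
As a rough illustration of the "please fill in" fields, the sketch below writes a filled-in params.yaml and checks that the input files it names exist before launching. The values are the placeholder examples from the argument list on this page (for instance "genome_sequence.fa.gz" and "GRCh38"), not real data, and the existence check is an optional convenience rather than part of the pipeline.

```shell
# Write a filled-in params.yaml. Every value is a placeholder example
# taken from this page and should be replaced with real paths and names.
cat > params.yaml <<'EOF'
genome_fasta: "genome_sequence.fa.gz"
annotation_gtf: "annotation.gtf.gz"
non_nuclear_contigs: ["chrM"]
genome: "GRCh38"
publish_dir: "output/"
EOF

# Optional sanity check (not part of the pipeline): warn about input
# files listed in params.yaml that do not exist yet.
for key in genome_fasta annotation_gtf; do
  path=$(sed -n "s/^${key}: *\"\(.*\)\"$/\1/p" params.yaml)
  [ -e "$path" ] || echo "missing input for ${key}: ${path}"
done
```

Once the warnings are resolved, the file can be passed to the run command with -params-file params.yaml.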

Argument group

Arguments

--genome_fasta
  Description: Reference genome fasta.
  Attributes: file, required, example: "genome_sequence.fa.gz"

--annotation_gtf
  Description: Reference annotation.
  Attributes: file, required, example: "annotation.gtf.gz"

--motifs_file
  Description: Transcription factor motifs in JASPAR format. See https://support.10xgenomics.com/single-cell-multiome-atac-gex/software/pipelines/latest/advanced/references
  Attributes: file, example: "JASPAR2024_CORE_non-redundant_pfms_jaspar.txt.modified"

--non_nuclear_contigs
  Description: Name(s) of contig(s) that do not have any chromatin structure, for example, mitochondria or plastids. These contigs are excluded from peak calling since the entire contig will be “open” due to a lack of chromatin structure. Leave empty if there are no such contigs.
  Attributes: List of string, default: "chrM", example: "chrM", multiple_sep: ";"

--output
  Description: Output folder
  Attributes: file, required, example: "cellranger_reference"

--genome
  Description: Name of the genome. This will be the name of the intermediate output folder
  Attributes: string, required, default: "output", example: "GRCh38"

--organism
  Description: Name of the organism. This is displayed in the web summary but is otherwise not used in the analysis.
  Attributes: string

--subset_regex
  Description: Will subset the reference chromosomes using the given regex.
  Attributes: string, example: "(ERCC-00002|chr1)"
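
To preview which contigs a --subset_regex value might keep, it can help to try the pattern on a list of contig names first. The sketch below uses grep -E as a stand-in matcher. This page does not say whether the pipeline anchors the pattern to the whole contig name, so both interpretations are shown, and the contig list is hypothetical.

```shell
# Hypothetical contig names standing in for the FASTA headers.
contigs="chr1
chr10
chr2
chrM
ERCC-00002"

# Example pattern from the --subset_regex documentation.
subset_regex='(ERCC-00002|chr1)'

# Anchored: the whole contig name must match the pattern.
full_match=$(printf '%s\n' "$contigs" | grep -E -x "$subset_regex")
# Unanchored: the pattern may match anywhere in the contig name.
partial_match=$(printf '%s\n' "$contigs" | grep -E "$subset_regex")

echo "anchored:   $(echo $full_match)"     # chr1 ERCC-00002
echo "unanchored: $(echo $partial_match)"  # chr1 chr10 ERCC-00002
```

If the two lists differ for a given genome, as they do here for chr10, writing the pattern with explicit anchors, for example "^(ERCC-00002|chr1)$", removes the ambiguity under either interpretation.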

Authors

  • Vladimir Shitov (author)