Build cellranger arc reference
Build a Cell Ranger-arc and -atac compatible reference folder from user-supplied genome FASTA and gene GTF files.
Info
ID: build_cellranger_arc_reference
Namespace: reference
Links
Creates a new folder named after the genome.
Example commands
You can run the pipeline using nextflow run.
View help
You can use --help as a parameter to get an overview of the possible parameters.
nextflow run openpipelines-bio/openpipeline \
-r 2.1.1 -latest \
-main-script target/nextflow/reference/build_cellranger_arc_reference/main.nf \
--helpRun command
Example of params.yaml
# Arguments
genome_fasta: # please fill in - example: "genome_sequence.fa.gz"
annotation_gtf: # please fill in - example: "annotation.gtf.gz"
# motifs_file: "JASPAR2024_CORE_non-redundant_pfms_jaspar.txt.modified"
non_nuclear_contigs: ["chrM"]
# output: "$id.$key.output"
genome: # please fill in - example: "output"
# organism: "foo"
# subset_regex: "(ERCC-00002|chr1)"
# Nextflow input-output arguments
publish_dir: # please fill in - example: "output/"
# param_list: "my_params.yaml"nextflow run openpipelines-bio/openpipeline \
-r 2.1.1 -latest \
-profile docker \
-main-script target/nextflow/reference/build_cellranger_arc_reference/main.nf \
-params-file params.yaml
Note
Replace -profile docker with -profile podman or -profile singularity depending on the desired backend.
Argument group
Arguments
| Name | Description | Attributes |
|---|---|---|
--genome_fasta |
Reference genome fasta. | file, required, example: "genome_sequence.fa.gz" |
--annotation_gtf |
Reference annotation. | file, required, example: "annotation.gtf.gz" |
--motifs_file |
Transcription factor motifs in JASPAR format. See https://support.10xgenomics.com/single-cell-multiome-atac-gex/software/pipelines/latest/advanced/references | file, example: "JASPAR2024_CORE_non-redundant_pfms_jaspar.txt.modified" |
--non_nuclear_contigs |
Name(s) of contig(s) that do not have any chromatin structure, for example, mitochondria or plastids. These contigs are excluded from peak calling since the entire contig will be “open” due to a lack of chromatin structure. Leave empty if there are no such contigs. | List of string, default: "chrM", example: "chrM", multiple_sep: ";" |
--output |
Output folder | file, required, example: "cellranger_reference" |
--genome |
Name of the genome. This will be the name of the intermediate output folder | string, required, default: "output", example: "GRCh38" |
--organism |
Name of the organism. This is displayed in the web summary but is otherwise not used in the analysis. | string |
--subset_regex |
Will subset the reference chromosomes using the given regex. | string, example: "(ERCC-00002|chr1)" |