BD Rhapsody
A generic pipeline for running BD Rhapsody WTA or Targeted mapping, with support for AbSeq, VDJ and/or SMK.
Info
ID: bd_rhapsody
Namespace: workflows/ingestion
Links
A wrapper for the BD Rhapsody Analysis CWL v1.10.1 pipeline.
This pipeline can be used for a targeted analysis (with --mode targeted
) or for a whole transcriptome analysis (with --mode wta
).
- If mode is
"targeted"
, then either the--reference
or--abseq_reference
parameters must be defined. - If mode is
"wta"
, then--reference
and--transcriptome_annotation
must be defined,--abseq_reference
and--supplemental_reference
is optional.
The reference_genome and transcriptome_annotation files can be generated with the make_reference pipeline. Alternatively, BD also provides standard references which can be downloaded from these locations:
- Human: http://bd-rhapsody-public.s3-website-us-east-1.amazonaws.com/Rhapsody-WTA/GRCh38-PhiX-gencodev29/
- Mouse: http://bd-rhapsody-public.s3-website-us-east-1.amazonaws.com/Rhapsody-WTA/GRCm38-PhiX-gencodevM19/
Example commands
You can run the pipeline using nextflow run
.
View help
You can use --help
as a parameter to get an overview of the possible parameters.
nextflow run openpipelines-bio/openpipeline \
-r 1.0.1 -latest \
-main-script target/nextflow/workflows/ingestion/bd_rhapsody/main.nf \
--help
Run command
Example of params.yaml
# Inputs
mode: # please fill in - example: "wta"
id: # please fill in - example: "foo"
input: # please fill in - example: ["input.fastq.gz"]
reference: # please fill in - example: ["reference_genome.tar.gz|reference.fasta"]
# transcriptome_annotation: "transcriptome.gtf"
# abseq_reference: ["abseq_reference.fasta"]
# supplemental_reference: ["supplemental_reference.fasta"]
sample_prefix: "sample"
# Outputs
# output_raw: "$id.$key.output_raw.output_raw"
# output_h5mu: "$id.$key.output_h5mu.h5mu"
# Putative cell calling settings
# putative_cell_call: "mRNA"
# exact_cell_count: 10000
disable_putative_calling: false
# Subsample arguments
# subsample: 0.01
# subsample_seed: 3445
# Multiplex arguments
# sample_tags_version: "human"
# tag_names: ["4-mySample", "9-myOtherSample", "6-alsoThisSample"]
# VDJ arguments
# vdj_version: "human"
# CWL-runner arguments
parallel: true
timestamps: false
dryrun: false
# Nextflow input-output arguments
publish_dir: # please fill in - example: "output/"
# param_list: "my_params.yaml"
nextflow run openpipelines-bio/openpipeline \
-r 1.0.1 -latest \
-profile docker \
-main-script target/nextflow/workflows/ingestion/bd_rhapsody/main.nf \
-params-file params.yaml
Note
Replace -profile docker
with -profile podman
or -profile singularity
depending on the desired backend.
Argument groups
Inputs
Name | Description | Attributes |
---|---|---|
--mode |
Whether to run a whole transcriptome analysis (WTA) or a targeted analysis. | string , required, example: "wta" |
--id |
ID of the sample. | string , required, example: "foo" |
--input |
Path to your read files in the FASTQ.GZ format. You may specify as many R1/R2 read pairs as you want. | List of file , required, example: "input.fastq.gz" , multiple_sep: ";" |
--reference |
Refence to map to. For --mode wta , this is the path to STAR index as a tar.gz file. For --mode targeted , this is the path to mRNA reference file for pre-designed, supplemental, or custom panel, in FASTA format |
List of file , required, example: "reference_genome.tar.gz|reference.fasta" , multiple_sep: ";" |
--transcriptome_annotation |
Path to GTF annotation file (only for --mode wta ). |
file , example: "transcriptome.gtf" |
--abseq_reference |
Path to the AbSeq reference file in FASTA format. Only needed if BD AbSeq Ab-Oligos are used. | List of file , example: "abseq_reference.fasta" , multiple_sep: ";" |
--supplemental_reference |
Path to the supplemental reference file in FASTA format. Only needed if there are additional transgene sequences used in the experiment (only for --mode wta ). |
List of file , example: "supplemental_reference.fasta" , multiple_sep: ";" |
--sample_prefix |
Specify a run name to use as the output file base name. Use only letters, numbers, or hyphens. Do not use special characters or spaces. | string , default: "sample" |
Outputs
Name | Description | Attributes |
---|---|---|
--output_raw |
The BD Rhapsody output folder as it comes out of the BD Rhapsody pipeline | file , required, example: "output_dir" |
--output_h5mu |
The converted h5mu file. | file , required, example: "output.h5mu" |
Putative cell calling settings
Name | Description | Attributes |
---|---|---|
--putative_cell_call |
Specify the dataset to be used for putative cell calling. For putative cell calling using an AbSeq dataset, please provide an AbSeq_Reference fasta file above. | string , example: "mRNA" |
--exact_cell_count |
Exact cell count - Set a specific number (>=1) of cells as putative, based on those with the highest error-corrected read count | integer , example: 10000 |
--disable_putative_calling |
Disable Refined Putative Cell Calling - Determine putative cells using only the basic algorithm (minimum second derivative along the cumulative reads curve). The refined algorithm attempts to remove false positives and recover false negatives, but may not be ideal for certain complex mixtures of cell types. Does not apply if Exact Cell Count is set. | boolean_true |
Subsample arguments
Name | Description | Attributes |
---|---|---|
--subsample |
A number >1 or fraction (0 < n < 1) to indicate the number or percentage of reads to subsample. | double , example: 0.01 |
--subsample_seed |
A seed for replicating a previous subsampled run. | integer , example: 3445 |
Multiplex arguments
Name | Description | Attributes |
---|---|---|
--sample_tags_version |
Specify if multiplexed run. | string , example: "human" |
--tag_names |
Tag_Names (optional) - Specify the tag number followed by ‘-’ and the desired sample name to appear in Sample_Tag_Metrics.csv. Do not use the special characters: &, (), [], {}, <>, ?, | | List of string , example: "4-mySample", "9-myOtherSample", "6-alsoThisSample" , multiple_sep: ";" |
VDJ arguments
Name | Description | Attributes |
---|---|---|
--vdj_version |
Specify if VDJ run. | string , example: "human" |
CWL-runner arguments
Name | Description | Attributes |
---|---|---|
--parallel |
Run jobs in parallel. | boolean , default: TRUE |
--timestamps |
Add timestamps to the errors, warnings, and notifications. | boolean_true |
--dryrun |
If true, the output directory will only contain the CWL input files, but the pipeline itself will not be executed. | boolean_true |