Demuxlet

Demuxlet is a software tool to deconvolute sample identity and identify multiplets when multiple samples are pooled by barcoded single cell sequencing.

Info

ID: demuxlet
Namespace: genetic_demux

If external genotyping data for each sample is available (e.g. from SNP arrays), demuxlet is the recommended tool. Be aware that the parameters documented on GitHub do not match those shown in the newest help output.

Example commands

You can run the pipeline using nextflow run.

View help

You can use --help as a parameter to get an overview of the possible parameters.

nextflow run openpipelines-bio/openpipeline \
  -r 1.0.2 -latest \
  -main-script target/nextflow/genetic_demux/demuxlet/main.nf \
  --help

Run command

Example of params.yaml
# Input
# sam: "path/to/file"
tag_group: "CB"
tag_umi: "UB"
# plp: "foo"
# vcf: "path/to/file"
field: "GT"
geno_error_offset: 0.1
geno_error_coeff: 0.0
r2_info: "R2"
min_mac: 1
min_call_rate: 0.5
alpha: "0.5"
doublet_prior: 0.5
# sm: "foo"
# sm_list: "foo"
sam_verbose: 1000000
vcf_verbose: 1000
cap_bq: 20
min_bq: 13
min_mq: 20
min_td: 0
excl_flag: 3844
# group_list: "foo"
min_total: 0
min_snp: 0
min_umi: 0

# Output
# output: "$id.$key.output.output"
# out: "demuxlet"

# Nextflow input-output arguments
publish_dir: # please fill in - example: "output/"
# param_list: "my_params.yaml"

nextflow run openpipelines-bio/openpipeline \
  -r 1.0.2 -latest \
  -profile docker \
  -main-script target/nextflow/genetic_demux/demuxlet/main.nf \
  -params-file params.yaml

Note

Replace -profile docker with -profile podman or -profile singularity depending on the desired backend.
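
As a rough illustration of the two steps above (writing params.yaml, then launching the pipeline), the following Python sketch writes a minimal parameter file and assembles the same nextflow command as an argument list. The input paths are hypothetical placeholders, the helper names to_yaml and build_command are invented for this example, and only a handful of the parameters are set; the rest are assumed to fall back to their defaults.

```python
from pathlib import Path

# Hypothetical inputs: replace the paths with your own files.
params = {
    "sam": "path/to/pooled.bam",
    "vcf": "path/to/genotypes.vcf.gz",
    "field": "GT",
    "out": "demuxlet",
    "publish_dir": "output/",
}


def to_yaml(mapping):
    """Render a flat mapping of strings and numbers as YAML `key: value` lines."""
    lines = []
    for key, value in mapping.items():
        if isinstance(value, str):
            # Quote strings and escape backslashes and double quotes.
            value = '"' + value.replace("\\", "\\\\").replace('"', '\\"') + '"'
        lines.append(f"{key}: {value}")
    return "\n".join(lines) + "\n"


def build_command(params_file, profile="docker"):
    """Assemble the nextflow invocation shown above as an argument list."""
    return [
        "nextflow", "run", "openpipelines-bio/openpipeline",
        "-r", "1.0.2", "-latest",
        "-profile", profile,
        "-main-script", "target/nextflow/genetic_demux/demuxlet/main.nf",
        "-params-file", str(params_file),
    ]


if __name__ == "__main__":
    params_file = Path("params.yaml")
    params_file.write_text(to_yaml(params))
    # Print the command instead of running it, so it can be inspected first.
    print(" ".join(build_command(params_file, profile="docker")))
```

To actually launch the run, the list returned by build_command can be passed to subprocess.run; pass profile="podman" or profile="singularity" to switch backends.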

Argument groups

Input

Name Description Attributes
--sam Input SAM/BAM/CRAM file. Must be sorted by coordinates and indexed. file
--tag_group Tag representing read groups or cell barcodes, used to partition the BAM file into multiple groups. For 10x Genomics, use CB. string, default: "CB"
--tag_umi Tag representing UMIs. For 10x Genomics, use UB. string, default: "UB"
--plp Input pileup format. If the value is a string, it is treated as the path to the plp file. If the value is the boolean true, dscpileup is performed. string
--vcf Input VCF/BCF file, containing the individual genotypes (GT), posterior probability (GP), or genotype likelihood (PL). file
--field FORMAT field to extract the genotype, likelihood, or posterior from. string, default: "GT"
--geno_error_offset Offset of genotype error rate. [error] = [offset] + [1-offset]*[coeff]*[1-r2] double, default: 0.1
--geno_error_coeff Slope of genotype error rate. [error] = [offset] + [1-offset]*[coeff]*[1-r2] double, default: 0
--r2_info INFO field name representing R2 value. Used for representing imputation quality. string, default: "R2"
--min_mac Minimum minor allele count (MAC). integer, default: 1
--min_call_rate Minimum call rate. double, default: 0.5
--alpha Grid of alpha to search for (default is 0.1, 0.2, 0.3, 0.4, 0.5). string, default: "0.5"
--doublet_prior Prior probability of a doublet. double, default: 0.5
--sm List of sample IDs to compare to (default: use all). string
--sm_list File containing the list of sample IDs to compare. string
--sam_verbose Verbose message frequency for SAM/BAM/CRAM. integer, default: 1000000
--vcf_verbose Verbose message frequency for VCF/BCF. integer, default: 1000
--cap_bq Maximum base quality (higher BQ will be capped). integer, default: 20
--min_bq Minimum base quality to consider (lower BQ will be skipped). integer, default: 13
--min_mq Minimum mapping quality to consider (lower MQ will be ignored). integer, default: 20
--min_td Minimum distance to the tail (lower will be ignored). integer, default: 0
--excl_flag SAM/BAM FLAGs to be excluded. integer, default: 3844
--group_list List of read group tags or cell barcodes to consider in this run. All other barcodes will be ignored. This is useful for parallelized runs. string
--min_total Minimum number of total reads for a droplet/cell to be considered. integer, default: 0
--min_snp Minimum number of SNPs with coverage for a droplet/cell to be considered. integer, default: 0
--min_umi Minimum number of UMIs for a droplet/cell to be considered. integer, default: 0
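
Two of the arguments above involve a little arithmetic that is easier to see worked out. The sketch below is illustrative only and is not taken from the demuxlet source. It assumes the bracketed terms in the genotype error formula are multiplied together, and it decodes --excl_flag using the standard FLAG bits from the SAM specification. The function names are invented for this example.

```python
# FLAG bits and their meanings, as defined in the SAM specification.
SAM_FLAG_BITS = {
    0x1: "read paired",
    0x2: "read mapped in proper pair",
    0x4: "read unmapped",
    0x8: "mate unmapped",
    0x10: "read reverse strand",
    0x20: "mate reverse strand",
    0x40: "first in pair",
    0x80: "second in pair",
    0x100: "secondary alignment",
    0x200: "failed quality checks",
    0x400: "PCR or optical duplicate",
    0x800: "supplementary alignment",
}


def decode_sam_flag(flag):
    """Return the names of all FLAG bits set in `flag`, lowest bit first."""
    return [name for bit, name in SAM_FLAG_BITS.items() if flag & bit]


def genotype_error(offset, coeff, r2):
    """Assumed reading of the formula: offset + (1 - offset) * coeff * (1 - r2)."""
    return offset + (1.0 - offset) * coeff * (1.0 - r2)


if __name__ == "__main__":
    # Default --excl_flag value.
    for name in decode_sam_flag(3844):
        print("excluded:", name)
    # Defaults: offset 0.1 and coeff 0.0, evaluated at an arbitrary R2 of 0.8.
    print("error rate:", genotype_error(0.1, 0.0, 0.8))
```

With the default coefficient of 0, the R2 term drops out and the error rate equals the offset, so imputation quality only affects the result once --geno_error_coeff is raised above zero. The default --excl_flag of 3844 is the sum 4 + 256 + 512 + 1024 + 2048, which excludes unmapped, secondary, QC-failed, duplicate, and supplementary reads.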

Output

Name Description Attributes
--output Output directory. file, example: "demux"
--out Demuxlet output file prefix. string, example: "demuxlet"

Authors

  • Xichen Wu (author)