RNA singlesample

Processing unimodal single-sample RNA transcriptomics data.

Info

ID: rna_singlesample
Namespace: multiomics

Example commands

You can run the pipeline using `nextflow run`.

View help

Pass `--help` as a parameter to get an overview of the available parameters.

nextflow run openpipelines-bio/openpipeline \
  -r 0.10.0 -latest \
  -main-script ./workflows/multiomics/rna_singlesample/main.nf \
  --help

Run command

Example of params.yaml

# Input
id: # please fill in - example: "foo"
input: # please fill in - example: "dataset.h5mu"

# Output
# output: "$id.$key.output.h5mu"

# Filtering options
# min_counts: 200
# max_counts: 5000000
# min_genes_per_cell: 200
# max_genes_per_cell: 1500000
# min_cells_per_gene: 3
# min_fraction_mito: 0
# max_fraction_mito: 0.2

# Mitochondrial gene detection
# var_name_mitochondrial_genes: "foo"
# var_gene_names: "gene_symbol"
mitochondrial_gene_regex: "^[mM][tT]-"

# Nextflow input-output arguments
publish_dir: # please fill in - example: "output/"
# param_list: "my_params.yaml"

nextflow run openpipelines-bio/openpipeline \
  -r 0.10.0 -latest \
  -profile docker \
  -main-script ./workflows/multiomics/rna_singlesample/main.nf \
  -params-file params.yaml

Note

Replace `-profile docker` with `-profile podman` or `-profile singularity`, depending on the desired backend.

Argument groups

Input

Name	Description	Attributes
`--id`	ID of the sample.	`string`, required, example: `"foo"`
`--input`	Path to the sample.	`file`, required, example: `"dataset.h5mu"`

Output

Name	Description	Attributes
`--output`	Destination path to the output.	`file`, required, example: `"output.h5mu"`

Filtering options

Name	Description	Attributes
`--min_counts`	Minimum number of counts captured per cell.	`integer`, example: `200`
`--max_counts`	Maximum number of counts captured per cell.	`integer`, example: `5000000`
`--min_genes_per_cell`	Minimum number of non-zero values per cell.	`integer`, example: `200`
`--max_genes_per_cell`	Maximum number of non-zero values per cell.	`integer`, example: `1500000`
`--min_cells_per_gene`	Minimum number of non-zero values per gene.	`integer`, example: `3`
`--min_fraction_mito`	Minimum fraction of UMIs that are mitochondrial.	`double`, example: `0`
`--max_fraction_mito`	Maximum fraction of UMIs that are mitochondrial.	`double`, example: `0.2`
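
To make the thresholds above concrete, the following is a minimal pure-Python sketch of how such per-cell and per-gene filters could be evaluated on a small count matrix. It is not the pipeline's implementation: the helper names `keep_cell` and `keep_gene`, the toy data, and the choice of inclusive bounds are all assumptions made for illustration.

```python
def keep_cell(counts, mito_flags, *, min_counts, max_counts,
              min_genes_per_cell, max_genes_per_cell,
              min_fraction_mito, max_fraction_mito):
    """Decide whether one cell passes all per-cell thresholds.

    counts: counts per gene for this cell.
    mito_flags: one boolean per gene, True for mitochondrial genes.
    Bounds are treated as inclusive here, which is an assumption.
    """
    total = sum(counts)
    n_genes = sum(1 for c in counts if c > 0)  # non-zero values per cell
    mito = sum(c for c, is_mito in zip(counts, mito_flags) if is_mito)
    fraction_mito = mito / total if total else 0.0
    return (min_counts <= total <= max_counts
            and min_genes_per_cell <= n_genes <= max_genes_per_cell
            and min_fraction_mito <= fraction_mito <= max_fraction_mito)


def keep_gene(gene_counts_across_cells, *, min_cells_per_gene):
    """Decide whether one gene is detected in enough cells."""
    n_cells = sum(1 for c in gene_counts_across_cells if c > 0)
    return n_cells >= min_cells_per_gene


# Toy data: 3 cells x 4 genes; the last gene is flagged as mitochondrial.
cells = [[5, 0, 3, 2], [1, 0, 0, 0], [4, 6, 0, 9]]
mito_flags = [False, False, False, True]

# Toy thresholds, scaled down to fit the toy data.
toy = dict(min_counts=5, max_counts=100,
           min_genes_per_cell=2, max_genes_per_cell=10,
           min_fraction_mito=0.0, max_fraction_mito=0.4)

kept_cells = [keep_cell(c, mito_flags, **toy) for c in cells]
print(kept_cells)
```

In this toy example the second cell fails on total counts and the third on its mitochondrial fraction (9 of 19 counts), so only the first cell is kept.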

Mitochondrial gene detection

Name	Description	Attributes
`--var_name_mitochondrial_genes`	The .var slot in which to store a boolean array marking the mitochondrial genes.	`string`
`--var_gene_names`	.var column to use for detecting mitochondrial genes instead of .var_names (the default if not set). Gene names matching the regex from `--mitochondrial_gene_regex` are identified as mitochondrial genes.	`string`, example: `"gene_symbol"`
`--mitochondrial_gene_regex`	Regex string that identifies mitochondrial genes from `--var_gene_names`. By default, it detects human and mouse mitochondrial genes from a gene symbol.	`string`, default: `"^[mM][tT]-"`
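
As a quick check of what the default regex matches, here is a small sketch using Python's standard `re` module. The gene symbols and the helper `flag_mitochondrial` are hypothetical and chosen only for illustration; how the pipeline applies the pattern internally is not shown here.

```python
import re

# Default value of --mitochondrial_gene_regex.
MITO_REGEX = re.compile(r"^[mM][tT]-")


def flag_mitochondrial(gene_names):
    """Return one boolean per gene name, True where the name matches the regex."""
    return [bool(MITO_REGEX.match(name)) for name in gene_names]


# Hypothetical gene symbols for illustration.
genes = ["MT-CO1", "mt-Nd1", "ACTB", "MTOR", "Mt-Cyb"]
flags = flag_mitochondrial(genes)
print(dict(zip(genes, flags)))
```

The pattern requires the prefix `MT-` in any letter case at the start of the name, so `MTOR` is not flagged because it has no hyphen after `MT`.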

Authors

Dries De Maeyer (author)
Robrecht Cannoodt (author, maintainer)
Dries Schaumont (author)

Visualisation

flowchart LR
    p0(Input)
    p3(toSortedList)
    p5(flatMap)
    p12(filter_with_counts)
    p14(join)
    p22(do_filter)
    p24(join)
    p32(filter_with_scrublet)
    p34(join)
    p42(Output)
    p0-->p3
    p3-->p5
    p5-->p14
    p5-->p12
    p12-->p14
    p14-->p24
    p14-->p22
    p22-->p24
    p24-->p34
    p24-->p32
    p32-->p34
    p34-->p42