Popv

Performs popular major vote cell typing on single cell sequence data using multiple algorithms.

Info

ID: popv
Namespace: annotate

Links

Source

Note that this is a one-shot version of PopV.

Example commands

You can run the pipeline using nextflow run.

View help

You can use --help as a parameter to get an overview of the possible parameters.

nextflow run openpipelines-bio/openpipeline \
  -r 2.1.1 -latest \
  -main-script target/nextflow/annotate/popv/main.nf \
  --help

Run command

Example of params.yaml

# Inputs
input: # please fill in - example: "input.h5mu"
modality: "rna"
# input_layer: "foo"
# input_obs_batch: "foo"
# input_var_subset: "foo"
# input_obs_label: "foo"
unknown_celltype_label: "unknown"

# Reference
reference: # please fill in - example: "TS_Bladder_filtered.h5ad"
# reference_layer: "foo"
reference_obs_label: "cell_ontology_class"
reference_obs_batch: "donor_assay"

# Outputs
# output: "$id.$key.output.h5mu"
# output_compression: "gzip"

# Arguments
methods: # please fill in - example: ["knn_on_scvi", "scanvi"]

# Nextflow input-output arguments
publish_dir: # please fill in - example: "output/"
# param_list: "my_params.yaml"

nextflow run openpipelines-bio/openpipeline \
  -r 2.1.1 -latest \
  -profile docker \
  -main-script target/nextflow/annotate/popv/main.nf \
  -params-file params.yaml

Note

Replace -profile docker with -profile podman or -profile singularity depending on the desired backend.

Argument groups

Inputs

Arguments related to the input (aka query) dataset.

Name	Description	Attributes
`--input`	Input h5mu file.	`file`, required, example: `"input.h5mu"`
`--modality`	Which modality to process.	`string`, default: `"rna"`
`--input_layer`	Which layer to use. If no value is provided, the counts are assumed to be in the `.X` slot. Otherwise, count data is expected to be in `.layers[input_layer]`.	`string`
`--input_obs_batch`	Key in obs field of input adata for batch information. If no value is provided, batch label is assumed to be unknown.	`string`
`--input_var_subset`	Subset the input object with this column.	`string`
`--input_obs_label`	Key in obs field of input adata for label information. This is only used for training scANVI. Unlabelled cells should be set to `"unknown_celltype_label"`.	`string`
`--unknown_celltype_label`	If `input_obs_label` is specified, cells with this value will be treated as unknown and will be predicted by the model.	`string`, default: `"unknown"`

Reference

Arguments related to the reference dataset.

Name	Description	Attributes
`--reference`	User-provided reference tissue. The data that will be used as reference to call cell types.	`file`, required, example: `"TS_Bladder_filtered.h5ad"`
`--reference_layer`	Which layer to use. If no value is provided, the counts are assumed to be in the `.X` slot. Otherwise, count data is expected to be in `.layers[reference_layer]`.	`string`
`--reference_obs_label`	Key in obs field of reference AnnData with cell-type information.	`string`, default: `"cell_ontology_class"`
`--reference_obs_batch`	Key in obs field of input adata for batch information.	`string`, default: `"donor_assay"`

Outputs

Output arguments.

Name	Description	Attributes
`--output`	Output h5mu file.	`file`, required, example: `"output.h5mu"`
`--output_compression`		`string`, example: `"gzip"`

Arguments

Other arguments.

Name	Description	Attributes
`--methods`	Methods to call cell types. By default, runs to knn_on_scvi and scanvi.	List of `string`, required, example: `"knn_on_scvi", "scanvi"`, multiple_sep: `";"`

Authors

Matthias Beyens (author)
Robrecht Cannoodt (author)