Calculate atac qc metrics

Add basic ATAC quality control metrics to an .h5mu file.

Info

ID: calculate_atac_qc_metrics
Namespace: qc

The metrics are comparable to what scanpy.pp.calculate_qc_metrics output, although they have slightly different names:

Obs metrics (name in this component -> name in scanpy): - n_features_per_cell -> n_genes_by_counts - total_fragment_counts -> total_counts

Example commands

You can run the pipeline using nextflow run.

View help

You can use --help as a parameter to get an overview of the possible parameters.

nextflow run openpipelines-bio/openpipeline \
  -r 2.1.0 -latest \
  -main-script target/nextflow/qc/calculate_atac_qc_metrics/main.nf \
  --help

Run command

Example of params.yaml
# Inputs
input: # please fill in - example: "input.h5mu"
# fragments_path: "fragments.tsv.gz"
modality: "atac"
# layer: "raw_counts"
n_fragments_for_nucleosome_signal: 100000
nuc_signal_threshold: 2.0
n_tss: 3000
tss_threshold: 1.5

# Outputs
# output: "$id.$key.output.h5mu"
# output_compression: "gzip"

# Nextflow input-output arguments
publish_dir: # please fill in - example: "output/"
# param_list: "my_params.yaml"

# Arguments
nextflow run openpipelines-bio/openpipeline \
  -r 2.1.0 -latest \
  -profile docker \
  -main-script target/nextflow/qc/calculate_atac_qc_metrics/main.nf \
  -params-file params.yaml
Note

Replace -profile docker with -profile podman or -profile singularity depending on the desired backend.

Argument groups

Inputs

Name Description Attributes
--input Input h5mu file file, required, example: "input.h5mu"
--fragments_path Path to the fragments file. If not provided and not present in the input h5mu file, the nucleosome signal and TSS enrichment score will not be calculated. file, example: "fragments.tsv.gz"
--modality string, default: "atac"
--layer Layer at .layers to use for the calculation. If not provided, .X is used. string, example: "raw_counts"
--n_fragments_for_nucleosome_signal Number of fragments to use per cell for nucleosome signal calculation. Takes very long to calculate, for a test run lower value (e.g. 10e3) is recommended. See https://www.sc-best-practices.org/chromatin_accessibility/quality_control.html#nucleosome-signal for more information integer, default: 100000
--nuc_signal_threshold Threshold for nucleosome signal. Cells with nucleosome signal above this threshold will be marked as low quality (“NS_FAIL”), otherwise they will be marked “NS_PASS”. double, default: 2
--n_tss Number of the transcription start sites to calculate TSS enrichment score. See https://www.sc-best-practices.org/chromatin_accessibility/quality_control.html#tss-enrichment for more information integer, default: 3000
--tss_threshold Threshold for TSS enrichment score. Cells with TSS enrichment score below this threshold will be marked as low quality (“TSS_FAIL”) otherwise they will be marked as “TSS_PASS”. double, default: 1.5

Outputs

Name Description Attributes
--output Output h5mu file. file, example: "output.h5mu"
--output_compression The compression format to be used on the output h5mu object. string, example: "gzip"

Authors

  • Vladimir Shitov (author)