Tfidf
    Perform TF-IDF normalization of the data (typically, ATAC).
  
Info
ID: tfidf
Namespace: transform
Links
TF-IDF stands for “term frequency - inverse document frequency”. It is a technique from natural language processing analysis. In the context of ATAC data, “terms” are the features (genes) and “documents” are the observations (cells). TF-IDF normalization is applied to single-cell ATAC-seq data to highlight the importance of specific genomic regions (typically peaks) across different cells while down-weighting regions that are commonly accessible across many cells.
Example commands
You can run the pipeline using nextflow run.
View help
You can use --help as a parameter to get an overview of the possible parameters.
nextflow run openpipelines-bio/openpipeline \
  -r 2.1.1 -latest \
  -main-script target/nextflow/transform/tfidf/main.nf \
  --helpRun command
Example of params.yaml
# Arguments
input: # please fill in - example: "input.h5mu"
modality: "atac"
# input_layer: "foo"
# output: "$id.$key.output"
# output_compression: "gzip"
output_layer: "tfidf"
scale_factor: 10000
log_idf: true
log_tf: true
log_tfidf: false
# Nextflow input-output arguments
publish_dir: # please fill in - example: "output/"
# param_list: "my_params.yaml"nextflow run openpipelines-bio/openpipeline \
  -r 2.1.1 -latest \
  -profile docker \
  -main-script target/nextflow/transform/tfidf/main.nf \
  -params-file params.yaml
Note
Replace -profile docker with -profile podman or -profile singularity depending on the desired backend.
Argument group
Arguments
| Name | Description | Attributes | 
|---|---|---|
| --input | Input h5mu file | file, required, example:"input.h5mu" | 
| --modality | string, default:"atac" | |
| --input_layer | Input layer to use. By default, X is normalized | string | 
| --output | Output h5mu file. | file, required | 
| --output_compression | The compression format to be used on the output h5mu object. | string, example:"gzip" | 
| --output_layer | Output layer to use. | string, default:"tfidf" | 
| --scale_factor | Scale factor to multiply the TF-IDF matrix by. | integer, default:10000 | 
| --log_idf | Whether to log-transform IDF term. | boolean, default:TRUE | 
| --log_tf | Whether to log-transform TF term. | boolean, default:TRUE | 
| --log_tfidf | Whether to log-transform TF*IDF term (False by default). Can only be used when log_tf and log_idf are False. | boolean, default:FALSE |