Score genes cell cycle scanpy
Calculates the S phase and G2M phase scores and annotates each cell with its cell cycle phase, as implemented by scanpy.
Info
ID: score_genes_cell_cycle_scanpy
Namespace: feature_annotation
The score is the average expression of a set of genes minus the average expression of a reference set of genes.
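As a rough illustration of the score described above, the following minimal Python sketch computes, for a single cell, the mean expression of a gene set minus the mean expression of a reference set. The function name `simple_score` and the toy expression values are hypothetical and only serve the example; the scanpy implementation additionally samples the reference set from expression-level bins (see the n_bins argument), which this sketch leaves out.

```python
from statistics import mean


def simple_score(expression, gene_set, reference_set):
    """Mean expression of gene_set minus mean expression of reference_set.

    expression: dict mapping gene symbol -> expression value for one cell.
    """
    target = mean(expression[g] for g in gene_set)
    reference = mean(expression[g] for g in reference_set)
    return target - reference


# Toy values for a single cell (made up for illustration).
cell = {"geneA": 4.0, "geneB": 2.0, "geneC": 1.0, "geneD": 1.0}
score = simple_score(cell, ["geneA", "geneB"], ["geneC", "geneD"])
print(score)
```

A positive score means the gene set is expressed above the reference level in that cell; a negative score means it is expressed below it.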
Example commands
You can run the pipeline using nextflow run.
View help
You can use --help as a parameter to get an overview of the possible parameters.
nextflow run openpipelines-bio/openpipeline \
-r 2.1.0 -latest \
-main-script target/nextflow/feature_annotation/score_genes_cell_cycle_scanpy/main.nf \
--help
Run command
Example of params.yaml
# Inputs
input: # please fill in - example: "input_file.h5mu"
modality: "rna"
# input_layer: "log_normalized"
# var_gene_names: "gene_names"
# Gene list inputs
# s_genes: ["gene1", "gene2", "gene3"]
# s_genes_file: "s_gene_list.txt"
# g2m_genes: ["gene1", "gene2", "gene3"]
# g2m_genes_file: "g2m_gene_list.txt"
# gene_pool: ["gene1", "gene2", "gene3"]
# gene_pool_file: "gene_pool.txt"
# Outputs
# output: "$id.$key.output.h5mu"
# output_compression: "gzip"
obs_phase: "phase"
obs_s_score: "S_score"
obs_g2m_score: "G2M_score"
# Arguments
n_bins: 25
random_state: 0
allow_missing_genes: false
# Nextflow input-output arguments
publish_dir: # please fill in - example: "output/"
# param_list: "my_params.yaml"
nextflow run openpipelines-bio/openpipeline \
-r 2.1.0 -latest \
-profile docker \
-main-script target/nextflow/feature_annotation/score_genes_cell_cycle_scanpy/main.nf \
-params-file params.yaml
Note
Replace -profile docker with -profile podman or -profile singularity depending on the desired backend.
Argument groups
Inputs
Name | Description | Attributes |
---|---|---|
--input | Input h5mu file | file, required, example: "input_file.h5mu" |
--modality | | string, default: "rna" |
--input_layer | The layer of the adata object containing normalized expression values. If not provided, the X attribute of the adata object will be used. | string, example: "log_normalized" |
--var_gene_names | The name of the column in the var attribute of the adata object that contains the gene names (symbols). If not provided, the index of the var attribute will be used. | string, example: "gene_names" |
Gene list inputs
The gene list inputs can be provided as a list of gene symbols or as a file containing a list of gene symbols. The gene list file should be formatted as a single column with gene symbols.
Make sure that the gene list inputs are consistent with the gene names in the adata object as provided by the --var_gene_names argument.
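The sketch below illustrates the single-column file format and the consistency check described above. It is a minimal example using only the Python standard library, not the component's own code: the helper names `read_gene_list` and `split_by_presence` are hypothetical, and the gene symbols are the placeholder names used elsewhere on this page.

```python
import tempfile
from pathlib import Path


def read_gene_list(path):
    """Read a single-column text file with one gene symbol per line."""
    lines = Path(path).read_text().splitlines()
    # Skip blank lines and strip surrounding whitespace.
    return [line.strip() for line in lines if line.strip()]


def split_by_presence(genes, known_genes):
    """Split genes into those present in known_genes and those missing."""
    known = set(known_genes)
    found = [g for g in genes if g in known]
    missing = [g for g in genes if g not in known]
    return found, missing


# Write a toy gene list file, then read it back.
with tempfile.TemporaryDirectory() as tmp:
    gene_file = Path(tmp) / "s_gene_list.txt"
    gene_file.write_text("gene1\ngene2\n\ngene9\n")
    s_genes = read_gene_list(gene_file)

# Compare against the gene names assumed to be present in the data.
found, missing = split_by_presence(s_genes, ["gene1", "gene2", "gene3"])
print(found, missing)
```

Running a check like this before launching the pipeline makes it easier to spot gene symbols that do not match the names in the dataset.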
Name | Description | Attributes |
---|---|---|
--s_genes | List of gene symbols for scoring S phase genes. | List of string, example: "gene1", "gene2", "gene3", multiple_sep: ";" |
--s_genes_file | Path to a .txt file containing the list of S phase genes to be scored. The gene list file should be formatted as a single column with gene symbols. | file, example: "s_gene_list.txt" |
--g2m_genes | List of gene symbols for scoring G2M phase genes. | List of string, example: "gene1", "gene2", "gene3", multiple_sep: ";" |
--g2m_genes_file | Path to a .txt file containing the list of G2M phase genes to be scored. The gene list file should be formatted as a single column with gene symbols. | file, example: "g2m_gene_list.txt" |
--gene_pool | List of gene symbols for sampling the reference set. Default is all genes. | List of string, example: "gene1", "gene2", "gene3", multiple_sep: ";" |
--gene_pool_file | File with genes for sampling the reference set. Default is all genes. The gene pool file should be formatted as a single column with gene symbols. | file, example: "gene_pool.txt" |
Outputs
Name | Description | Attributes |
---|---|---|
--output | Output h5mu file | file, required, example: "output_file.h5mu" |
--output_compression | The compression format to be used on the output h5mu object. | string, example: "gzip" |
--obs_phase | The name of the column in the obs attribute of the adata object that will store the cell cycle phase annotation. | string, default: "phase" |
--obs_s_score | The name of the column in the obs attribute of the adata object that will store the S phase score. | string, default: "S_score" |
--obs_g2m_score | The name of the column in the obs attribute of the adata object that will store the G2M phase score. | string, default: "G2M_score" |
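To make the relationship between the two score columns and the phase column concrete, here is a minimal sketch of how a phase label can be derived from a cell's scores. It assumes the rule used by scanpy's score_genes_cell_cycle: a cell is labelled G1 when both scores are negative, and otherwise receives the phase with the higher score. The function name `assign_phase` is hypothetical.

```python
def assign_phase(s_score, g2m_score):
    """Return a cell cycle phase label from the S and G2M scores of one cell."""
    # Neither gene set is expressed above its reference level.
    if s_score < 0 and g2m_score < 0:
        return "G1"
    # Otherwise pick the phase with the higher score (S on ties).
    if g2m_score > s_score:
        return "G2M"
    return "S"


# Toy (S_score, G2M_score) pairs, made up for illustration.
for scores in [(0.5, 0.1), (0.1, 0.5), (-0.2, -0.1)]:
    print(scores, assign_phase(*scores))
```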
Arguments
Name | Description | Attributes |
---|---|---|
--n_bins | Number of expression level bins for sampling. | integer, default: 25 |
--random_state | The random seed for sampling. | integer, default: 0 |
--allow_missing_genes | If true, missing genes in the gene list will be ignored. | boolean, default: FALSE |
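The roles of n_bins and random_state can be illustrated with a simplified sketch of binned reference sampling. This is loosely modeled on the general idea (rank genes by average expression, cut the ranking into bins, and draw reference genes from the bins that contain the scored genes) and is not the scanpy code itself. The function name `sample_reference`, the `ctrl_size` parameter, and the toy gene pool are assumptions made for the example.

```python
import random


def sample_reference(avg_expr, gene_list, n_bins=25, ctrl_size=2, seed=0):
    """Pick reference genes from the expression-level bins of the scored genes.

    avg_expr: dict mapping gene symbol -> average expression across cells.
    """
    rng = random.Random(seed)  # fixed seed makes the sampling reproducible
    # Rank genes by average expression and cut the ranking into n_bins bins.
    ranked = sorted(avg_expr, key=avg_expr.get)
    bin_of = {g: (i * n_bins) // len(ranked) for i, g in enumerate(ranked)}
    scored = set(gene_list)
    reference = set()
    for b in sorted({bin_of[g] for g in gene_list}):
        # Candidates share a bin with a scored gene but are not scored themselves.
        candidates = [g for g in ranked if bin_of[g] == b and g not in scored]
        reference.update(rng.sample(candidates, min(ctrl_size, len(candidates))))
    return sorted(reference)


# Toy pool: 20 hypothetical genes with increasing average expression.
pool = {f"gene{i}": float(i) for i in range(20)}
ref = sample_reference(pool, ["gene3", "gene15"], n_bins=4, ctrl_size=2, seed=0)
print(ref)
```

Using the same seed returns the same reference set, which is why fixing random_state keeps the scores reproducible between runs.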