Cellxgene census
Query CellxGene Census or user-specified TileDBSoma object, and eventually fetch cell and gene metadata or/and expression counts.
Info
ID: cellxgene_census
Namespace: query
Links
Example commands
You can run the pipeline using nextflow run
.
View help
You can use --help
as a parameter to get an overview of the possible parameters.
nextflow run openpipelines-bio/openpipeline \
-r 0.12.0 -latest \
-main-script target/nextflow/query/cellxgene_census/main.nf \
--help
Run command
Example of params.yaml
# Inputs
input_database: "CellxGene"
modality: "rna"
cellxgene_release: "2023-05-15"
# Outputs
# output: "$id.$key.output.h5mu"
# output_compression: "gzip"
# Query
species: "homo_sapiens"
# cell_query: "is_primary_data == True and cell_type_ontology_term_id in ['CL:0000136', 'CL:1000311', 'CL:0002616'] and suspension_type == 'cell'"
# cells_filter_columns: ["dataset_id", "tissue", "assay", "disease", "cell_type"]
# min_cells_filter_columns: 100
# Nextflow input-output arguments
publish_dir: # please fill in - example: "output/"
# param_list: "my_params.yaml"
nextflow run openpipelines-bio/openpipeline \
-r 0.12.0 -latest \
-profile docker \
-main-script target/nextflow/query/cellxgene_census/main.nf \
-params-file params.yaml
Note
Replace -profile docker
with -profile podman
or -profile singularity
depending on the desired backend.
Argument groups
Inputs
Arguments related to the input (aka query) dataset.
Name | Description | Attributes |
---|---|---|
--input_database |
Full input database S3 prefix URL. Default: CellxGene Census | string , default: "CellxGene" , example: "s3://" |
--modality |
Which modality to store the output in. | string , default: "rna" |
--cellxgene_release |
CellxGene Census release date. More information: https://chanzuckerberg.github.io/cellxgene-census/cellxgene_census_docsite_data_release_info.html | string , default: "2023-05-15" |
Query
Arguments related to the query.
Name | Description | Attributes |
---|---|---|
--species |
Specie(s) of interest. If not specified, Homo Sapiens will be queried. | string , default: "homo_sapiens" , example: "homo_sapiens" |
--cell_query |
The query for selecting the cells as defined by the cellxgene census schema. | string , example: "is_primary_data == True and cell_type_ontology_term_id in ['CL:0000136', 'CL:1000311', 'CL:0002616'] and suspension_type == 'cell'" |
--cells_filter_columns |
The query for selecting the cells as defined by the cellxgene census schema. | List of string , example: "dataset_id", "tissue", "assay", "disease", "cell_type" , multiple_sep: ":" |
--min_cells_filter_columns |
Minimum of amount of summed cells_filter_columns cells | double , example: 100 |
Outputs
Output arguments.
Name | Description | Attributes |
---|---|---|
--output |
Output h5mu file. | file , required, example: "output.h5mu" |
--output_compression |
string , example: "gzip" |