Samtools sort
Sort and (optionally) index alignments.
Info
ID: samtools_sort
Namespace: mapping
Links
Reads are sorted by leftmost coordinates, or by read name when --sort_by_read_names
is used.
An appropriate @HD-SO
sort order header tag will be added or an existing one updated if necessary.
Note that to generate an index file (by specifying --output_bai
), the default coordinate sort must be used. Thus the --sort_by_read_names
and --sort_by <TAG>
options are incompatible with --output_bai
.
Example commands
You can run the pipeline using nextflow run
.
View help
You can use --help
as a parameter to get an overview of the possible parameters.
nextflow run openpipelines-bio/openpipeline \
-r 1.0.2 -latest \
-main-script target/nextflow/mapping/samtools_sort/main.nf \
--help
Run command
Example of params.yaml
# Arguments
minimizer_cluster: false
# minimizer_kmer: 20
sort_by_read_names: false
# sort_by: "foo"
no_pg: false
# Input
input: # please fill in - example: "input.bam"
# Output
# output_bam: "$id.$key.output_bam.bam"
# output_bai: "$id.$key.output_bai.bai"
# output_format: "bam"
# compression: 5
# Nextflow input-output arguments
publish_dir: # please fill in - example: "output/"
# param_list: "my_params.yaml"
nextflow run openpipelines-bio/openpipeline \
-r 1.0.2 -latest \
-profile docker \
-main-script target/nextflow/mapping/samtools_sort/main.nf \
-params-file params.yaml
Note
Replace -profile docker
with -profile podman
or -profile singularity
depending on the desired backend.
Argument groups
Input
Name | Description | Attributes |
---|---|---|
--input |
Path to the SAM/BAM/CRAM files containing the mapped reads. | file , required, example: "input.bam" |
Output
Name | Description | Attributes |
---|---|---|
--output_bam |
Filename to output the counts to. | file , required, example: "output.bam" |
--output_bai |
BAI-format index for BAM file. | file , example: "output.bam.bai" |
--output_format |
The output format. By default, samtools tries to select a format based on the -o filename extension; if output is to standard output or no format can be deduced, bam is selected. | string , example: "bam" |
--compression |
Compression level, from 0 (uncompressed) to 9 (best | integer , example: 5 |
Arguments
Name | Description | Attributes |
---|---|---|
--minimizer_cluster |
Sort unmapped reads (those in chromosome “*“) by their sequence minimiser (Schleimer et al., 2003; Roberts et al., 2004), also reverse complementing as appropriate. This has the effect of collating some similar data together, improving the compressibility of the unmapped sequence. The minimiser kmer size is adjusted using the -K option. Note data compressed in this manner may need to be name collated prior to conversion back to fastq. Mapped sequences are sorted by chromosome and position. | boolean_true |
--minimizer_kmer |
Sets the kmer size to be used in the -M option. | integer , example: 20 |
--sort_by_read_names |
Sort by read names (i.e., the QNAME field) rather than by chromosomal coordinates. | boolean_true |
--sort_by |
Sort first by this value in the alignment tag, then by position or name (if also using -n). | string |
--no_pg |
Do not add a @PG line to the header of the output file. | boolean_true |