Read mapping
Map
Map single-end or paired-end short reads to one or more genomes in the pangenome. One SAM or BAM file is generated for each genome included in the analysis.
Parameters
<databaseDirectory> |
Path to the database root directory. |
<genomeNumbers> |
A text file listing the genome numbers to map reads against, one number per line. |
<shortReadFiles> |
One or two short-read archives in FASTQ format, optionally gzip- or bzip2-compressed. |
Options
|
Number of threads to use (default: the number of cores or 8, whichever is lower). |
|
Path to the output files (default is the database path). |
|
If a read has multiple “best” hits, return none, all best hits, or a random best hit (default: random). |
|
Return all hits rather than only the best. |
|
Find the best mapping location in the complete pangenome.
|
|
The mapping_summary.txt file from a previous mapping run in random-best competitive mode, used to better estimate coverage in a metagenomic setting. |
|
Write the alignment files in SAM or BAM format, or write no output files (default: SAM). |
|
Gap open penalty (range: [-50..-1], default: -20). |
|
Gap extension penalty (range: [-5..-1], default: -3). |
|
Process the FASTQ file as an interleaved paired-end archive. |
|
Check unmapped genomes. |
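The interleaved option above expects both mates of a pair in a single FASTQ file. The following is a minimal sketch of how two separate paired-end files could be merged into such a file. It assumes the common convention that each forward record is immediately followed by its reverse mate, and that both inputs are uncompressed and list their reads in the same order. The function name `interleave_fastq` is hypothetical and not part of PanTools.

```python
def interleave_fastq(path_r1, path_r2, path_out):
    """Write two paired FASTQ files as one interleaved file (R1, R2, R1, R2, ...).

    Assumes four lines per record and identical read order in both inputs.
    Returns the number of read pairs written.
    """
    pairs = 0
    with open(path_r1) as r1, open(path_r2) as r2, open(path_out, "w") as out:
        while True:
            # Read one four-line record from each file.
            rec1 = [r1.readline() for _ in range(4)]
            rec2 = [r2.readline() for _ in range(4)]
            if not rec1[0] and not rec2[0]:
                break  # both files are exhausted
            if not rec1[3] or not rec2[3]:
                raise ValueError("input files hold different numbers of complete records")
            # Normalise line endings so a missing final newline cannot merge two lines.
            out.writelines(line.rstrip("\n") + "\n" for line in rec1 + rec2)
            pairs += 1
    return pairs
```

The sketch does not check that the read names of the two mates match; a stricter version could compare the headers before writing each pair.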
Options that influence the mapping sensitivity
|
One of four presets that automatically set the parameters controlling the sensitivity, ranging from least to most sensitive. |
|
The minimum acceptable identity of the alignment.
|
|
The length of the bound of the banded alignment.
|
|
The minimum acceptable alignment length after soft-clipping.
|
|
The maximum number of candidate hit locations to examine.
|
|
The maximum acceptable alignment length.
|
|
The maximum acceptable fragment length.
|
|
The number of k-mers sampled from each read.
|
|
The stringency of soft-clipping (default: 1).
|
Example input files
FASTQ file
@SRR13153715.1 1/1
TGGTCATACAGCAAAGCATAATTGTCACCATTACTATGGCAATCAAGCCAGCTATAAAACCTAGCCAAATGTACCATGGCCATTTTATATACTGCTCATACTTTCCAAGTTCTTGGAGATCGAT
+
EEEEEEEEEEEEEEEAEEEE/EEEEE/AEEEEEEEEEEEEEE/EE/EEE/<EEEEEEE/EEEEEEEEEEEEEAEEEEEAEEEEEAEEAEEEEEEA<AAAEEAEEA<EE/EEEEAEAEA/EEAA/
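Before starting a long mapping run, it can help to confirm that a read file is well formed. The following is a minimal sketch of such a check, assuming four lines per record (header, sequence, `+` separator, quality) and that compression is indicated by a `.gz` or `.bz2` file extension. The helper names `open_fastq` and `read_fastq` are hypothetical and not part of PanTools.

```python
import bz2
import gzip
from pathlib import Path


def open_fastq(path):
    """Open a plain, gzip- or bzip2-compressed FASTQ file in text mode."""
    suffix = Path(path).suffix
    if suffix == ".gz":
        return gzip.open(path, "rt")
    if suffix == ".bz2":
        return bz2.open(path, "rt")
    return open(path, "r")


def read_fastq(path):
    """Yield (name, sequence, quality) tuples from a four-line-per-record FASTQ file."""
    with open_fastq(path) as handle:
        while True:
            header = handle.readline().rstrip("\n")
            if not header:
                break  # end of file (a blank line also stops the reader)
            sequence = handle.readline().rstrip("\n")
            separator = handle.readline().rstrip("\n")
            quality = handle.readline().rstrip("\n")
            if not header.startswith("@") or not separator.startswith("+"):
                raise ValueError(f"malformed FASTQ record near {header!r}")
            if len(sequence) != len(quality):
                raise ValueError(f"sequence/quality length mismatch in {header!r}")
            yield header[1:], sequence, quality
```

Iterating over `read_fastq("reads.fastq.gz")` raises a `ValueError` at the first malformed record, so a full pass over the file doubles as a validation step.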
Genome numbers file
1
2
5
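A genome numbers file like the one above can be checked with a few lines of Python. This is a minimal sketch, assuming that genome numbers are positive integers and that blank lines may be ignored; neither assumption is confirmed by the text above. The function name `read_genome_numbers` is hypothetical.

```python
def read_genome_numbers(path):
    """Return the genome numbers listed in a file, one positive integer per line."""
    numbers = []
    with open(path) as handle:
        for line_no, line in enumerate(handle, start=1):
            text = line.strip()
            if not text:
                continue  # tolerate blank lines (an assumption of this sketch)
            if not text.isdigit() or int(text) < 1:
                raise ValueError(
                    f"line {line_no}: expected a positive integer, got {text!r}"
                )
            numbers.append(int(text))
    return numbers
```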
Example commands
$ pantools map arabidopsis_DB genome_numbers.txt ERR031564_1.fastq
$ pantools map --include=1-5 --sensitivity=sensitive arabidopsis_DB genome_numbers.txt ERR031564_1.fastq
$ pantools map --competitive -m=all-bests arabidopsis_DB genome_numbers.txt ERR031564_1.fastq
$ pantools map --interleaved arabidopsis_DB genome_numbers.txt interleaved_reads.fastq
$ pantools map arabidopsis_DB genome_numbers.txt ERR031564_1.fastq ERR031564_2.fastq
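When the commands above are launched from a script, the argument list can be assembled programmatically. The sketch below builds only the positional arguments and the `--interleaved` flag as they appear in the examples above; the function name `build_map_command` is hypothetical, and the check that an interleaved run takes exactly one read file is an assumption of this sketch.

```python
def build_map_command(database_dir, genome_numbers_file, read_files, interleaved=False):
    """Assemble the argument list for a `pantools map` call."""
    read_files = list(read_files)
    if len(read_files) not in (1, 2):
        raise ValueError("expected one or two short-read files")
    if interleaved and len(read_files) != 1:
        # Assumption: an interleaved archive is passed as a single file.
        raise ValueError("an interleaved archive is a single file")
    command = ["pantools", "map"]
    if interleaved:
        command.append("--interleaved")
    command += [database_dir, genome_numbers_file, *read_files]
    return command


# Example use (requires pantools on the PATH, so it is left commented out):
# import subprocess
# subprocess.run(build_map_command("arabidopsis_DB", "genome_numbers.txt",
#                                  ["ERR031564_1.fastq", "ERR031564_2.fastq"]),
#                check=True)
```

Passing the command as a list rather than a single string avoids shell quoting problems with paths that contain spaces.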
Output files
mapping_summary.txt: the number of mapped and unmapped reads per genome.
One SAM or BAM file is generated for each genome included in the analysis.
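The per-genome SAM files can be inspected independently of the summary file. The following is a minimal sketch that counts mapped and unmapped primary records in a SAM file using the standard FLAG bits from the SAM specification (0x4 for unmapped, 0x100 for secondary, 0x800 for supplementary). Whether mapping_summary.txt counts reads in exactly this way is not stated above, so the two sets of numbers may differ. The function name `count_mapped_reads` is hypothetical.

```python
def count_mapped_reads(sam_path):
    """Return (mapped, unmapped) counts of primary records in a SAM file."""
    mapped = 0
    unmapped = 0
    with open(sam_path) as handle:
        for line in handle:
            if line.startswith("@"):
                continue  # header line
            fields = line.rstrip("\n").split("\t")
            if len(fields) < 11:
                continue  # not a complete alignment record
            flag = int(fields[1])
            if flag & 0x900:
                continue  # skip secondary (0x100) and supplementary (0x800) records
            if flag & 0x4:
                unmapped += 1
            else:
                mapped += 1
    return mapped, unmapped
```

For paired-end data each mate is a separate record, so the counts refer to individual reads rather than pairs. BAM files are binary and would first need converting to SAM or reading with a dedicated library.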