Read mapping
Map
Map single or paired-end short reads to one or multiple genomes in the pangenome. One SAM or BAM file is generated for each genome included in the analysis.
Required arguments
--database_path/-dp Path to the pangenome database.
-1 The first short-read archive in FASTQ format, which can be gz/bz2 compressed. This file can be processed as an interleaved archive with the -il option.
--genome-numbers/-gn A text file containing the genome numbers to map reads against, one number per line.
Optional arguments
-2 The second short-read archive in FASTQ format, which can be gz/bz2 compressed.
--out-format/-of (SAM, BAM or none) : Write the alignment files in SAM or BAM format, or write no alignment files.
--output-path/-op (default value: Database path determined by -dp) : Path to the output files.
--threads/-tn (default value: 1) : The number of parallel working threads.
--interleaved/-il Process the FASTQ file as an interleaved paired-end archive.
--raw-abundance-file/-raf The mapping_summary.txt file from a previous mapping run (random-best competitive mode) for a better estimation of coverage in a metagenomic setting.
--alignment-mode/-am The alignment mode:
-1 : Competitive, none-bests
-2 : Competitive, random-best
-3 : Competitive, all-bests
1 : Normal, none-bests
2 : Normal, random-best (default)
3 : Normal, all-bests
0 : Normal, all-hits
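The mode numbers listed above follow a visible pattern: negative values are the competitive variants, and the absolute value selects the reporting strategy. A minimal Python sketch of that lookup is given here for scripts that need to log or check the chosen mode. The table is copied from the list above, and the function name `describe_alignment_mode` is a hypothetical helper, not part of PanTools.

```python
# Lookup table transcribed from the --alignment-mode list in this section.
ALIGNMENT_MODES = {
    -1: "Competitive, none-bests",
    -2: "Competitive, random-best",
    -3: "Competitive, all-bests",
    1: "Normal, none-bests",
    2: "Normal, random-best",
    3: "Normal, all-bests",
    0: "Normal, all-hits",
}

DEFAULT_ALIGNMENT_MODE = 2  # documented default: Normal, random-best


def describe_alignment_mode(mode: int = DEFAULT_ALIGNMENT_MODE) -> str:
    """Return the documented label for an --alignment-mode value."""
    try:
        return ALIGNMENT_MODES[mode]
    except KeyError:
        raise ValueError(f"unknown alignment mode: {mode}") from None


def is_competitive(mode: int) -> bool:
    """Negative mode numbers are the competitive variants."""
    describe_alignment_mode(mode)  # raises ValueError for unknown modes
    return mode < 0
```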
Optional arguments that influence the mapping sensitivity
--very-fast/--fast/--sensitive/--very-sensitive Four settings that automatically set the parameters controlling the sensitivity, ranging from least to most sensitive.
--min-mapping-identity/-mmi (default value: 0.5, valid range: [0..1)) : The minimum acceptable identity of the alignment.
--num-kmer-samples/-nks (default value: 15, valid range: [1..r-k+1]) : The number of k-mers sampled from a read.
--min-hit-length/-mhl (default value: 13, valid range: [10..100]) : The minimum acceptable length of the alignment after soft-clipping.
--max-alignment-length/-mal (default value: 1000, valid range: [50..5000]) : The maximum acceptable length of the alignment.
--max-fragment-length/-mfl (default value: 2000, valid range: [50..5000]) : The maximum acceptable length of the fragment.
--max-num-locations/-mnl (default value: 15, valid range: [1..100]) : The maximum number of locations of candidate hits to examine.
--alignment-band/-ab (default value: 5, valid range: [1..100]) : The length of the bound of the banded alignment.
--clipping-stringency/-ci (default value: 1) : The stringency of soft-clipping.
0 : no soft clipping
1 : low
2 : medium
3 : high
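The valid ranges above can be checked before a run is launched. The following Python sketch does so under two assumptions: that `r` and `k` in the `[1..r-k+1]` range stand for the read length and the k-mer size (so `r-k+1` is the number of k-mers in one read), and that clipping stringency accepts only the listed values 0 to 3. The names `INTEGER_RANGES`, `max_kmer_samples` and `is_valid` are hypothetical helpers and are not part of PanTools.

```python
from typing import Optional

# Inclusive integer ranges transcribed from the option list above.
INTEGER_RANGES = {
    "min-hit-length": (10, 100),
    "max-alignment-length": (50, 5000),
    "max-fragment-length": (50, 5000),
    "max-num-locations": (1, 100),
    "alignment-band": (1, 100),
    "clipping-stringency": (0, 3),
}


def max_kmer_samples(read_length: int, k: int) -> int:
    """Number of k-mers in a read of the given length (r - k + 1)."""
    return max(read_length - k + 1, 0)


def is_valid(name: str, value: float,
             read_length: Optional[int] = None,
             k: Optional[int] = None) -> bool:
    """Check one option value against its documented range."""
    if name == "min-mapping-identity":
        return 0 <= value < 1  # half-open range [0..1)
    if name == "num-kmer-samples":
        if read_length is None or k is None:
            raise ValueError("num-kmer-samples needs read_length and k")
        return 1 <= value <= max_kmer_samples(read_length, k)
    low, high = INTEGER_RANGES[name]
    return low <= value <= high
```

For example, with a 100 bp read and an assumed k of 13, `max_kmer_samples(100, 13)` gives 88, so the default of 15 samples is within range.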
Example input files
FASTQ file
@SRR13153715.1 1/1
TGGTCATACAGCAAAGCATAATTGTCACCATTACTATGGCAATCAAGCCAGCTATAAAACCTAGCCAAATGTACCATGGCCATTTTATATACTGCTCATACTTTCCAAGTTCTTGGAGATCGAT
+
EEEEEEEEEEEEEEEAEEEE/EEEEE/AEEEEEEEEEEEEEE/EE/EEE/<EEEEEEE/EEEEEEEEEEEEEAEEEEEAEEEEEAEEAEEEEEEA<AAAEEAEEA<EE/EEEEAEAEA/EEAA/
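A record like the one above consists of four lines: a header starting with `@`, the sequence, a separator starting with `+`, and one quality character per base. As a rough illustration, the Python sketch below reads uncompressed records of that shape and rejects ones where the sequence and quality lengths differ. It assumes each record occupies exactly four lines (no wrapped sequences), and `read_fastq` and `FastqRecord` are hypothetical names, not PanTools code.

```python
import io
from typing import Iterator, NamedTuple, TextIO


class FastqRecord(NamedTuple):
    name: str
    sequence: str
    quality: str


def read_fastq(handle: TextIO) -> Iterator[FastqRecord]:
    """Yield records from a four-line-per-record FASTQ stream."""
    while True:
        header = handle.readline()
        if not header:
            return  # end of file
        header = header.rstrip("\r\n")
        if not header:
            continue  # tolerate blank lines between records
        sequence = handle.readline().rstrip("\r\n")
        separator = handle.readline().rstrip("\r\n")
        quality = handle.readline().rstrip("\r\n")
        if not header.startswith("@") or not separator.startswith("+"):
            raise ValueError(f"malformed FASTQ record near {header!r}")
        if len(sequence) != len(quality):
            raise ValueError(f"sequence/quality length mismatch in {header!r}")
        yield FastqRecord(header[1:], sequence, quality)


# Toy input for illustration only; not taken from a real archive.
example = io.StringIO("@read1\nACGT\n+\nIIII\n@read2\nGG\n+\nII\n")
records = list(read_fastq(example))
```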
Genome numbers file
1
2
5
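A file in this form can be generated from a script rather than written by hand. The Python sketch below writes one genome number per line and reads such a file back. It assumes genome numbers are positive integers, as in the example above. The function names are hypothetical and not part of PanTools.

```python
from pathlib import Path
from typing import Iterable, List


def write_genome_numbers(path: str, genome_numbers: Iterable[int]) -> None:
    """Write one genome number per line, as in the example file above."""
    numbers = [int(n) for n in genome_numbers]
    if any(n < 1 for n in numbers):
        # Assumption: genome numbers start at 1.
        raise ValueError("genome numbers are expected to be positive")
    Path(path).write_text("".join(f"{n}\n" for n in numbers))


def read_genome_numbers(path: str) -> List[int]:
    """Read the genome numbers back, ignoring blank lines."""
    lines = Path(path).read_text().splitlines()
    return [int(line) for line in lines if line.strip()]
```

Calling `write_genome_numbers("genome_numbers.txt", [1, 2, 5])` reproduces the example file shown above.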
Example commands
$ pantools map -dp arabidopsis_DB -1 ERR031564_1.fastq --reference 1-5
$ pantools map -dp arabidopsis_DB -1 ERR031564_1.fastq -gn genome_numbers.txt
$ pantools map -dp arabidopsis_DB -1 interleaved_reads.fastq --interleaved -gn genome_numbers.txt
$ pantools map -dp arabidopsis_DB -1 ERR031564_1.fastq -2 ERR031564_2.fastq -gn genome_numbers.txt
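When many samples are mapped in a loop, it can be convenient to assemble these command lines programmatically. The Python sketch below builds the argument list for the commands shown above using only options documented in this section. It does not run PanTools; the function `build_map_command` is a hypothetical helper, and the option order simply mirrors the examples.

```python
import shlex
from typing import List, Optional


def build_map_command(database: str,
                      reads_1: str,
                      genome_numbers: str,
                      reads_2: Optional[str] = None,
                      interleaved: bool = False,
                      threads: Optional[int] = None) -> List[str]:
    """Assemble a `pantools map` argument list from documented options."""
    command = ["pantools", "map", "-dp", database, "-1", reads_1]
    if reads_2 is not None:
        command += ["-2", reads_2]  # second archive of a paired-end run
    if interleaved:
        command.append("--interleaved")
    command += ["-gn", genome_numbers]
    if threads is not None:
        command += ["--threads", str(threads)]
    return command


paired = build_map_command("arabidopsis_DB", "ERR031564_1.fastq",
                           "genome_numbers.txt",
                           reads_2="ERR031564_2.fastq")
# shlex.join quotes arguments where needed, giving a copy-pasteable line.
paired_line = shlex.join(paired)
```

On a system where PanTools is installed, the resulting list could be passed to `subprocess.run(paired, check=True)`.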
Output files
mapping_summary.txt: the number of mapped and unmapped reads per genome.
One SAM or BAM file is generated for each genome included in the analysis.