Support functions
Analysis support functions used to manipulate output files for better interpretability.
Order matrix
Transforms a CSV matrix into an easy-to-read file by ordering the values in
ascending order (low to high), or in descending order when --descending is
included in the command. If phenotype information is included in the header,
a separate file with the range of found values is created for each
phenotype. If this information is not present (only genome numbers in the
header), use rename_matrix to change the headers.
- Parameters
<databaseDirectory>
Path to the database root directory.
<matrixFile>
A CSV formatted matrix file.
- Options
--include/-i
Only include a selection of genomes.
--exclude/-e
Exclude a selection of genomes.
--descending
Order the matrix in descending order.
- Example commands
$ pantools order_matrix bacteria_DB bacteria_DB/ANI/fastANI/ANI_distance_matrix.csv
$ pantools order_matrix --descending bacteria_DB bacteria_DB/ANI/fastANI/ANI_distance_matrix.csv
- Output
Output is written to the same directory as the selected input file.
‘old file name’ + ‘_ORDERED’, ordered values of the original matrix file.
When phenotype information is present in the header:
‘old file name’ + ‘_PHENOTYPE’, range of values per phenotype.
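The ordering step described above can be illustrated with a minimal Python sketch. This is not the PanTools implementation: the function name order_matrix, the "Genomes" corner cell, the example values, and the assumption that the matrix is symmetric (so each genome pair is listed once and the diagonal is skipped) are all hypothetical choices made for illustration.

```python
import csv
import io


def order_matrix(csv_text, descending=False):
    """Return (row_label, column_label, value) triples sorted by value."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    column_labels = rows[0][1:]  # first row holds the column headers
    pairs = []
    for i, row in enumerate(rows[1:]):
        row_label = row[0]  # leftmost cell holds the row header
        for j, cell in enumerate(row[1:]):
            # Assumption: the matrix is symmetric, so keep only the
            # upper triangle and skip the diagonal.
            if j > i:
                pairs.append((row_label, column_labels[j], float(cell)))
    return sorted(pairs, key=lambda pair: pair[2], reverse=descending)


# Hypothetical 3x3 distance matrix with genome numbers as headers.
example = """Genomes,1,2,3
1,0.0,0.12,0.30
2,0.12,0.0,0.05
3,0.30,0.05,0.0
"""

for row_label, column_label, value in order_matrix(example):
    print(row_label, column_label, value)
```

Passing descending=True reverses the order, which mirrors the effect of --descending on the real command.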
Rename matrix
Update the headers (first row and leftmost column) of CSV formatted matrix
files generated by PanTools. This can be useful when you have already
performed an analysis but want a different phylogenetic tree, for example
one with fewer genomes or different phenotype information. If --phenotype is
not included, the headers are changed to contain only genome numbers or
sequence identifiers.
- Parameters
<databaseDirectory>
Path to the database root directory.
<matrixFile>
A matrix file with numerical values.
- Options
--include/-i
Only include a selection of genomes in the new matrix file.
--exclude/-e
Exclude a selection of genomes from the new matrix file.
--phenotype/-p
A phenotype name, used to include phenotype information in the headers.
--[no-]numbers
Include or exclude genome numbers in the headers. Numbers are included by default.
--genome/--sequence
The headers contain genome numbers or sequence identifiers.
--phasing
Include phasing information in the headers.
- Example commands
$ pantools rename_matrix pecto_DB pecto_DB/ANI/fastANI/ANI_distance_matrix.csv
$ pantools rename_matrix --no-numbers --phenotype=species pecto_DB pecto_DB/ANI/fastANI/ANI_distance_matrix.csv
- Output
Output is written to the same directory as the selected input file.
‘old file name’ + ‘_RENAMED’, the original matrix file with changed headers.
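To make the header update concrete, the following Python sketch relabels the first row and leftmost column of a small matrix. It only illustrates the idea and is not how PanTools works internally: the function rename_headers, the "number_phenotype" label format, the "unknown" fallback, and the phenotype values are all assumptions, and the real tool reads phenotypes from the pangenome database rather than from a dictionary.

```python
import csv
import io


def rename_headers(csv_text, phenotypes, include_numbers=True):
    """Relabel the first row and leftmost column of a CSV matrix."""

    def new_label(genome):
        # Hypothetical label format: "<genome number>_<phenotype>".
        phenotype = phenotypes.get(genome, "unknown")
        return f"{genome}_{phenotype}" if include_numbers else phenotype

    rows = list(csv.reader(io.StringIO(csv_text)))
    # First row: keep the corner cell, relabel every column header.
    rows[0] = [rows[0][0]] + [new_label(genome) for genome in rows[0][1:]]
    # Remaining rows: relabel only the leftmost cell.
    for row in rows[1:]:
        row[0] = new_label(row[0])
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows(rows)
    return out.getvalue()


# Hypothetical matrix and phenotype mapping (genome number -> value).
example = "Genomes,1,2\n1,0.0,0.12\n2,0.12,0.0\n"
phenotypes = {"1": "species_A", "2": "species_B"}

print(rename_headers(example, phenotypes))
print(rename_headers(example, phenotypes, include_numbers=False))
```

The include_numbers argument plays the role that --[no-]numbers plays on the command line: when it is False, only the phenotype value remains in each header.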