Support functions

Analysis support functions used to manipulate output files for better interpretability.

Order matrix

Transforms a CSV table into an easy-to-read file by ordering the values in ascending order (low to high), or in descending order when --descending is included in the command. If phenotype information is included in the header, a separate file with the range of values found is created for each phenotype. If this information is not present (only genome numbers in the header), use rename_matrix to change the headers.
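The ordering idea can be illustrated with a short Python sketch. This is not the PanTools implementation: the function name order_matrix_values, the assumption that the matrix is symmetric, and the (row, column, value) layout of the result are all hypothetical choices made for illustration, and the actual layout of the ‘_ORDERED’ file may differ.

```python
import csv
import io


def order_matrix_values(csv_text, descending=False):
    """Return (row_label, column_label, value) triples sorted by value.

    Hypothetical helper for illustration; not part of PanTools.
    """
    rows = [r for r in csv.reader(io.StringIO(csv_text)) if r]
    column_labels = rows[0][1:]  # the first header cell is the corner cell
    triples = []
    for i, row in enumerate(rows[1:]):
        row_label = row[0]
        # Assumption: the matrix is symmetric, so only the upper triangle
        # (each genome pair once, without the diagonal) is collected.
        for j in range(i + 1, len(column_labels)):
            triples.append((row_label, column_labels[j], float(row[j + 1])))
    return sorted(triples, key=lambda t: t[2], reverse=descending)


# Made-up distance matrix with genome numbers as headers.
example = """Genomes,1,2,3
1,0.0,0.12,0.30
2,0.12,0.0,0.05
3,0.30,0.05,0.0
"""

for row_label, column_label, value in order_matrix_values(example):
    print(row_label, column_label, value)
```

Passing descending=True reverses the order, which mirrors what the --descending option does on the command line.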

Parameters

<databaseDirectory>

Path to the database root directory.

<matrixFile>

A CSV formatted matrix file.

Options

--include/-i

Only include a selection of genomes.

--exclude/-e

Exclude a selection of genomes.

--descending

Order the matrix in descending order.

Example commands

$ pantools order_matrix bacteria_DB bacteria_DB/ANI/fastANI/ANI_distance_matrix.csv
$ pantools order_matrix --descending bacteria_DB bacteria_DB/ANI/fastANI/ANI_distance_matrix.csv

Output

Output is written to the same directory as the selected input file.

  • ‘old file name’ + ‘_ORDERED’, ordered values of the original matrix file.

When phenotype information is present in the header:

  • ‘old file name’ + ‘_PHENOTYPE’, range of values per phenotype.


Rename matrix

Update the headers (first row and leftmost column) of CSV formatted matrix files generated by PanTools. This can be useful when you have already performed the analysis but want a different phylogenetic tree, for example one with fewer genomes or different phenotype information. If no --phenotype is included, headers are changed to contain only genome numbers or sequence identifiers.
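As a rough illustration of what updating the first row and leftmost column involves, the Python sketch below rewrites the labels of a small matrix. It is not the PanTools implementation: the function name rename_matrix_headers, the phenotype_by_genome mapping, the "number_phenotype" label layout, and the "Unknown" fallback are all assumptions made for the example.

```python
import csv
import io


def rename_matrix_headers(csv_text, phenotype_by_genome=None, numbers=True):
    """Rewrite the first row and leftmost column of a CSV matrix.

    Hypothetical helper for illustration; not part of PanTools.
    """
    rows = [r for r in csv.reader(io.StringIO(csv_text)) if r]

    def new_label(old_label):
        # Assumption: every existing label starts with the genome number,
        # optionally followed by "_" and extra information.
        genome = old_label.split("_")[0]
        parts = []
        # Without a phenotype, only the genome number remains in the label.
        if numbers or phenotype_by_genome is None:
            parts.append(genome)
        if phenotype_by_genome is not None:
            parts.append(phenotype_by_genome.get(genome, "Unknown"))
        return "_".join(parts)

    rows[0] = [rows[0][0]] + [new_label(label) for label in rows[0][1:]]
    for row in rows[1:]:
        row[0] = new_label(row[0])

    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows(rows)
    return out.getvalue()


# Made-up matrix and phenotype values.
example = "Genomes,1_old,2_old\n1_old,0.0,0.12\n2_old,0.12,0.0\n"
species = {"1": "atrosepticum", "2": "brasiliense"}

print(rename_matrix_headers(example))
print(rename_matrix_headers(example, phenotype_by_genome=species, numbers=False))
```

The first call strips the labels down to genome numbers, as happens when no --phenotype is given. The second call loosely corresponds to combining --no-numbers with --phenotype.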

Parameters

<databaseDirectory>

Path to the database root directory.

<matrixFile>

A matrix file with numerical values.

Options

--include/-i

Only include a selection of genomes in the new matrix file.

--exclude/-e

Exclude a selection of genomes from the new matrix file.

--phenotype/-p

A phenotype name, used to include phenotype information in the headers.

--[no-]numbers

Include or exclude genome numbers in the headers. Numbers are included by default.

--genome/--sequence

Specify whether the headers contain genome numbers or sequence identifiers.

--phasing

Include phasing information in the headers.

Example commands

$ pantools rename_matrix pecto_DB pecto_DB/ANI/fastANI/ANI_distance_matrix.csv
$ pantools rename_matrix --no-numbers --phenotype=species pecto_DB pecto_DB/ANI/fastANI/ANI_distance_matrix.csv

Output

Output is written to the same directory as the selected input file.

  • ‘old file name’ + ‘_RENAMED’, the original matrix file with changed headers.