Differences between pangenome and panproteome

PanTools offers functionalities to build and analyze a pangenome or panproteome.

A pangenome is constructed from genome and annotation files. First, the genome sequences are k-merized and compressed into a De Bruijn graph. Genes and other features from the annotation files are then integrated into the pangenome as ‘gene’, ‘mRNA’ and ‘CDS’ nodes. Gene start and stop positions are stored in the graph as relationships that connect the annotation layer to the nucleotide layer. Finally, the protein sequences can be clustered into homology groups, which connect homologous proteins from different genomes.
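As a rough illustration of the first step, the sketch below k-merizes two short sequences and stores them in a toy De Bruijn graph. It is not PanTools' implementation: the function names `kmerize` and `build_de_bruijn` are hypothetical, and the sketch ignores reverse complements, graph compression and the annotation layer.

```python
from collections import defaultdict


def kmerize(sequence, k):
    """Return all overlapping k-mers of a sequence, in order."""
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]


def build_de_bruijn(sequences, k):
    """Map each (k-1)-mer node to the set of (k-1)-mer nodes that follow it."""
    graph = defaultdict(set)
    for seq in sequences:
        for kmer in kmerize(seq, k):
            # Each k-mer is an edge from its prefix to its suffix.
            graph[kmer[:-1]].add(kmer[1:])
    return graph


# Two toy "genomes" (hypothetical data) that differ at one position.
genomes = ["ACGTAC", "ACGTTC"]
graph = build_de_bruijn(genomes, k=3)

for node in sorted(graph):
    print(node, "->", sorted(graph[node]))
```

The shared prefix `ACGT` is stored only once, and the graph branches at node `GT`, where the two sequences diverge. This sharing of identical k-mers is what allows many genomes to be represented compactly in one graph.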

A panproteome is built from protein sequences only, ignoring the underlying genome structure. As in a pangenome, the protein sequences are clustered into homology groups, which serve as the main input for many functionalities.
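To make the idea of a homology group concrete, the sketch below groups proteins by taking the connected components of a similarity graph. This is a deliberately simplified stand-in and not the clustering procedure PanTools uses. The function name `homology_groups`, the protein identifiers and the list of similar pairs are all hypothetical.

```python
def homology_groups(proteins, similar_pairs):
    """Group proteins that are linked, directly or indirectly, by similarity."""
    parent = {p: p for p in proteins}

    def find(p):
        # Follow parent links to the group representative (with path halving).
        while parent[p] != p:
            parent[p] = parent[parent[p]]
            p = parent[p]
        return p

    for a, b in similar_pairs:
        parent[find(a)] = find(b)

    groups = {}
    for p in proteins:
        groups.setdefault(find(p), set()).add(p)
    return sorted((sorted(g) for g in groups.values()), key=lambda g: g[0])


# Hypothetical proteins from three genomes (g1, g2, g3).
proteins = ["g1_p1", "g2_p1", "g3_p1", "g1_p2", "g2_p2"]
# Hypothetical pairs judged similar enough to be homologous.
similar_pairs = [("g1_p1", "g2_p1"), ("g2_p1", "g3_p1"), ("g1_p2", "g2_p2")]

groups = homology_groups(proteins, similar_pairs)
print(groups)
```

Each resulting group collects proteins from different genomes, which is the unit that most downstream analyses operate on.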

In addition to the single layer of a panproteome and the three layers of a pangenome, a functional layer can be included in both types of database. This layer holds annotations from multiple functional annotation databases (e.g. GO, PFAM) and connects proteins with a shared function.
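The way a functional layer connects proteins can be pictured as an inverted index from functional terms to the proteins that carry them. The sketch below is only a conceptual model, not the PanTools database schema. The function name `proteins_by_function`, the protein identifiers and the term assignments are illustrative assumptions.

```python
from collections import defaultdict


def proteins_by_function(annotations):
    """Invert a protein -> terms mapping and keep terms shared by 2+ proteins."""
    index = defaultdict(set)
    for protein, terms in annotations.items():
        for term in terms:
            index[term].add(protein)
    return {term: prots for term, prots in index.items() if len(prots) > 1}


# Hypothetical annotations: the assignments are made up for illustration.
annotations = {
    "g1_p1": {"GO:0005524", "PF00005"},
    "g2_p1": {"GO:0005524"},
    "g2_p2": {"PF00005"},
    "g3_p7": {"GO:0003677"},
}

shared = proteins_by_function(annotations)
for term in sorted(shared):
    print(term, sorted(shared[term]))
```

Because the index depends only on proteins and their annotations, the same model applies to pangenomes and panproteomes alike.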

Because a panproteome contains only a protein layer and, optionally, a functional layer, not all functions are available for it. The tables below list which functions can be used with pangenomes and panproteomes.

Fig. 8 Schematic of the genome, annotation, and protein layers of a pangenome database. Figure taken from “Efficient inference of homologs in large eukaryotic pan-proteomes”.

Available functions

Construct a pangenome

==========================  =========  ===========
Function                    Pangenome  Panproteome
==========================  =========  ===========
Build pangenome             YES        NO
Build panproteome           NO         YES
Add annotations             YES        NO
Add genomes                 YES        NO
Group                       YES        YES
Optimal grouping            YES        YES
Change grouping             YES        YES
BUSCO protein               YES        YES
Add phenotype               YES        YES
Add functional annotations  YES        YES
Add antiSMASH               YES        NO
Remove nodes                YES        YES
Move or remove grouping     YES        YES
==========================  =========  ===========

Pangenome characterization

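=======================================  =========  ===========
Function                                 Pangenome  Panproteome
=======================================  =========  ===========
Statistics                               YES        YES
Gene classification                      YES        YES
Core unique thresholds                   YES        YES
Grouping overview                        YES        YES
Pangenome structure for homology groups  YES        YES
Pangenome structure for k-mers           YES        NO
K-mer classification                     YES        NO
Functional classification                YES        YES
Functional annotation overview           YES        YES
=======================================  =========  ===========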

Explore the pangenome

=====================================  =========  ===========
Function                               Pangenome  Panproteome
=====================================  =========  ===========
Locate genes                           YES        NO
mRNAs connected to function            YES        NO
Find gene                              YES        NO
GO enrichment                          YES        YES
Show GO                                YES        YES
Compare GO                             YES        YES
Compare BGC                            YES        NO
Alignment of homology group            YES        YES
Alignment of multiple homology groups  YES        YES
Alignment of genomic regions           YES        NO
Order matrix                           YES        YES
Rename matrix                          YES        YES
Retrieve genomes                       YES        NO
Retrieve regions                       YES        NO
Retrieve features                      YES        NO
=====================================  =========  ===========

Phylogeny

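====================  =========  ===========
Function              Pangenome  Panproteome
====================  =========  ===========
Core SNP tree         YES        YES
K-mer distance tree   YES        NO
Gene distance tree    YES        YES
ANI tree              YES        NO
MLSA                  YES        NO
Rename phylogeny      YES        YES
Create tree template  YES        YES
====================  =========  ===========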

Read mapping

========  =========  ===========
Function  Pangenome  Panproteome
========  =========  ===========
Map       YES        NO
========  =========  ===========