Welcome to PhaLP: the database of Phage Lytic Proteins.

Phage lytic proteins are currently the most advanced alternative antibacterials under clinical investigation. These enzymes originate from bacteriophages and rapidly degrade bacterial peptidoglycan, resulting in immediate cell death. They offer a necessary response to the alarming threat that antibiotic resistance poses to global health care systems. One of their most important features is that they can be considered a novel class of antibiotics, coined enzybiotics, with the potential to target any specific bacterial pathogen. A growing community of companies and researchers is therefore investigating their applications and engineering their properties to kill a broad diversity of bacteria. Given the high attrition rates during clinical evaluation, it is essential to bring more candidates into the preclinical pipeline so that phage lytic proteins can be translated into diverse new therapies. To be successful, it is crucial to make well-considered selections of phage lytic proteins during early research stages.
The PhaLP database provides an extensive, high-quality and up-to-date collection of data that is highly searchable by researchers both within and outside the field. It serves as a portal to interact with the current diversity available in biological databases. In addition to basic sequence data (the protein sequence and the coding sequence (CDS)), the PhaLP database provides information on the phage, its host(s), conserved domains, enzymatic activity, gene ontologies, 3D structures, experimental evidence, etc.


To interact with the PhaLP database, two user interfaces are provided:

  • To explore the database or quickly find a specific entry, a basic browser is available. It is an interactive table with basic information on each protein entry and the phage encoding it. Entries can be filtered and sorted by each column. Clicking on an accession number takes you to an overview page showing all data linked to that entry (example).

  • For more advanced searching, a BioMart is available. This allows you to search on almost every field of every table contained in the database. You can also select which fields are shown in the final table. After choosing your settings, the resulting table can be browsed and downloaded as a tab-separated values file, which can easily be imported into Excel or any analysis software of your choice. A tutorial on how to use the BioMart is available here.
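
A downloaded tab-separated export can also be processed with a short script. The following is a minimal sketch in Python using only the standard library. The column names (`accession`, `protein_type`, `host`) and the sample rows are hypothetical placeholders, not actual PhaLP fields or entries; replace them with the headers that appear in your own export.

```python
import csv
import io

# Hypothetical stand-in for a downloaded BioMart export. A real file will
# contain whichever columns were selected in the BioMart interface.
SAMPLE_TSV = (
    "accession\tprotein_type\thost\n"
    "ACC0001\tendolysin\tEscherichia coli\n"
    "ACC0002\tVAL\tStaphylococcus aureus\n"
    "ACC0003\tendolysin\tStaphylococcus aureus\n"
)


def read_tsv(handle):
    """Parse a tab-separated export into a list of dicts keyed by column header."""
    return list(csv.DictReader(handle, delimiter="\t"))


def filter_rows(rows, column, value):
    """Keep only the rows whose `column` equals `value` exactly."""
    return [row for row in rows if row.get(column) == value]


rows = read_tsv(io.StringIO(SAMPLE_TSV))
staph = filter_rows(rows, "host", "Staphylococcus aureus")
for row in staph:
    print(row["accession"], row["protein_type"])
```

To read a real export, pass an open file handle (for example `open("export.tsv", newline="", encoding="utf-8")`, where the file name is again a placeholder) to `read_tsv` instead of the in-memory sample.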


The latest PhaLP version as well as all previous versions are available as MySQL dump files. These allow for more advanced querying and for integrating PhaLP into a custom pipeline.

Biological background

Phage lytic proteins are essential for the successful completion of the lytic life cycle of bacteriophages (panel A). There are two types of natural phage lytic proteins: virion-associated lysins (VALs) and endolysins. They are required in two stages (stages 1 and 6) of the life cycle. Initially, phages need to cross the bacterial cell wall and thereby overcome its major structural component: peptidoglycan. In the infection stage (stage 1), the virion particle encounters a bacterial cell and, after binding, needs to inject its genomic material into the cell. To achieve this, so-called VALs make a small pore in the peptidoglycan layer, allowing the phage genome to cross (stage 2; panel B, left). After genome replication (stage 3), virion production (stage 4) and assembly (stage 5), the phage progeny is ready to be set free. Meanwhile, endolysins have been accumulating in the cytosol and holins in the cytoplasmic membrane. Once the holin concentration has reached a certain threshold, the holins multimerize and form pores, allowing the endolysins to enter the periplasmic space (panel B, right). Here, the endolysins degrade the peptidoglycan and compromise the structural integrity of the cell, resulting in lysis and cell death (stage 6).

Database architecture

The MySQL-based PhaLP database integrates nine data types (proteins, phages, hosts, conserved domains, coding sequences (CDSs), gene ontologies (GOs), enzymatic activities (ECs), tertiary structures and experimental evidence) originating from multiple source databases (UniProt, UniParc, NCBI taxonomy, Virus-Host DB, InterPro, GenBank, QuickGO, ExPASy ENZYME database, PDB and PubMed). The EER diagram below provides an overview of these data types (a description, the corresponding MySQL tables, the number of entries for each data type in PhaLP v 2019_10 and the data source) and their mutual relationships.

For efficient storage and querying, the nine data types are stored in fourteen tables at the MySQL level. The more detailed EER diagram below gives an overview of the fourteen MySQL tables, their mutual relationships and their columns.
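
To give an impression of the kind of query a relational layout like this supports, the sketch below joins proteins to phages and hosts through a link table. It is illustrative only: it uses Python's built-in `sqlite3` module with an in-memory database as a stand-in for MySQL, and every table name, column name and row is a hypothetical placeholder rather than the actual PhaLP schema. Consult the EER diagram for the real tables and columns.

```python
import sqlite3

# In-memory SQLite database standing in for a MySQL server loaded with the dump.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Hypothetical tables; NOT the real PhaLP schema.
    CREATE TABLE phages (phage_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE hosts (host_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE proteins (
        accession TEXT PRIMARY KEY,
        phage_id INTEGER REFERENCES phages(phage_id)
    );
    -- Link table: a phage can have several hosts, and a host several phages.
    CREATE TABLE phage_hosts (
        phage_id INTEGER REFERENCES phages(phage_id),
        host_id INTEGER REFERENCES hosts(host_id),
        PRIMARY KEY (phage_id, host_id)
    );
    INSERT INTO phages VALUES (1, 'Phage A'), (2, 'Phage B');
    INSERT INTO hosts VALUES (1, 'Host X'), (2, 'Host Y');
    INSERT INTO proteins VALUES ('ACC0001', 1), ('ACC0002', 2), ('ACC0003', 2);
    INSERT INTO phage_hosts VALUES (1, 1), (2, 1), (2, 2);
    """
)


def proteins_for_host(connection, host_name):
    """Return accessions of proteins encoded by phages linked to `host_name`."""
    query = """
        SELECT p.accession
        FROM proteins AS p
        JOIN phage_hosts AS ph ON ph.phage_id = p.phage_id
        JOIN hosts AS h ON h.host_id = ph.host_id
        WHERE h.name = ?
        ORDER BY p.accession
    """
    return [row[0] for row in connection.execute(query, (host_name,))]


print(proteins_for_host(conn, "Host Y"))
```

The same join pattern should carry over to the MySQL dump once it is loaded into a server, although the connection code and the parameter placeholder syntax depend on the MySQL client library used.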