Phage lytic proteins are a clinically advanced class of novel enzyme-based antibiotics, so-called enzybiotics. A growing community of researchers is developing phage lytic proteins with a view to their use as enzybiotics. Successful translation of enzybiotics to the market requires a well-considered selection of phage lytic proteins at early research stages. Here, we introduce PhaLP, a database of phage lytic proteins that serves as an open portal to facilitate their development. PhaLP is a comprehensive, easily accessible and automatically updated database (currently 16,095 entries). Capitalizing on the rich content of PhaLP, we have mapped the high diversity of natural phage lytic proteins and conducted analyses at three levels to gain insight into their host-specific evolution. First, we provide an overview of the modular diversity. Second, data mining and interpretable machine learning approaches were adopted to reveal host-specific design rules for domain architectures in endolysins. Lastly, the evolution of phage lytic proteins at the protein sequence level was explored, revealing host-specific clusters. In sum, PhaLP can act as a starting point for the broad community of enzybiotic researchers, while the steadily improving evolutionary insights will serve as a natural inspiration for protein engineers.
Keywords: biological database; conserved protein domains; endolysins; machine learning; phage lytic proteins; protein architectures.