Structure-Based Phylogenetic Analysis of the Lipocalin Superfamily

PLoS One. 2015 Aug 11;10(8):e0135507. doi: 10.1371/journal.pone.0135507. eCollection 2015.

Abstract

Lipocalins constitute a superfamily of extracellular proteins that are found in all three kingdoms of life. Although very divergent in their sequences and functions, they show remarkable similarity in their 3-D structures. Lipocalins bind and transport small hydrophobic molecules. Earlier sequence-based phylogenetic studies of lipocalins highlighted that they have a long evolutionary history. However, the molecular and structural basis of their functional diversity is not completely understood. The main objective of the present study is to understand the functional diversity of the lipocalins using a structure-based phylogenetic approach. The present study, with 39 protein domains from the lipocalin superfamily, suggests that the clusters of lipocalins obtained by structure-based phylogeny correspond well with their functional diversity. Detailed analysis of each of the clusters and sub-clusters reveals that the 39 lipocalin domains cluster according to their mode of ligand binding, even though the clustering was performed on the basis of gross domain structure. The outliers in the phylogenetic tree are often from single-member families. The structure-based phylogenetic approach has also provided pointers for assigning putative functions to the domains of unknown function in the lipocalin family. The approach employed in the present study can be used in the future for the functional identification of new lipocalin proteins and may be extended to other protein families whose members show poor sequence similarity but high structural similarity.
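
The general idea of clustering domains by structural dissimilarity can be illustrated with a minimal sketch. The abstract does not state which structural distance measure or tree-building algorithm the authors used, so everything here is an assumption: the function name `upgma`, the choice of average-linkage (UPGMA) clustering, the labels `dom_A` to `dom_D`, and the distance values, which are invented toy numbers and not data from the study.

```python
def upgma(labels, matrix):
    """Average-linkage (UPGMA) clustering from a symmetric distance matrix.

    Returns (tree, heights): a nested-tuple tree and the merge heights.
    NOTE: illustrative stand-in only; not the method used in the paper.
    """
    n = len(labels)
    # Active clusters: id -> (subtree, number of leaves)
    clusters = {i: (label, 1) for i, label in enumerate(labels)}
    # Distances between active clusters, keyed by (smaller id, larger id)
    dist = {(i, j): matrix[i][j] for i in range(n) for j in range(i + 1, n)}
    heights = []
    next_id = n
    while len(clusters) > 1:
        # Closest pair of clusters; ties are broken by id order
        (a, b), d_ab = min(dist.items(), key=lambda kv: (kv[1], kv[0]))
        tree_a, size_a = clusters.pop(a)
        tree_b, size_b = clusters.pop(b)
        # Size-weighted average distance from the merged cluster to the rest
        new_dist = {}
        for c in clusters:
            d_ac = dist[(min(a, c), max(a, c))]
            d_bc = dist[(min(b, c), max(b, c))]
            new_dist[(c, next_id)] = (
                (size_a * d_ac + size_b * d_bc) / (size_a + size_b)
            )
        # Drop entries that involve the two clusters just merged
        dist = {k: v for k, v in dist.items() if a not in k and b not in k}
        dist.update(new_dist)
        clusters[next_id] = ((tree_a, tree_b), size_a + size_b)
        heights.append(d_ab / 2)  # node height is half the merge distance
        next_id += 1
    tree, _ = next(iter(clusters.values()))
    return tree, heights


# Hypothetical pairwise structural dissimilarities (toy values, not real data)
labels = ["dom_A", "dom_B", "dom_C", "dom_D"]
matrix = [
    [0.00, 0.25, 0.75, 1.00],
    [0.25, 0.00, 0.75, 1.00],
    [0.75, 0.75, 0.00, 0.50],
    [1.00, 1.00, 0.50, 0.00],
]

tree, heights = upgma(labels, matrix)
print(tree)
print(heights)
```

With these toy values, `dom_A` and `dom_B` join first and `dom_C` and `dom_D` second. In a real analysis, the matrix would come from pairwise structural comparison of the domains, and the resulting clusters would then be compared against known ligand-binding modes.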

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Amino Acid Sequence
  • Binding Sites
  • Carrier Proteins / chemistry
  • Carrier Proteins / metabolism
  • Fatty Acids / chemistry
  • Fatty Acids / metabolism
  • Lipocalins / chemistry*
  • Lipocalins / classification
  • Lipocalins / genetics*
  • Lipocalins / metabolism
  • Models, Molecular
  • Molecular Sequence Data
  • Multigene Family*
  • Phylogeny*
  • Protein Binding
  • Protein Conformation*
  • Protein Interaction Domains and Motifs
  • Quantitative Structure-Activity Relationship
  • Sequence Alignment

Substances

  • Carrier Proteins
  • Fatty Acids
  • Lipocalins

Grant support

This research is supported by the Department of Biotechnology, Government of India, as well as by the Mathematical Biology Program sponsored by the Department of Science and Technology, Government of India.