Insights into the Evolution of Hydroxyproline-Rich Glycoproteins from 1000 Plant Transcriptomes

Plant Physiol. 2017 Jun;174(2):904-921. doi: 10.1104/pp.17.00295. Epub 2017 Apr 26.


The carbohydrate-rich cell walls of land plants and algae have been the focus of much interest given the value of cell wall-based products to our current and future economies. Hydroxyproline-rich glycoproteins (HRGPs), a major group of wall glycoproteins, play important roles in plant growth and development, yet little is known about how they have evolved in parallel with the polysaccharide components of walls. We investigate the origins and evolution of the HRGP superfamily, which is commonly divided into three major multigene families: the arabinogalactan proteins (AGPs), extensins (EXTs), and proline-rich proteins. Using a newly developed bioinformatics pipeline based on motif and amino acid bias, we identified HRGPs in sequences from the 1000 Plants transcriptome project. Our analyses provide new insights into the evolution of HRGPs across major evolutionary milestones, including the transition to land and the early radiation of angiosperms. Significantly, data mining reveals the origin of glycosylphosphatidylinositol (GPI)-anchored AGPs in green algae and a 3- to 4-fold increase in GPI-AGPs in liverworts and mosses. Cross-linking (CL)-EXTs are first detected in bryophytes, which suggests that CL-EXTs arose through the juxtaposition of preexisting SPn EXT glycomotifs with refined Y-based motifs. We also detected the loss of CL-EXTs in a few lineages, including the grass family (Poaceae), that have a cell wall composition distinct from other monocots and eudicots. A key challenge in HRGP research is tracking individual HRGPs throughout evolution. Using the 1000 Plants output, we were able to find putative orthologs of Arabidopsis pollen-specific GPI-AGPs in basal eudicots.
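
To make the idea of screening by motif and amino acid bias concrete, the following is a minimal Python sketch, not the pipeline used in the study. The function names, the regular expressions for SPn runs and Y-based motifs, the choice of Pro/Ala/Ser/Thr as the biased residue set, and the 0.45 cutoff are all illustrative assumptions rather than values taken from the abstract.

```python
import re

# Hypothetical patterns, chosen only for illustration:
# SP_RUN: a Ser followed by 3 to 5 Pro residues (an "SPn"-style run).
# Y_MOTIF: two Tyr residues separated by any single residue.
SP_RUN = re.compile(r"SP{3,5}")
Y_MOTIF = re.compile(r"Y.Y")


def aa_bias(seq, residues="PAST"):
    """Return the fraction of seq made up of the given residues."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return sum(seq.count(r) for r in residues) / len(seq)


def count_motifs(seq, pattern):
    """Count non-overlapping matches of a compiled pattern in seq."""
    return len(pattern.findall(seq.upper()))


def screen(seq, bias_cutoff=0.45):
    """Summarize composition bias and motif counts for one protein.

    bias_cutoff is a placeholder threshold (an assumption), not a
    published parameter.
    """
    bias = aa_bias(seq)
    sp_runs = count_motifs(seq, SP_RUN)
    y_motifs = count_motifs(seq, Y_MOTIF)
    return {
        "bias": bias,
        "sp_runs": sp_runs,
        "y_motifs": y_motifs,
        # Candidate if composition is biased and at least one SPn run exists.
        "passes": bias >= bias_cutoff and sp_runs >= 1,
    }


if __name__ == "__main__":
    # Toy sequence, not a real protein.
    print(screen("SPPPPYVYSPPPPKK"))
```

In this sketch a sequence that carries both SPn runs and Y-based motifs would be flagged as a possible CL-EXT-like candidate, whereas a biased sequence without Y-based motifs would not; a real classification would need signal peptide, GPI-anchor, and domain checks that are omitted here.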

MeSH terms

  • Amino Acid Motifs
  • Amino Acid Sequence
  • Evolution, Molecular*
  • Glycoproteins / chemistry
  • Glycoproteins / genetics
  • Glycoproteins / metabolism*
  • Glycosylphosphatidylinositols
  • Hydroxyproline / metabolism*
  • Likelihood Functions
  • Mucoproteins / metabolism
  • Phylogeny
  • Plant Proteins / chemistry
  • Plant Proteins / genetics*
  • Plant Proteins / metabolism
  • Plants / genetics*
  • Time Factors
  • Transcriptome / genetics*

Substances

  • Glycoproteins
  • Glycosylphosphatidylinositols
  • Mucoproteins
  • Plant Proteins
  • arabinogalactan proteins
  • Hydroxyproline