Expanding the nitrogen regulatory protein superfamily: Homology detection at below random sequence identity

Lisa N Kinch; Nick V Grishin

doi:10.1002/prot.10110

Proteins. 2002 Jul 1;48(1):75-84. doi: 10.1002/prot.10110.

Authors

Lisa N Kinch¹, Nick V Grishin

Affiliation

¹ Howard Hughes Medical Institute, and Department of Biochemistry, University of Texas Southwestern Medical Center, Dallas, Texas, USA.

PMID: 12012339

Abstract

Nitrogen regulatory (PII) proteins are signal transduction molecules involved in controlling nitrogen metabolism in prokaryotes. PII proteins integrate the signals of intracellular nitrogen and carbon status into the control of enzymes involved in nitrogen assimilation. Using elaborate sequence similarity detection schemes, we show that five clusters of orthologs (COGs) and several small divergent protein groups belong to the PII superfamily and predict their structure to be a (βαβ)₂ ferredoxin-like fold. Proteins from the newly emerged PII superfamily are present in all major phylogenetic lineages. The PII homologs are quite diverse, with below random (as low as 1%) pairwise sequence identities between some members of distant groups. Despite this sequence diversity, evidence suggests that the different subfamilies retain the PII trimeric structure important for ligand-binding site formation and maintain a conservation of conservations at residue positions important for PII function. Because most of the orthologous groups within the PII superfamily are composed entirely of hypothetical proteins, our remote homology-based structure prediction provides the only information about them. Analogous to structural genomics efforts, such prediction gives clues to the biological roles of these proteins and allows us to hypothesize about locations of functional sites on model structures or rationalize available experimental information. For instance, conserved residues in one of the families map in close proximity to each other on the PII structure, allowing for a possible metal-binding site in the proteins encoded by the locus known to affect sensitivity to divalent metal ions. The presented analysis pushes the limits of sequence similarity searches and exemplifies one of the extreme cases of reliable sequence-based structure prediction.
In conjunction with structural genomics efforts to shed light on protein function, our strategies make it possible to detect homology between highly diverse sequences and are aimed at understanding the most remote evolutionary connections in the protein world.
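
As an illustration of the "pairwise sequence identity" figure quoted in the abstract, the following minimal Python sketch computes percent identity between two sequences that are already aligned. It is not the authors' method. The function name `percent_identity`, the `-` gap character, and the choice to count only columns where both sequences have a residue are assumptions made here; other definitions use a different denominator (for example, the full alignment length) and give different values. The sequences in the usage lines are made-up toy strings, not PII proteins.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over alignment columns where neither sequence has a gap.

    Assumption: both inputs come from the same pairwise alignment, so they
    have equal length and gaps are marked with the `gap` character.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")

    compared = 0   # columns with a residue in both sequences
    identical = 0  # compared columns where the residues match

    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap or res_b == gap:
            continue  # skip gapped columns (one possible convention)
        compared += 1
        if res_a.upper() == res_b.upper():
            identical += 1

    if compared == 0:
        return 0.0  # no comparable columns; identity is reported as zero here
    return 100.0 * identical / compared


# Hypothetical toy alignment, for illustration only.
toy_a = "MKLIEAIV-KP"
toy_b = "MRKVTA-VGKP"
print(f"{percent_identity(toy_a, toy_b):.1f}% identity")
```

Under this convention, a value near or below what unrelated sequences typically share cannot by itself support homology, which is why the abstract stresses detection schemes that go beyond simple pairwise comparison.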

MeSH terms

Amino Acid Sequence
Animals
Bacterial Proteins*
Binding Sites
DNA-Binding Proteins / chemistry*
DNA-Binding Proteins / classification*
DNA-Binding Proteins / physiology
Evolution, Molecular
Hydrophobic and Hydrophilic Interactions
Ligands
Metals / chemistry
Models, Molecular
Molecular Sequence Data
Nitrogen / metabolism
PII Nitrogen Regulatory Proteins
Protein Folding
Protein Structure, Secondary
Sensitivity and Specificity
Sequence Alignment
Sequence Analysis, Protein / methods*
Sequence Homology, Amino Acid
Trans-Activators*
Transcription Factors*

Substances

Bacterial Proteins
DNA-Binding Proteins
Ligands
Metals
PII Nitrogen Regulatory Proteins
Trans-Activators
Transcription Factors
Nitrogen