A maximum likelihood method for detecting functional divergence at individual codon sites, with application to gene family evolution

Joseph P Bielawski; Ziheng Yang

doi:10.1007/s00239-004-2597-8

J Mol Evol. 2004 Jul;59(1):121-32. doi: 10.1007/s00239-004-2597-8.

Authors

Joseph P Bielawski¹, Ziheng Yang

Affiliation

¹ Department of Biology, University College London, London, WC1E 6BT, UK. j.bielawski@dal.ca

PMID: 15383915

Abstract

The tailoring of existing genetic systems to new uses is called genetic co-option. Mechanisms of genetic co-option have been difficult to study because functionally important changes are hard to identify. One way to study genetic co-option in protein-coding genes is to identify those amino acid sites that have experienced changes in selective pressure following a genetic co-option event. In this paper we present a maximum likelihood method for measuring divergent selective pressures and identifying the amino acid sites affected by divergent selection. The method is based on a codon model of evolution and uses the nonsynonymous-to-synonymous rate ratio (omega) as a measure of selection on the protein, with omega = 1, < 1, and > 1 indicating neutral evolution, purifying selection, and positive selection, respectively. The model allows variation in omega among sites, with a fraction of sites evolving under divergent selective pressures. Divergent selection is indicated by different omega values between clades, such as between paralogous clades of a gene family. We applied the codon model to duplication followed by functional divergence of (i) the epsilon and gamma globin genes and (ii) the eosinophil cationic protein (ECP) and eosinophil-derived neurotoxin (EDN) genes. In both cases likelihood ratio tests suggested the presence of sites evolving under divergent selective pressures. Results of the epsilon and gamma globin analysis suggested that divergent selective pressures might be a consequence of a weakened relationship between fetal hemoglobin and 2,3-diphosphoglycerate. We suggest that empirical Bayesian identification of sites evolving under divergent selective pressures, combined with structural and functional information, can provide a valuable framework for identifying and studying mechanisms of genetic co-option. Limitations of the new method are discussed.
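
The three quantitative ideas in the abstract (interpreting omega, comparing nested models with a likelihood ratio test, and assigning sites to classes by empirical Bayes) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the log-likelihoods, the degrees of freedom, and the class proportions below are all hypothetical, and the sketch assumes the usual chi-square approximation for the test statistic and that per-site likelihoods under each site class are already available from a fitted codon model.

```python
import math


def classify_omega(omega, tol=1e-9):
    """Interpret omega (dN/dS) as described in the abstract."""
    if abs(omega - 1.0) <= tol:
        return "neutral"
    return "purifying" if omega < 1.0 else "positive"


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable, integer df >= 1.

    Starts from the closed forms for df = 1 or df = 2 and applies the
    recurrence Q(a + 1, h) = Q(a, h) + h**a * exp(-h) / Gamma(a + 1),
    where h = x / 2 and a = df / 2.
    """
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 1:
        q, a = math.erfc(math.sqrt(half)), 0.5  # df = 1
    else:
        q, a = math.exp(-half), 1.0             # df = 2
    while 2 * a < df:
        q += half ** a * math.exp(-half) / math.gamma(a + 1.0)
        a += 1.0
    return q


def likelihood_ratio_test(lnL_null, lnL_alt, df):
    """Return (statistic, p-value) for two nested models.

    ASSUMPTION: the statistic is compared with a chi-square
    distribution whose df equals the number of extra parameters.
    """
    stat = max(0.0, 2.0 * (lnL_alt - lnL_null))
    return stat, chi2_sf(stat, df)


def site_class_posteriors(proportions, site_likelihoods):
    """Empirical Bayes: P(class k | site) is proportional to p_k * L_k."""
    weighted = [p * lk for p, lk in zip(proportions, site_likelihoods)]
    total = sum(weighted)
    return [w / total for w in weighted]


# Hypothetical numbers, for illustration only.
stat, p_value = likelihood_ratio_test(lnL_null=-2510.4, lnL_alt=-2503.1, df=2)
posteriors = site_class_posteriors([0.6, 0.3, 0.1], [1e-4, 2e-4, 9e-4])
print(classify_omega(0.15), round(stat, 1), p_value, posteriors)
```

In a sketch like this, a site would be flagged as a candidate for divergent selection when the posterior probability of the divergent class exceeds a chosen cutoff; the cutoff is a user decision and is not stated in the abstract.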

Publication types

Comparative Study
Research Support, Non-U.S. Gov't

MeSH terms

Amino Acids / genetics*
Bayes Theorem
Codon / genetics*
Eosinophil Cationic Protein / genetics
Eosinophil-Derived Neurotoxin / genetics
Evolution, Molecular*
Gene Duplication
Genetic Variation
Globins / genetics
Likelihood Functions
Models, Genetic*
Multigene Family / genetics*
Phylogeny
Selection, Genetic

Substances

Amino Acids
Codon
Globins
Eosinophil-Derived Neurotoxin
Eosinophil Cationic Protein