Probabilistic cross-species inference of orthologous genomic regions created by whole-genome duplication in yeast

Genetics. 2008 Jul;179(3):1681-92. doi: 10.1534/genetics.107.074450. Epub 2008 Jun 18.


Identification of orthologous genes across species becomes challenging in the presence of a whole-genome duplication (WGD). We present a probabilistic method for identifying orthologs that considers all possible orthology/paralogy assignments for a set of genomes with a shared WGD (here five yeast species). This approach allows us to estimate how confident we can be in the orthology assignments in each genomic region. Two inferences produced by this model are indicative of purifying selection acting to prevent duplicate gene loss. First, our model suggests that there are significant differences (up to a factor of seven) in duplicate gene half-life. Second, we observe differences between the genes that the model infers to have been lost soon after WGD and those lost more recently. Gene losses soon after WGD appear uncorrelated with gene expression level and knockout fitness defect. However, later losses are biased toward genes whose paralogs have high expression and large knockout fitness defects, as well as showing biases toward certain functional groups such as ribosomal proteins. We suggest that while duplicate copies of some genes may be lost neutrally after WGD, another set of genes may be initially preserved in duplicate by natural selection for reasons including dosage.
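To make the "factor of seven" in duplicate gene half-life concrete, here is a minimal sketch. It assumes duplicates are lost at a constant rate (simple exponential decay), which the abstract does not state; the function names and the arbitrary time units are hypothetical and used only for illustration. Under that assumption, a sevenfold difference in half-life corresponds to a sevenfold difference in the per-unit-time loss rate.

```python
import math


def loss_rate_from_half_life(half_life):
    """Constant loss rate implied by a half-life, assuming exponential decay."""
    return math.log(2) / half_life


def fraction_retained(t, half_life):
    """Expected fraction of duplicate pairs still retained at time t."""
    return 0.5 ** (t / half_life)


# Hypothetical half-lives in arbitrary time units (not values from the paper).
fast_half_life = 1.0
slow_half_life = 7.0 * fast_half_life  # the sevenfold difference

for t in (1.0, 7.0):
    fast = fraction_retained(t, fast_half_life)
    slow = fraction_retained(t, slow_half_life)
    print(f"t={t}: fast-loss class retains {fast:.4f}, slow-loss class retains {slow:.4f}")
```

After seven fast half-lives, less than 1% of the fast-loss class remains in duplicate, while half of the slow-loss class is still retained. The sketch illustrates only the arithmetic of half-lives, not the probabilistic orthology model itself.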

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Candida glabrata / genetics
  • Gene Duplication*
  • Genes, Fungal
  • Genome, Fungal / genetics*
  • Models, Genetic
  • Phylogeny
  • Saccharomyces cerevisiae / genetics
  • Sequence Homology, Nucleic Acid*
  • Species Specificity
  • Time Factors
  • Yeasts / genetics*