Inferring Orthologs: Open Questions and Perspectives
- PMID: 26966373
- PMCID: PMC4778853
- DOI: 10.4137/GEI.S37925
Abstract
With the increasing number of sequenced genomes and their comparisons, the detection of orthologs is crucial for reliable functional annotation and evolutionary analyses of genes and species. Yet, the dynamic remodeling of genome content through gain, loss, and transfer of genes, and through segmental and whole-genome duplication, hinders reliable orthology detection. Moreover, the lack of direct functional evidence and the questionable quality of some available genome sequences and annotations present additional difficulties in assessing orthology. This article reviews the existing computational methods and their potential accuracy in the high-throughput era of genome sequencing and anticipates open questions in terms of methodology, reliability, and computation. Appropriate taxon sampling, together with a combination of methods based on similarity, phylogeny, synteny, and evolutionary knowledge that may help detect speciation events, appears to be the most accurate strategy. This review also raises perspectives on the potential determination of orthology throughout the whole species phylogeny.
Keywords: HGT; evolutionary processes; genome annotation quality; genome trees; multidomains; phylogeny; synteny; taxon sampling.
Figures
an intraspecies duplication of gene g giving rise to two genes g1 and g2 (note that g is no longer visible in species S);
a speciation event giving rise to two species A and B with contents identical to those of S; in particular, g1 and g2 are denoted as g1a and g2a in A and as g1b and g2b in B;
a duplication of g2b in B, assumed here, giving rise to g2b1 and g2b2 (note that g2b is no longer visible in B).
– g1 and g2 are homologs because they descend from g. Similarly, g1a and g1b are homologs because they descend from g1;
– g1 and g2 are in-paralogs because they arose by duplication in S;
– similarly, g2b1 and g2b2 are in-paralogs because they arose by duplication in B;
– g1a and g2a are out-paralogs because their ancestors were duplicated in S;
– similarly, g1b and each of g2b1 and g2b2 are out-paralogs because their ancestors were duplicated in S;
– g1a and g1b are orthologs because they are in distinct species A and B, respectively, with a common ancestor g1;
– g2a and g2b1, as well as g2a and g2b2, are orthologs because they are in distinct species A and B, respectively, with the same ancestor g2. g2b1 and g2b2 are also called co-orthologs of g2a.
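The relationships listed above can be reproduced with a small script. The following is a minimal sketch, not a method taken from the article: it encodes the scenario as a gene tree and classifies a pair of genes by the event at their last common ancestor. The table `GENES` and the functions `lineage`, `last_common_ancestor`, and `classify` are hypothetical names introduced for illustration. The in-/out-paralog rule is an assumption simplified to fit this single-speciation scenario: a pair is called in-paralogous when both genes sit in the species where the duplication occurred, and out-paralogous otherwise.

```python
# Each gene maps to (parent gene, species, event at which this gene splits).
# The event is "duplication", "speciation", or None for genes that do not split.
# This table is an illustrative encoding of the scenario in the figure.
GENES = {
    "g":    (None,  "S", "duplication"),
    "g1":   ("g",   "S", "speciation"),
    "g2":   ("g",   "S", "speciation"),
    "g1a":  ("g1",  "A", None),
    "g1b":  ("g1",  "B", None),
    "g2a":  ("g2",  "A", None),
    "g2b":  ("g2",  "B", "duplication"),
    "g2b1": ("g2b", "B", None),
    "g2b2": ("g2b", "B", None),
}


def lineage(gene):
    """Return the gene followed by its ancestors, up to the root."""
    path = []
    while gene is not None:
        path.append(gene)
        gene = GENES[gene][0]
    return path


def last_common_ancestor(x, y):
    """Return the most recent gene from which both x and y descend."""
    ancestors_of_x = set(lineage(x))
    for gene in lineage(y):
        if gene in ancestors_of_x:
            return gene
    raise ValueError(f"{x} and {y} share no ancestor")


def classify(x, y):
    """Classify two homologous genes by the event at their last common ancestor."""
    lca = last_common_ancestor(x, y)
    if lca in (x, y):
        raise ValueError("one gene is an ancestor of (or identical to) the other")
    _, lca_species, event = GENES[lca]
    if event == "speciation":
        return "orthologs"
    # Duplication: simplified rule (assumption) based on where it occurred.
    if GENES[x][1] == GENES[y][1] == lca_species:
        return "in-paralogs"
    return "out-paralogs"


if __name__ == "__main__":
    for pair in [("g1", "g2"), ("g2b1", "g2b2"), ("g1a", "g2a"),
                 ("g1b", "g2b1"), ("g1a", "g1b"), ("g2a", "g2b2")]:
        print(pair, classify(*pair))
```

In a general analysis, in- and out-paralogy are defined relative to a chosen reference speciation event, so the species-based shortcut used here would need to be replaced by an explicit comparison of event order along the species tree.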
