Calculating site-specific evolutionary rates at the amino-acid or codon level yields similar rate estimates

PeerJ. 2017 May 30:5:e3391. doi: 10.7717/peerj.3391. eCollection 2017.


Site-specific evolutionary rates can be estimated from codon sequences or from amino-acid sequences. For codon sequences, the most popular methods use some variation of the dN/dS ratio. For amino-acid sequences, one widely used method is Rate4Site, which assigns a relative conservation score to each site in an alignment. How site-wise dN/dS values relate to Rate4Site scores is not known. Here we elucidate the relationship between these two rate measurements. We simulate sequences with known dN/dS, using either dN/dS models or mutation-selection models for simulation. We then infer Rate4Site scores on the simulated alignments, and we compare those scores to either true or inferred dN/dS values on the same alignments. We find that Rate4Site scores generally correlate well with true dN/dS, and the correlations strengthen in alignments with greater sequence divergence and more taxa. Moreover, Rate4Site scores correlate very well with inferred (as opposed to true) dN/dS values, even for small alignments with little divergence. Finally, we verify this relationship between Rate4Site and dN/dS in a variety of empirical datasets. We conclude that codon-level and amino-acid-level analysis frameworks are directly comparable and yield very similar inferences.

Keywords: Amino-acid evolution models; Codon evolution models; Evolutionary rate; Rate variation.
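
The site-wise comparison described in the abstract can be illustrated with a minimal sketch. The abstract does not state which correlation coefficient was used, so Spearman's rank correlation here is an assumption, and the per-site values are invented for illustration only; they are not results from the study. The function names (`ranks`, `pearson`, `spearman`) are hypothetical helpers, not part of Rate4Site or any dN/dS inference tool.

```python
from statistics import mean


def ranks(values):
    """Return 1-based ranks, assigning tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j over the run of values tied with the one at position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical per-site values for a six-site alignment (illustration only).
site_dnds = [0.05, 0.10, 0.30, 0.45, 0.80, 1.20]      # assumed dN/dS per site
site_r4s = [-1.1, -0.9, -0.2, 0.1, 0.7, 1.4]          # assumed Rate4Site scores

rho = spearman(site_dnds, site_r4s)
print(f"Spearman rho between site-wise dN/dS and Rate4Site: {rho:.3f}")
```

A rank-based coefficient is used in this sketch because the two measures are on different scales, so only the ordering of sites from slowly to rapidly evolving needs to agree for the correlation to be high.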