A generalized topological entropy for analyzing the complexity of DNA sequences

PLoS One. 2014 Feb 12;9(2):e88519. doi: 10.1371/journal.pone.0088519. eCollection 2014.

Abstract

Topological entropy is one of the most difficult entropies to apply to DNA sequences because of finite-sample and high-dimensionality problems. To overcome these problems, a generalized topological entropy is introduced. A comparison of the two shows that the topological entropy is a special case of the generalized topological entropy. As an application, the generalized topological entropy was computed for introns, exons, and promoter regions. The results indicate that, for each chromosome, the entropy of introns is higher than that of exons, and the entropy of exons is higher than that of promoter regions, which suggests that the DNA sequences of promoter regions are more regular than those of exons and introns.
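
The abstract does not state the formulas, so the following is only a minimal sketch of the base (non-generalized) topological entropy, assuming the finite-sequence definition commonly attributed to Koslicki (2011): choose the largest subword length n such that 4^n + n - 1 does not exceed the sequence length, count the distinct n-letter subwords in the first 4^n + n - 1 bases, and divide the base-4 logarithm of that count by n. The function name and the `alphabet_size` parameter are illustrative assumptions, and the generalized entropy introduced in the paper is not implemented here.

```python
import math


def topological_entropy(seq, alphabet_size=4):
    """Sketch of a finite-sequence topological entropy (assumed definition).

    Assumption: the definition follows Koslicki (2011); it is not taken
    from the abstract above, which gives no formula.
    """
    length = len(seq)

    # Largest n such that alphabet_size**n + n - 1 <= len(seq).
    n = 0
    while alphabet_size ** (n + 1) + (n + 1) - 1 <= length:
        n += 1
    if n == 0:
        raise ValueError("sequence is too short for this definition")

    # Use only the first alphabet_size**n + n - 1 symbols, so that exactly
    # alphabet_size**n overlapping subwords of length n are examined.
    window = seq[: alphabet_size ** n + n - 1]
    distinct = {window[i:i + n] for i in range(len(window) - n + 1)}

    # Normalised so that the value lies between 0 and 1.
    return math.log(len(distinct), alphabet_size) / n
```

Under this assumed definition, a sequence whose leading window contains every possible n-letter subword scores 1, while a sequence made of a single repeated base scores 0, which matches the abstract's reading of lower entropy as a more regular sequence.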

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Chromosome Mapping
  • Computational Biology / methods*
  • DNA / chemistry
  • Entropy
  • Exons
  • Humans
  • Introns
  • Models, Theoretical
  • Promoter Regions, Genetic
  • Sequence Analysis, DNA / methods*
  • Software

Substances

  • DNA

Grants and funding

This work is supported by the China Natural Science Foundation (Grant No. 11301110, No. 61201084, No. 61102149 and No. 61173085), China Postdoctoral Science Foundation (Grant No. 2013M541346), the State Scholarship Fund from the China Scholarship Council (CSC) (Grant No. 201203070290 and No. 201206125024), the Program for Innovation Research of Science in Harbin Institute of Technology and the Fundamental Research Funds for the Central Universities (Grant No. HEUCF100604). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.