A DNA assembly model of sentence generation

Biosystems. 2011 Oct;106(1):51-6. doi: 10.1016/j.biosystems.2011.06.007. Epub 2011 Jun 25.


Recent results in corpus-based linguistics demonstrate that context-appropriate sentences can be generated by a stochastic constraint satisfaction process. Exploiting the similarity between constraint satisfaction and DNA self-assembly, we explore a DNA assembly model of sentence generation. The words and phrases in a language corpus are encoded as DNA molecules to build a language model of the corpus. Given a seed word, new sentences are constructed by a parallel DNA assembly process based on the probability distribution of the word and phrase molecules. Here, we present our DNA code word design and report a successful demonstration of its feasibility in small-scale wet-lab DNA experiments.
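
The generation process described in the abstract can be loosely illustrated in software. The sketch below is an in-silico analogy only, not the authors' DNA encoding or wet-lab protocol. It assumes that each phrase "molecule" is a two-word phrase from the corpus, that its corpus count stands in for its concentration, and that a phrase can attach to a growing sentence when its first word matches the sentence's last word. The names build_pool and assemble are hypothetical.

```python
import random
from collections import Counter


def build_pool(corpus, n=2):
    """Count every n-word phrase in the corpus.

    Assumption: a count plays the role of a molecule's concentration.
    """
    pool = Counter()
    for sentence in corpus:
        words = sentence.split()
        for i in range(len(words) - n + 1):
            pool[tuple(words[i:i + n])] += 1
    return pool


def assemble(pool, seed, max_len=10, rng=None):
    """Grow a sentence from a seed word by weighted random phrase attachment."""
    rng = rng or random.Random()
    sentence = [seed]
    while len(sentence) < max_len:
        # A phrase can attach only if its first word matches the current end,
        # a stand-in for complementary sticky ends in the assembly.
        candidates = [(p, c) for p, c in pool.items() if p[0] == sentence[-1]]
        if not candidates:
            break  # no compatible phrase: assembly stops
        phrases, counts = zip(*candidates)
        # More abundant phrases are proportionally more likely to attach.
        chosen = rng.choices(phrases, weights=counts, k=1)[0]
        sentence.extend(chosen[1:])
    return sentence[:max_len]


if __name__ == "__main__":
    corpus = ["the cat sat", "the cat ran", "a dog sat"]
    pool = build_pool(corpus)
    print(" ".join(assemble(pool, "the")))
```

Unlike this sequential loop, the DNA process described above assembles many candidate sentences in parallel; the sketch draws only one sample per call.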

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Base Sequence
  • DNA / chemistry*
  • Models, Molecular*
  • Molecular Sequence Data

Substances

  • DNA