Design of a genome-wide siRNA library using an artificial neural network

Nat Biotechnol. 2005 Aug;23(8):995-1001. doi: 10.1038/nbt1118. Epub 2005 Jul 17.


The largest gene knock-down experiments performed to date have used multiple short interfering/short hairpin (si/sh)RNAs per gene. To reduce this burden when designing a genome-wide siRNA library, we used the Stuttgart Neural Net Simulator to train algorithms on a data set of 2,182 randomly selected siRNAs targeted to 34 mRNA species, assayed through a high-throughput fluorescent reporter gene system. The resulting algorithm, BIOPREDsi, reliably predicted the activity of 249 siRNAs in an independent test set (Pearson coefficient r = 0.66) and of siRNAs targeting endogenous genes at the mRNA and protein levels. Neural networks trained on a complementary 21-nucleotide (nt) guide sequence were superior to those trained on a 19-nt sequence. BIOPREDsi was used in the design of a genome-wide siRNA collection with two potent siRNAs per gene. When this collection of 50,000 siRNAs was used to identify genes involved in the cellular response to hypoxia, two of the most potent hits were the key hypoxia transcription factors HIF1A and ARNT.
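
The two quantitative ideas in the abstract, feeding a 21-nt guide sequence to a neural network and scoring predictions by Pearson correlation, can be illustrated with a minimal sketch. The abstract does not state how sequences were encoded or how the network was structured, so the one-hot encoding below is an assumption (a common choice for nucleotide input), and the names `one_hot_encode` and `pearson_r` are hypothetical helpers, not part of BIOPREDsi or the Stuttgart Neural Net Simulator.

```python
import math

# Assumed alphabet for the guide strand; DNA-style "T" is mapped to "U".
NUCLEOTIDES = "ACGU"


def one_hot_encode(guide, length=21):
    """Encode a guide sequence as a flat 4 * length vector of 0.0/1.0.

    Assumption: one-hot input. The abstract only says that networks were
    trained on 21-nt (or 19-nt) guide sequences, not how they were encoded.
    """
    guide = guide.upper().replace("T", "U")
    if len(guide) != length:
        raise ValueError(f"expected {length} nt, got {len(guide)}")
    vector = []
    for base in guide:
        if base not in NUCLEOTIDES:
            raise ValueError(f"unexpected base: {base!r}")
        # One position per nucleotide; exactly one of the four is set.
        vector.extend(1.0 if base == n else 0.0 for n in NUCLEOTIDES)
    return vector


def pearson_r(predicted, measured):
    """Pearson correlation coefficient between two equal-length lists."""
    if len(predicted) != len(measured) or len(predicted) < 2:
        raise ValueError("need two lists of equal length >= 2")
    mean_p = sum(predicted) / len(predicted)
    mean_m = sum(measured) / len(measured)
    cov = sum((p - mean_p) * (m - mean_m) for p, m in zip(predicted, measured))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_m = sum((m - mean_m) ** 2 for m in measured)
    if var_p == 0 or var_m == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_p * var_m)
```

Under these assumptions a 21-nt guide becomes an 84-element input vector and a 19-nt guide a 76-element one (`length=19`). `pearson_r` would then be applied to the predicted and measured activities of a held-out test set, which is the kind of comparison behind the reported r = 0.66.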

Publication types

  • Evaluation Study

MeSH terms

  • Algorithms*
  • Base Sequence
  • Computer Simulation
  • Computer-Aided Design
  • Gene Library
  • Gene Silencing*
  • Models, Genetic*
  • Models, Statistical
  • Molecular Sequence Data
  • Nerve Net*
  • RNA, Small Interfering / chemistry*
  • RNA, Small Interfering / genetics*
  • Sequence Alignment / methods*
  • Sequence Analysis, RNA / methods*


Substances

  • RNA, Small Interfering