A draft sequence of the rice genome (Oryza sativa L. ssp. indica)

Science. 2002 Apr 5;296(5565):79-92. doi: 10.1126/science.1068037.


We have produced a draft sequence of the rice genome for the most widely cultivated subspecies in China, Oryza sativa L. ssp. indica, by whole-genome shotgun sequencing. The genome was 466 megabases in size, with an estimated 46,022 to 55,615 genes. Functional coverage in the assembled sequences was 92.0%. About 42.2% of the genome was in exact 20-nucleotide oligomer repeats, and most of the transposons were in the intergenic regions between genes. Although 80.6% of predicted Arabidopsis thaliana genes had a homolog in rice, only 49.4% of predicted rice genes had a homolog in A. thaliana. The large proportion of rice genes with no recognizable homologs is due to a gradient in the GC content of rice coding sequences.
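Two of the quantities quoted above, the share of sequence lying in exact 20-nucleotide repeats and the GC content along a coding sequence, can be illustrated with a minimal Python sketch. This is not the authors' pipeline. The function names, the rule that a 20-mer counts as repeated when it occurs at least twice on the forward strand, and the window size are all assumptions made for illustration.

```python
from collections import Counter


def repeat_fraction(seq, k=20):
    """Fraction of bases covered by a k-mer that occurs at least twice.

    Assumption: only the forward strand is counted, and "repeat" means
    an exact k-mer seen two or more times in the input sequence.
    """
    seq = seq.upper()
    n = len(seq)
    if n < k:
        return 0.0
    counts = Counter(seq[i:i + k] for i in range(n - k + 1))
    covered = 0
    end = 0  # first position not yet counted as covered
    for i in range(n - k + 1):
        if counts[seq[i:i + k]] >= 2:
            start = max(i, end)  # avoid double-counting overlapping k-mers
            covered += i + k - start
            end = i + k
    return covered / n


def gc_by_window(cds, window=60):
    """GC fraction in consecutive non-overlapping windows, 5' to 3'.

    Assumption: the window size is arbitrary; a trailing partial window
    is dropped.
    """
    cds = cds.upper()
    values = []
    for start in range(0, len(cds) - window + 1, window):
        chunk = cds[start:start + window]
        values.append((chunk.count("G") + chunk.count("C")) / window)
    return values


if __name__ == "__main__":
    # Toy inputs only; real genome-scale data would need a more
    # memory-efficient k-mer counter.
    print(repeat_fraction("ACGTTTACGT", k=4))   # 0.8
    print(gc_by_window("GCGCATGCAT", window=4))  # [1.0, 0.5]
```

A falling series of values from `gc_by_window` along a coding sequence would correspond to the kind of GC gradient the abstract refers to.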

Publication types

  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, Non-P.H.S.
  • Research Support, U.S. Gov't, P.H.S.

MeSH terms

  • Arabidopsis / genetics
  • Base Composition
  • Computational Biology
  • Contig Mapping
  • DNA Transposable Elements
  • DNA, Intergenic
  • DNA, Plant / chemistry
  • DNA, Plant / genetics
  • Databases, Nucleic Acid
  • Exons
  • Gene Duplication
  • Genes, Plant
  • Genome, Plant*
  • Genomics
  • Introns
  • Molecular Sequence Data
  • Oryza / genetics*
  • Plant Proteins / chemistry
  • Plant Proteins / genetics
  • Polymorphism, Genetic
  • Repetitive Sequences, Nucleic Acid
  • Sequence Analysis, DNA*
  • Sequence Homology, Nucleic Acid
  • Software
  • Species Specificity
  • Synteny

Substances

  • DNA Transposable Elements
  • DNA, Intergenic
  • DNA, Plant
  • Plant Proteins