The Z curve database: a graphic representation of genome sequences

Bioinformatics. 2003 Mar 22;19(5):593-9. doi: 10.1093/bioinformatics/btg041.

Abstract

Motivation: Genome projects for many prokaryotic and eukaryotic species have been completed, and more are under way. The availability of a large number of genomic sequences creates a need for graphic tools that present genomes in a perceivable form. The Z curve is one such tool for visualizing genomes. It is a three-dimensional curve that uniquely represents a given DNA sequence, in the sense that each can be uniquely reconstructed from the other. A Z curve database covering more than 1000 genomes has been established here.
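The one-to-one correspondence between a sequence and its Z curve can be illustrated with a minimal sketch. The abstract does not state the formulas, so the code assumes the standard Z curve definition: x_n counts purines minus pyrimidines, y_n counts amino minus keto bases, and z_n counts weak (A, T) minus strong (G, C) bases among the first n bases. The function names are hypothetical and are not taken from the database.

```python
# Assumed standard Z curve definition: each base adds one unit step (dx, dy, dz).
#   x: purine (A, G) = +1, pyrimidine (C, T) = -1
#   y: amino  (A, C) = +1, keto       (G, T) = -1
#   z: weak   (A, T) = +1, strong     (G, C) = -1
STEPS = {
    "A": (1, 1, 1),
    "C": (-1, 1, -1),
    "G": (1, -1, -1),
    "T": (-1, -1, 1),
}
STEP_TO_BASE = {step: base for base, step in STEPS.items()}


def z_curve(seq):
    """Return the points (x_n, y_n, z_n) for n = 1..len(seq)."""
    points = []
    x = y = z = 0
    for base in seq.upper():
        dx, dy, dz = STEPS[base]  # raises KeyError on non-ACGT symbols
        x, y, z = x + dx, y + dy, z + dz
        points.append((x, y, z))
    return points


def sequence_from_z_curve(points):
    """Rebuild the sequence from consecutive differences of the curve."""
    bases = []
    prev = (0, 0, 0)
    for point in points:
        step = tuple(c - p for c, p in zip(point, prev))
        bases.append(STEP_TO_BASE[step])
        prev = point
    return "".join(bases)
```

Because the four bases map to four distinct steps, differencing the curve recovers the sequence exactly, which is the sense in which the representation is unique. Ambiguity codes such as N are not handled in this sketch.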

Results: The database contains the Z curves for archaea, bacteria, eukaryota, organelles, phages, plasmids, viroids and viruses whose genomic sequences are currently available. All the three-dimensional Z curves and their three component curves are stored in the database. Applications of the Z curve database to comparative genomics, gene prediction, computation of G+C content with a windowless technique, prediction of replication origins and termini of bacterial and archaeal genomes, and the study of local deviations from Chargaff Parity Rule 2 are presented in detail. The Z curve database reported here is a resource from which biologists can extract useful biological knowledge.
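One of the listed applications, computing G+C content without a sliding window, can be sketched from the z component alone. The abstract gives no procedure, so this is an illustration under the assumed standard definition (z rises by 1 for each A or T and falls by 1 for each G or C), not necessarily the authors' exact method. Under that definition the slope of z over any region equals 1 minus twice the G+C fraction. The function names are hypothetical.

```python
def z_component(seq):
    """Cumulative z_n: +1 for A or T, -1 for G or C (assumed definition)."""
    z, total = [], 0
    for base in seq.upper():
        total += 1 if base in "AT" else -1  # sketch: assumes only A, C, G, T
        z.append(total)
    return z


def gc_content_from_z(z, start, end):
    """G+C fraction of bases start+1..end, with 0 <= start < end <= len(z).

    Uses only two values of the z component, so no window has to be slid
    along the sequence: slope = dz / dn = 1 - 2 * GC.
    """
    z_start = z[start - 1] if start > 0 else 0
    dz = z[end - 1] - z_start
    return 0.5 * (1 - dz / (end - start))


z = z_component("GGCCAATT")
gc_left = gc_content_from_z(z, 0, 4)   # region GGCC
gc_right = gc_content_from_z(z, 4, 8)  # region AATT
gc_all = gc_content_from_z(z, 0, 8)    # whole sequence
```

Any region's G+C fraction follows from the curve's endpoints, so regional composition can be read off the plotted z component as a change in slope.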

Publication types

  • Evaluation Study
  • Research Support, Non-U.S. Gov't
  • Validation Study

MeSH terms

  • Animals
  • Base Sequence
  • Computer Graphics
  • DNA / chemistry
  • DNA / classification
  • DNA / genetics*
  • Database Management Systems*
  • Databases, Nucleic Acid*
  • Genome*
  • Humans
  • Information Storage and Retrieval / methods
  • Molecular Sequence Data
  • Sequence Alignment / methods*
  • Sequence Analysis, DNA / methods*
  • Software
  • User-Computer Interface*

Substances

  • DNA