Improvement of base-calling in multilane automated DNA sequencing by use of electrophoretic calibration standards, data linearization, and trace alignment

Electrophoresis. 2001 Jun;22(10):1906-14. doi: 10.1002/1522-2683(200106)22:10<1906::AID-ELPS1906>3.0.CO;2-5.

Abstract

We present a new method for the linearization and alignment of data traces generated by multilane automated DNA sequencing instruments. Applied to data generated with the Visible Genetics Open Gene DNA sequencing system (using MicroCel 700 gel cassettes, with a 25 cm separation distance), the method routinely yields read lengths of >1,000 nucleotides with high confidence and >97% accuracy. This represents an increase of 10-15% in average read length relative to data from the same system that have not been processed in this way. Most importantly, the linearization and alignment method allows usable sequence to be obtained from the fraction (10-15%) of data sets that would otherwise have to be discarded because of misalignment of the original traces. Our method involves adding electrophoretic calibration standards to the DNA sequencing fragments. The calibration standards are identical in all lanes and are labeled with a dye that differs spectrally from the dye attached to the sequencing fragments. Analysis of the mobilities of the calibration standards allows correction for both systematic and random variation of electrophoretic properties between gel lanes. We have successfully used this method with two-dye and three-dye DNA sequencing instruments.
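
The abstract does not give the algorithm itself, so the following is only a minimal sketch of the general idea of aligning one lane's trace to a common axis using calibration-standard peaks. It assumes piecewise-linear interpolation between standards, which may differ from the authors' procedure. The function name `align_trace`, its parameters, and the synthetic data are all hypothetical and chosen for illustration.

```python
import numpy as np


def align_trace(trace, standard_peaks, reference_peaks):
    """Warp one lane's trace onto a reference axis (illustrative sketch).

    trace           -- signal samples from the sequencing-dye channel of one lane
    standard_peaks  -- sample positions where the calibration standards were
                       detected in this lane (increasing order)
    reference_peaks -- positions the same standards should occupy on the
                       common axis (increasing order)

    Assumption: mobility between neighboring standards is treated as linear.
    """
    trace = np.asarray(trace, dtype=float)
    standard_peaks = np.asarray(standard_peaks, dtype=float)
    reference_peaks = np.asarray(reference_peaks, dtype=float)
    if standard_peaks.shape != reference_peaks.shape:
        raise ValueError("each standard needs exactly one reference position")

    samples = np.arange(len(trace), dtype=float)
    # For every point on the reference axis, find where it sits in this lane.
    # np.interp clamps outside the outermost standards, so the standards
    # should bracket the region of interest.
    lane_positions = np.interp(samples, reference_peaks, standard_peaks)
    # Resample the lane's signal at those positions.
    return np.interp(lane_positions, samples, trace)


# Synthetic lane that runs 1.5x slower than the reference lane.
reference_peaks = [100, 200, 300]
standard_peaks = [150, 300, 450]
x = np.arange(500)
# A single sequencing peak observed at sample 225 in this lane.
trace = np.exp(-0.5 * ((x - 225) / 3.0) ** 2)

aligned = align_trace(trace, standard_peaks, reference_peaks)
print(int(np.argmax(aligned)))  # 150: the peak moves to its reference position
```

Because every lane carries the same standards, applying the same mapping to each lane places all traces on one shared axis, after which peaks from different lanes can be compared directly.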

MeSH terms

  • Algorithms
  • Electrophoresis, Polyacrylamide Gel / methods*
  • Electrophoresis, Polyacrylamide Gel / standards
  • Fluorescent Dyes
  • Reference Standards
  • Sequence Alignment / methods
  • Sequence Alignment / statistics & numerical data
  • Sequence Analysis, DNA / methods*
  • Sequence Analysis, DNA / standards
  • Sequence Analysis, DNA / statistics & numerical data

Substances

  • Fluorescent Dyes