Genetic disorders are often caused by nonsynonymous nucleotide changes in one or more genes associated with the disease. Specific amino acid changes, however, can lead to large variability in phenotypic expression. For many genetic disorders this results in an increasing number of publications describing phenotype-associated mutations in disorder-related genes. Keeping up with this stream of publications is essential for molecular diagnostics and translational research but is often impossible due to time constraints: there are simply too many articles to read. To help solve this problem, we have created Mutator, an automated method to extract mutations from full-text articles. Extracted mutations are cross-referenced to sequence data, and a scoring method is applied to identify false positives. To analyze stored and new mutation data for their potential effect, we have developed Validator, a Web-based tool specifically designed for DNA diagnostics. Fabry disease, a monogenic disorder of the GLA gene, was used as a test case. A structure-based sequence alignment of the alpha-amylase superfamily was used to validate results. We have compared our data with existing Fabry mutation data sets obtained from the HGMD and Swiss-Prot databases. Compared to these data sets, Mutator extracted 30% additional mutations from the literature.
Copyright 2010 Wiley-Liss, Inc.
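
The general idea of extracting mutation mentions from text and cross-referencing them to sequence data can be illustrated with a minimal sketch. This is not the Mutator implementation, which the abstract does not describe in detail. The regular expressions, the function names `extract_mutations` and `matches_reference`, and the ten-residue toy sequence are all assumptions made for illustration; the toy sequence is not the GLA protein sequence. The check shown here only asks whether the wild-type residue named in a mention is actually present at that position, which is one plausible signal for flagging false positives.

```python
import re

# Three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}

AA1 = "ACDEFGHIKLMNPQRSTVWY"
AA3 = "|".join(THREE_TO_ONE)

# Hypothetical patterns for point mutations such as "R112H" or "Arg112His".
ONE_LETTER = re.compile(rf"\b([{AA1}])(\d+)([{AA1}])\b")
THREE_LETTER = re.compile(rf"\b({AA3})(\d+)({AA3})\b")


def extract_mutations(text):
    """Return (wild_type, position, mutant) tuples found in text, sorted by position."""
    found = set()
    for wt, pos, mut in ONE_LETTER.findall(text):
        found.add((wt, int(pos), mut))
    for wt, pos, mut in THREE_LETTER.findall(text):
        found.add((THREE_TO_ONE[wt], int(pos), THREE_TO_ONE[mut]))
    return sorted(found, key=lambda m: (m[1], m[0], m[2]))


def matches_reference(mutation, sequence):
    """True if the wild-type residue occurs at the stated 1-based position."""
    wt, pos, _ = mutation
    return 1 <= pos <= len(sequence) and sequence[pos - 1] == wt


# Toy reference sequence (assumption, NOT the real GLA sequence).
TOY_SEQUENCE = "MKTAYIAKQR"

text = "We found K2E and Ala4Val in patients; the reported G7D could not be confirmed."
mutations = extract_mutations(text)
flags = {m: matches_reference(m, TOY_SEQUENCE) for m in mutations}
```

In this sketch a mention whose wild-type residue disagrees with the reference sequence is flagged rather than discarded, since such a mismatch may reflect a numbering offset (for example, a signal peptide) rather than an extraction error.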