Abstract
We present a statistical method that swiftly identifies, from the literature, sets of genes known to be associated with a given disease. It offers a comprehensive treatment of alias symbols, a statistical measure of each gene's relevance to the query, and a novel way to disambiguate gene symbols from other abbreviations. The method is illustrated by finding genes related to breast cancer.
MeSH terms
- Abstracting and Indexing
- Animals
- Database Management Systems
- Databases, Genetic*
- Gene Expression Profiling / methods*
- Genetic Markers / genetics*
- Genetic Predisposition to Disease / classification
- Genetic Predisposition to Disease / genetics*
- Genetic Testing / methods
- Humans
- Information Storage and Retrieval / methods*
- Natural Language Processing*
- Periodicals as Topic*
- Vocabulary, Controlled