Background: The variety of DNA microarray formats and datasets presently available offers an unprecedented opportunity to perform insightful comparisons of heterogeneous data. Cross-species studies, in particular, have the power of identifying conserved, functionally important molecular processes. Validation of discoveries can now often be performed in readily available public data which frequently requires cross-platform studies.Cross-platform and cross-species analyses require matching probes on different microarray formats. This can be achieved using the information in microarray annotations and additional molecular biology databases, such as orthology databases. Although annotations and other biological information are stored using modern database models (e.g. relational), they are very often distributed and shared as tables in text files, i.e. flat file databases. This common flat database format thus provides a simple and robust solution to flexibly integrate various sources of information and a basis for the combined analysis of heterogeneous gene expression profiles.
Results: We provide annotationTools, a Bioconductor-compliant R package to annotate microarray experiments and integrate heterogeneous gene expression profiles using annotation and other molecular biology information available as flat file databases. First, annotationTools contains a specialized set of functions for mining this widely used database format in a systematic manner. It thus offers a straightforward solution for annotating microarray experiments. Second, building on these basic functions and relying on the combination of information from several databases, it provides tools to easily perform cross-species analyses of gene expression data.Here, we present two example applications of annotationTools that are of direct relevance for the analysis of heterogeneous gene expression profiles, namely a cross-platform mapping of probes and a cross-species mapping of orthologous probes using different orthology databases. We also show how to perform an explorative comparison of disease-related transcriptional changes in human patients and in a genetic mouse model.
Conclusion: The R package annotationTools provides a simple solution to handle microarray annotation and orthology tables, as well as other flat molecular biology databases. Thereby, it allows easy integration and analysis of heterogeneous microarray experiments across different technological platforms or species.