Background: Despite considerable efforts within the microarray community to standardise data format, content and description, microarray technologies present major challenges in managing, sharing, analysing and re-using the large amounts of data generated locally or internationally. Additionally, it is recognised that inconsistent and low-quality experimental annotation in public data repositories significantly compromises the re-use of microarray data for meta-analysis. MiMiR, the Microarray data Mining Resource, was designed to tackle some of these limitations and challenges. Here we present new software components and enhancements to the original infrastructure that increase accessibility, utility and opportunities for large-scale mining of experimental and clinical data.
Results: A user-friendly Online Annotation Tool allows researchers to submit detailed experimental information via the web at the time of data generation rather than at the time of publication. This ensures easy access to, and high accuracy of, the meta-data collected. Experiments are programmatically built in the MiMiR database from the submitted information, and details are systematically curated and further annotated by a team of trained annotators using a new Curation and Annotation Tool. Clinical information can be annotated and coded with a clinical Data Mapping Tool within an appropriate ethical framework. Users can visualise experimental annotation, assess data quality, and download and share data via a web-based experiment browser called MiMiR Online. All requests to access data in MiMiR are routed through a sophisticated middleware security layer, thereby allowing secure data access and sharing amongst MiMiR registered users prior to publication. Data in MiMiR can be mined and analysed using the integrated EMAAS open-source analysis web portal or via export of data and meta-data into the Rosetta Resolver data analysis package.
Conclusion: The new MiMiR suite of software enables systematic and effective capture of extensive experimental and clinical information with the highest MIAME score, and secure data sharing prior to publication. MiMiR currently contains more than 150 experiments corresponding to over 3000 hybridisations and supports the Microarray Centre's large microarray user community and two international consortia. MiMiR's flexible and scalable hardware and software architecture enables secure warehousing of thousands of datasets, including clinical studies, from microarray and potentially other -omics technologies.