Using caching and optimization techniques to improve performance of the Ensembl website

BMC Bioinformatics. 2010 May 11:11:239. doi: 10.1186/1471-2105-11-239.

Abstract

Background: The Ensembl web site has provided access to genomic information for almost 10 years. During this time the amount of data available through Ensembl has grown dramatically. At the same time, the World Wide Web itself has become a dramatically more important component of the scientific workflow and the way that scientists share and access data and scientific information. Since 2000, the Ensembl web interface has had three major updates and numerous smaller updates. These have largely been in response to expanding data types and valuable representations of existing data types. In 2007 it was realised that a radical new approach would be required in order to serve the project's future requirements, and development therefore focused on identifying suitable web technologies for implementation in the 2008 site redesign.

Results: By comparing the Ensembl website to well-known "Web 2.0" sites, we were able to identify two main areas in which cutting-edge technologies could be advantageously deployed: server efficiency and interface latency. We then evaluated the performance of the existing site using browser-based tools and Apache benchmarking, and selected appropriate technologies to overcome any issues found. Solutions included optimization of the Apache web server, introduction of caching technologies and widespread implementation of AJAX code. These improvements were successfully deployed on the Ensembl website in late 2008 and early 2009.

Conclusions: Web 2.0 technologies provide a flexible and efficient way to access the terabytes of data now available from Ensembl, enhancing the user experience through improved website responsiveness and a rich, interactive interface.

MeSH terms

  • Computational Biology / methods*
  • Databases, Factual
  • Genome
  • Internet*
  • Software
  • User-Computer Interface