Ensembl Genomes 2018: An Integrated Omics Infrastructure for Non-Vertebrate Species

Nucleic Acids Res. 2018 Jan 4;46(D1):D802-D808. doi: 10.1093/nar/gkx1011.

Abstract

Ensembl Genomes (http://www.ensemblgenomes.org) is an integrating resource for genome-scale data from non-vertebrate species, complementing the resources for vertebrate genomics developed in the Ensembl project (http://www.ensembl.org). Together, the two resources provide a consistent set of programmatic and interactive interfaces to a rich range of data including genome sequence, gene models, transcript sequence, genetic variation, and comparative analysis. This paper provides an update to the previous publications about the resource, with a focus on recent developments and expansions. These include the incorporation of almost 20 000 additional genome sequences and over 35 000 tracks of RNA-Seq data, which have been aligned to genomic sequence and made available for visualization. Other advances since 2015 include the release of the database in Resource Description Framework (RDF) format, a large increase in community-derived curation, a new high-performance protein sequence search, additional cross-references, improved annotation of non-protein-coding genes, and the launch of pre-release and archival sites. Collectively, these changes are part of a continuing response to the increasing quantity of publicly available genome-scale data, and the consequent need to archive, integrate, annotate and disseminate these data using automated, scalable methods.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Amino Acid Sequence
  • Animals
  • Archaea / genetics*
  • Bacteria / genetics*
  • Base Sequence
  • Data Mining
  • Databases, Genetic*
  • Databases, Protein*
  • Eukaryota / genetics*
  • Forecasting
  • Genome
  • Genomics*
  • Molecular Sequence Annotation
  • RNA / genetics
  • User-Computer Interface

Substances

  • RNA