Avian Immunome DB: an example of a user-friendly interface for extracting genetic information

BMC Bioinformatics. 2020 Nov 12;21(1):502. doi: 10.1186/s12859-020-03764-3.

Abstract

Background: Genomic and genetic studies often require a target list of genes before conducting any hypothesis testing or experimental verification. With the ever-growing number of sequenced genomes and a variety of different annotation strategies comes the potential for ambiguous gene symbols, making it cumbersome to capture the "correct" set of genes. In this article, we present and describe the Avian Immunome DB (AVIMM) for easy gene property extraction, as exemplified by avian immune genes. The avian immune system is characterised by a cascade of complex biological processes underlain by more than 1000 different genes. It is a vital trait to study, particularly in birds, considering that they are a significant driver in spreading zoonotic diseases. With the completion of phase II of the B10K ("Bird 10,000 Genomes") consortium's whole-genome sequencing effort, we have included 363 annotated bird genomes, in addition to other publicly available bird genome data, which serve as a valuable foundation for AVIMM.

Construction and content: A relational database with avian immune gene evidence from Gene Ontology, Ensembl, UniProt and the B10K consortium has been designed and set up. The foundation stone or the "seed" for the initial set of avian immune genes is based on the well-studied model organism chicken (Gallus gallus). Gene annotations, different transcript isoforms, nucleotide sequences and protein information, including amino acid sequences, are included. Ambiguous gene names (symbols) are resolved within the database and linked to their canonical gene symbol. AVIMM is supplemented by a command-line interface and a web front-end to query the database.

Utility and discussion: The internal mapping of unique gene symbol identifiers to canonical gene symbols allows gene properties to be searched even when the query uses an ambiguous gene symbol. The database is organised into core and feature tables, which makes it straightforward to extend for future purposes. The database design is ready to be applied to other taxa or biological processes. Currently, the database contains 1170 distinct avian immune genes with canonical gene symbols and 612 synonyms across 363 bird species. While the command-line interface readily integrates into bioinformatics pipelines, the intuitive web front-end with download functionality offers sophisticated search functionalities and tracks the origin of each record. AVIMM is publicly accessible at https://avimm.ab.mpg.de.
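
The synonym-to-canonical mapping and the core/feature table layout described above can be illustrated with a minimal sketch. The abstract does not give the actual AVIMM schema, so all table names, column names and function names here (gene, gene_synonym, gene_feature, resolve_symbol, gene_properties) are assumptions made for illustration, and the gene symbols are placeholders rather than real records. The sketch uses Python's built-in sqlite3 module only to show how an ambiguous symbol might be resolved to a canonical symbol before properties are looked up, with the origin of each record kept alongside it.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical core/feature layout."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        -- Core table: one row per gene, keyed by its canonical symbol.
        CREATE TABLE gene (
            gene_id          INTEGER PRIMARY KEY,
            canonical_symbol TEXT NOT NULL UNIQUE
        );
        -- Synonyms (ambiguous symbols) point back to the core gene row.
        CREATE TABLE gene_synonym (
            synonym TEXT NOT NULL,
            gene_id INTEGER NOT NULL REFERENCES gene(gene_id),
            PRIMARY KEY (synonym, gene_id)
        );
        -- Feature table: per-species properties plus the record's origin.
        CREATE TABLE gene_feature (
            gene_id  INTEGER NOT NULL REFERENCES gene(gene_id),
            species  TEXT NOT NULL,
            source   TEXT NOT NULL,
            property TEXT NOT NULL
        );
        """
    )
    # Placeholder rows, not real AVIMM content.
    conn.execute("INSERT INTO gene VALUES (1, 'GENE_A')")
    conn.execute("INSERT INTO gene VALUES (2, 'GENE_B')")
    conn.execute("INSERT INTO gene_synonym VALUES ('ALIAS_A1', 1)")
    conn.execute("INSERT INTO gene_synonym VALUES ('ALIAS_B1', 2)")
    conn.execute(
        "INSERT INTO gene_feature VALUES "
        "(1, 'Gallus gallus', 'Ensembl', 'placeholder annotation')"
    )
    conn.commit()
    return conn


def resolve_symbol(conn, symbol):
    """Return the canonical symbols matching a canonical symbol or a synonym."""
    rows = conn.execute(
        """
        SELECT canonical_symbol FROM gene
        WHERE canonical_symbol = :s COLLATE NOCASE
        UNION
        SELECT g.canonical_symbol
        FROM gene_synonym AS s JOIN gene AS g ON g.gene_id = s.gene_id
        WHERE s.synonym = :s COLLATE NOCASE
        """,
        {"s": symbol},
    ).fetchall()
    return sorted(row[0] for row in rows)


def gene_properties(conn, symbol):
    """Look up feature rows for a symbol after resolving it to canonical form."""
    results = []
    for canonical in resolve_symbol(conn, symbol):
        rows = conn.execute(
            """
            SELECT g.canonical_symbol, f.species, f.source, f.property
            FROM gene AS g JOIN gene_feature AS f ON f.gene_id = g.gene_id
            WHERE g.canonical_symbol = ?
            """,
            (canonical,),
        ).fetchall()
        results.extend(rows)
    return results


if __name__ == "__main__":
    demo = build_demo_db()
    print(resolve_symbol(demo, "alias_a1"))
    print(gene_properties(demo, "ALIAS_A1"))
```

In this layout a new kind of property would only need another feature table keyed on gene_id, which is one plausible reading of why a core/feature split is straightforward to extend. The real AVIMM schema may differ.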

Keywords: Avian; B10K; Genomics; Immunogenomics; Immunology; Immunome; Trait database.

MeSH terms

  • Amino Acid Sequence
  • Animals
  • Base Sequence
  • Chickens / genetics*
  • Chickens / immunology
  • Databases, Genetic*
  • Genomics
  • Molecular Sequence Annotation
  • Proteins / chemistry
  • Proteins / genetics
  • User-Computer Interface*

Substances

  • Proteins