Nucleic Acids Res. 2013 Jan;41(Database issue):D684-91.
doi: 10.1093/nar/gks1113. Epub 2012 Nov 21.

EuPathDB: The Eukaryotic Pathogen Database

Free PMC article

Cristina Aurrecoechea et al. Nucleic Acids Res.


EuPathDB resources include 11 databases supporting eukaryotic pathogen genomic and functional genomic data, isolate data and phylogenomics. EuPathDB resources are built using the same infrastructure and provide a sophisticated search strategy system enabling complex interrogations of underlying data. Recent advances in EuPathDB resources include the design and implementation of a new data loading workflow, a new database supporting Piroplasmida (i.e. Babesia and Theileria), the addition of large amounts of new data and data types and the incorporation of new analysis tools. New data include genome sequences and annotation, strand-specific RNA-seq data, splice junction predictions (based on RNA-seq), phosphoproteomic data, high-throughput phenotyping data, single nucleotide polymorphism data based on high-throughput sequencing (HTS) and expression quantitative trait loci data. New analysis tools enable users to search for DNA motifs and define genes based on their genomic colocation, view results from searches graphically (i.e. genes mapped to chromosomes or isolates displayed on a map) and analyze data from columns in result tables (word cloud and histogram summaries of column content). The manuscript herein describes updates to EuPathDB since the previous report published in NAR in 2010.


Figure 1.
Screen shots of a search strategy in PiroplasmaDB and GBrowse representing HTS (C–E from ToxoDB and F and G from AmoebaDB). (A) A three-step search strategy combining genes with predicted signal peptides, transmembrane domains and microarray expression data. (B) Search strategies may be saved and shared with others using a uniquely generated URL. (C) Peptides from mass spec experiments are mapped to genes and displayed graphically. Mousing over the graphics provides additional information, such as the peptide sequence and any posttranslational modifications. In this image, peptides are from a phosphoproteomic experiment. (D) A track representing strand-specific RNA-seq data. Blue indicates reads mapping to the forward strand, whereas red represents those mapping to the reverse strand. (E) Unified splice junction track representing intron-spanning RNA-seq reads from all experiments in the database. (F) A 2 kb region with alignment of DNA sequencing reads to the genome. (G) Zooming in to 100 bp displays the actual sequence allowing data inspection. Highlighted nucleotides represent SNPs.
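The multi-step strategy in panel (A) can be thought of as set operations over the gene lists returned by each step. The sketch below is illustrative only: the gene IDs are made up, the function name `run_strategy` and the operator names are hypothetical, and it assumes the three steps are combined by intersection, which the caption does not state explicitly.

```python
# Hypothetical result sets, one per search step (gene IDs are invented).
signal_peptide = {"gene_01", "gene_02", "gene_03", "gene_05"}
transmembrane = {"gene_02", "gene_03", "gene_04"}
expressed = {"gene_03", "gene_05", "gene_06"}


def run_strategy(steps):
    """Combine step results left to right.

    Each step is an (operator, gene_set) pair; the operator of the
    first step is ignored because it only seeds the result.
    """
    result = set(steps[0][1])
    for operator, genes in steps[1:]:
        if operator == "intersect":
            result &= genes
        elif operator == "union":
            result |= genes
        elif operator == "minus":
            result -= genes
        else:
            raise ValueError(f"unknown operator: {operator}")
    return result


strategy = [
    ("start", signal_peptide),
    ("intersect", transmembrane),
    ("intersect", expressed),
]
result = run_strategy(strategy)
print(sorted(result))
```

Swapping an operator for "union" or "minus" changes how a step contributes to the final list without touching the other steps.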
Figure 2.
Screen shot from GiardiaDB depicting a genomic segment search. (A) Genomic segment searches (i.e. DNA motif pattern) are available on the home page. (B) DNA motifs may be entered as a standard string of characters or using a regular expression as depicted. (C) DNA segment records are generated dynamically and results are displayed in a search strategy with results represented in a dynamic table below the strategy.
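A DNA motif search with a regular expression, as in panel (B), can be sketched in a few lines. This is not the EuPathDB implementation; it is a minimal example that assumes an uppercase A/C/G/T sequence, reports non-overlapping matches only, and uses a made-up sequence and motif. The function names are hypothetical.

```python
import re

# Translation table for complementing unambiguous DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase A/C/G/T sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def find_motif(seq, pattern):
    """Find non-overlapping regex matches on both strands.

    Returns (start, end, strand, matched_text) tuples with 1-based,
    inclusive coordinates expressed on the forward strand.
    """
    regex = re.compile(pattern)
    n = len(seq)
    hits = []
    for m in regex.finditer(seq):
        hits.append((m.start() + 1, m.end(), "+", m.group()))
    # A match at [s, e) on the reverse complement covers forward
    # positions n - e + 1 through n - s (1-based, inclusive).
    for m in regex.finditer(reverse_complement(seq)):
        hits.append((n - m.end() + 1, n - m.start(), "-", m.group()))
    return sorted(hits)


sequence = "AAGATAAGCCTTATCTT"  # invented example sequence
hits = find_motif(sequence, "GATA[AG]")
for hit in hits:
    print(hit)
```

Note that a palindromic motif would be reported once per strand at the same coordinates under this scheme.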
Figure 3.
Screen shots depicting the genomic colocation query in EuPathDB resources. In this example from GiardiaDB, genes that have a DNA motif located within 500 nt upstream are identified. (A) To identify genes in relation to DNA motifs, a step searching for genes based on the organism of interest is added to the strategy. The genomic colocation option is selected by default when combining different record types, such as DNA motifs and genes. (B) The customizable colocation popup provides a dynamic logic statement that is updated based on the chosen parameters. (C) Results of colocation query. Top of the panel shows the search strategy and the bottom portion includes the results with columns for gene IDs, number of matched motifs in the defined region and match genomic coordinates.
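The colocation logic in this example (genes with a motif within 500 nt upstream) can be approximated as a strand-aware interval overlap test. The sketch below is an assumption about how such a query might work, not the actual EuPathDB code: the `Feature` class, function names, contig names and coordinates are all hypothetical, coordinates are taken to be 1-based and inclusive, and a motif counts if it overlaps the upstream window at all.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Feature:
    """A genomic interval with 1-based, inclusive coordinates."""
    contig: str
    start: int
    end: int
    strand: str = "+"
    name: str = ""


def upstream_region(gene, distance):
    """Return (lo, hi) of the window upstream of the gene's 5' end."""
    if gene.strand == "+":
        return max(1, gene.start - distance), gene.start - 1
    # On the reverse strand, upstream lies past the gene's end coordinate.
    return gene.end + 1, gene.end + distance


def genes_with_upstream_motif(genes, motifs, distance=500):
    """Map gene name to the motifs overlapping its upstream window."""
    results = {}
    for gene in genes:
        lo, hi = upstream_region(gene, distance)
        matched = [
            m for m in motifs
            if m.contig == gene.contig and m.start <= hi and m.end >= lo
        ]
        if matched:
            results[gene.name] = matched
    return results


genes = [
    Feature("chr1", 1000, 2000, "+", "g1"),
    Feature("chr1", 3000, 4000, "-", "g2"),
    Feature("chr2", 1000, 2000, "+", "g3"),
]
motifs = [
    Feature("chr1", 700, 705),    # upstream of g1
    Feature("chr1", 4200, 4205),  # upstream of g2 (reverse strand)
    Feature("chr1", 2800, 2805),  # downstream of g2, not counted
    Feature("chr2", 100, 105),    # more than 500 nt from g3
]
colocated = genes_with_upstream_motif(genes, motifs)
for name, matched in colocated.items():
    print(name, len(matched), [(m.start, m.end) for m in matched])
```

The printed columns mirror the result table described in panel (C): gene ID, number of matched motifs and the match coordinates.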
Figure 4.
Screen shots from PlasmoDB showing in (A) a typical result list from a search strategy, (B) an alternative graphical representation of genes on chromosomes, (C) a word cloud generated by clicking on the column analysis icon for the product description column and (D) a histogram generated by clicking on the column analysis icon for the ortholog count column.
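The column analyses in panels (C) and (D) amount to a word-frequency count over a text column and a binned count over a numeric column. The following sketch illustrates the idea with invented rows; the column names, the tokenization rule and the bin size are assumptions, not details taken from PlasmoDB.

```python
import re
from collections import Counter

# Invented result rows standing in for a search result table.
rows = [
    {"gene_id": "gene_01", "product": "hypothetical protein", "ortholog_count": 3},
    {"gene_id": "gene_02", "product": "protein kinase", "ortholog_count": 12},
    {"gene_id": "gene_03", "product": "hypothetical protein, conserved", "ortholog_count": 7},
    {"gene_id": "gene_04", "product": "heat shock protein 70", "ortholog_count": 25},
]


def word_frequencies(values):
    """Count lowercase alphanumeric tokens across text values (word cloud input)."""
    counts = Counter()
    for value in values:
        counts.update(re.findall(r"[a-z0-9]+", value.lower()))
    return counts


def histogram(values, bin_size):
    """Count integer values per bin, keyed by each bin's lower bound."""
    bins = Counter((value // bin_size) * bin_size for value in values)
    return dict(sorted(bins.items()))


words = word_frequencies(row["product"] for row in rows)
bins = histogram([row["ortholog_count"] for row in rows], bin_size=10)
print(words.most_common(3))
print(bins)
```

A word cloud would scale each word by its count, and the histogram dictionary maps directly to bar heights.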

Similar articles

  • EuPathDB: a portal to eukaryotic pathogen databases.
    Aurrecoechea C, Brestelli J, Brunk BP, Fischer S, Gajria B, Gao X, Gingle A, Grant G, Harb OS, Heiges M, Innamorato F, Iodice J, Kissinger JC, Kraemer ET, Li W, Miller JA, Nayak V, Pennington C, Pinney DF, Roos DS, Ross C, Srinivasamoorthy G, Stoeckert CJ Jr, Thibodeau R, Treatman C, Wang H. Aurrecoechea C, et al. Nucleic Acids Res. 2010 Jan;38(Database issue):D415-9. doi: 10.1093/nar/gkp941. Epub 2009 Nov 13. Nucleic Acids Res. 2010. PMID: 19914931 Free PMC article.
  • EuPathDB: the eukaryotic pathogen genomics database resource.
    Aurrecoechea C, Barreto A, Basenko EY, Brestelli J, Brunk BP, Cade S, Crouch K, Doherty R, Falke D, Fischer S, Gajria B, Harb OS, Heiges M, Hertz-Fowler C, Hu S, Iodice J, Kissinger JC, Lawrence C, Li W, Pinney DF, Pulman JA, Roos DS, Shanmugasundram A, Silva-Franco F, Steinbiss S, Stoeckert CJ Jr, Spruill D, Wang H, Warrenfeltz S, Zheng J. Aurrecoechea C, et al. Nucleic Acids Res. 2017 Jan 4;45(D1):D581-D591. doi: 10.1093/nar/gkw1105. Epub 2016 Nov 29. Nucleic Acids Res. 2017. PMID: 27903906 Free PMC article.
  • EuPathDB: The Eukaryotic Pathogen Genomics Database Resource.
    Warrenfeltz S, Basenko EY, Crouch K, Harb OS, Kissinger JC, Roos DS, Shanmugasundram A, Silva-Franco F. Warrenfeltz S, et al. Methods Mol Biol. 2018;1757:69-113. doi: 10.1007/978-1-4939-7737-6_5. Methods Mol Biol. 2018. PMID: 29761457 Free PMC article.
  • Genome information resources - developments at Ensembl.
    Hammond MP, Birney E. Hammond MP, et al. Trends Genet. 2004 Jun;20(6):268-72. doi: 10.1016/j.tig.2004.04.002. Trends Genet. 2004. PMID: 15145580 Review.
  • Deciphering ENCODE.
    Diehl AG, Boyle AP. Diehl AG, et al. Trends Genet. 2016 Apr;32(4):238-249. doi: 10.1016/j.tig.2016.02.002. Epub 2016 Mar 5. Trends Genet. 2016. PMID: 26962025 Review.

References

    1. Aurrecoechea C, Brestelli J, Brunk BP, Fischer S, Gajria B, Gao X, Gingle A, Grant G, Harb OS, Heiges M, et al. EuPathDB: a portal to eukaryotic pathogen databases. Nucleic Acids Res. 2010;38:D415–D419.
    2. Squires RB, Noronha J, Hunt V, García Sastre A, Macken C, Baumgarth N, Suarez D, Pickett BE, Zhang Y, Larsen CN, et al. Influenza Research Database: an integrated bioinformatics resource for influenza research and surveillance. Influenza Other Respi Viruses. 2012;6:404–416.
    3. Pickett BE, Sadat EL, Zhang Y, Noronha JM, Squires RB, Hunt V, Liu M, Kumar S, Zaremba S, Gu Z, et al. ViPR: an open bioinformatics database and analysis resource for virology research. Nucleic Acids Res. 2011;40:D593–D598.
    4. Megy K, Emrich SJ, Lawson D, Campbell D, Dialynas E, Hughes DST, Koscielny G, Louis C, Maccallum RM, Redmond SN, et al. VectorBase: improvements to a bioinformatics resource for invertebrate vector genomics. Nucleic Acids Res. 2012;40:D729–D734.
    5. Gillespie JJ, Wattam AR, Cammer SA, Gabbard JL, Shukla MP, Dalay O, Driscoll T, Hix D, Mane SP, Mao C, et al. PATRIC: the comprehensive bacterial bioinformatics resource with a focus on human pathogenic species. Infect. Immun. 2011;79:4286–4298.
