, 113 (9), 1043-53

Integration of Cardiac Proteome Biology and Medicine by a Specialized Knowledgebase

Nobel C Zong et al. Circ Res.

Abstract

Rationale: Omics sciences enable a systems-level perspective in characterizing cardiovascular biology. Integration of diverse proteomics data via a computational strategy will catalyze the assembly of contextualized knowledge, foster discoveries through multidisciplinary investigations, and minimize unnecessary redundancy in research efforts.

Objective: The goal of this project is to develop a consolidated cardiac proteome knowledgebase with a novel bioinformatics pipeline and Web portals, thereby serving as a new resource to advance cardiovascular biology and medicine.

Methods and results: We created the Cardiac Organellar Protein Atlas Knowledgebase (COPaKB; www.HeartProteome.org), a centralized platform of high-quality cardiac proteomic data, bioinformatics tools, and relevant cardiovascular phenotypes. Currently, COPaKB features 8 organellar modules, comprising 4203 LC-MS/MS experiments from human, mouse, Drosophila, and Caenorhabditis elegans, as well as expression images of 10,924 proteins in human myocardium. In addition, the Java-coded bioinformatics tools provided by COPaKB enable cardiovascular investigators in all disciplines to retrieve and analyze pertinent organellar protein properties of interest.

Conclusions: COPaKB provides an innovative and interactive resource that connects research interests with the new biological discoveries in protein sciences. With an array of intuitive tools in this unified Web server, nonproteomics investigators can conveniently collaborate with proteomics specialists to dissect the molecular signatures of cardiovascular phenotypes.

Keywords: COPaKB; Omics science; computational biology; mitochondria; organelles; proteomics; translational medical research.

Figures

Figure 1. The Schema of the Cardiac Organellar Protein Atlas Knowledgebase (COPaKB)
A. Differential datasets of the cardiac proteome served as the basis for implementing the COPa Knowledgebase (KB), including protein mass spectra, protein expression images (e.g. an immunohistochemical image of PSMB2 expression and an immunofluorescence image of PSMA3 expression), and cardiac phenotypes (e.g. heart failure [HF] and ischemia-reperfusion injury [IR]). Using a protein identification list as a common index, these datasets were integrated into COPaKB in a relational database. In a modular structure, proteins were organized according to their organellar origins. Communication between COPaKB and cardiovascular investigators was mediated by a dedicated internet portal and interface featuring an open web service infrastructure and a search engine. This infrastructure enables the delivery of integrated knowledge on the cardiac proteome in response to an input raw spectral dataset. Collectively, COPaKB acts as a platform for synergistic interaction among cardiovascular investigators.
B. The relational database uses relation tables (e.g. PTM table, PP relations table) to connect key property tables (e.g. spectrum table, protein table) of the cardiac proteome. Derivative property tables (e.g. disease relevance table) are connected to key property tables to archive diverse attributes of the cardiac proteome. Primary keys (PK) and foreign keys (FK) were used to establish relationships among these tables.
(HPA stands for Human Protein Atlas; UniProt stands for Universal Protein Resource; PRIDE stands for PRoteomics IDEntifications database.)
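The table layout described in panel B can be illustrated with a minimal sketch using Python's built-in sqlite3 module. The table names, column names, and the sample row below are hypothetical placeholders chosen for illustration; the caption does not give the actual COPaKB schema, and COPaKB itself is not stated to use SQLite.

```python
import sqlite3


def build_db():
    """Create an in-memory database with hypothetical COPaKB-like tables."""
    conn = sqlite3.connect(":memory:")
    # SQLite enforces foreign keys only when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript("""
        -- Key property table: one row per protein (PK = protein_id).
        CREATE TABLE protein (
            protein_id TEXT PRIMARY KEY,
            organelle  TEXT NOT NULL
        );
        -- Key property table: each spectrum points at its protein via an FK.
        CREATE TABLE spectrum (
            spectrum_id INTEGER PRIMARY KEY,
            protein_id  TEXT NOT NULL REFERENCES protein(protein_id),
            peptide     TEXT NOT NULL
        );
        -- Derivative property table: attributes attached to a protein.
        CREATE TABLE disease_relevance (
            protein_id TEXT NOT NULL REFERENCES protein(protein_id),
            phenotype  TEXT NOT NULL
        );
    """)
    # Placeholder rows, not real COPaKB data.
    conn.execute("INSERT INTO protein VALUES (?, ?)", ("PROT_X", "mitochondria"))
    conn.execute(
        "INSERT INTO spectrum (protein_id, peptide) VALUES (?, ?)",
        ("PROT_X", "PEPTIDEK"),
    )
    conn.execute(
        "INSERT INTO disease_relevance VALUES (?, ?)",
        ("PROT_X", "placeholder phenotype"),
    )
    conn.commit()
    return conn


def properties_for_spectra(conn):
    """Follow the foreign keys from each spectrum to its protein attributes."""
    return conn.execute("""
        SELECT s.peptide, p.organelle, d.phenotype
        FROM spectrum AS s
        JOIN protein AS p ON p.protein_id = s.protein_id
        LEFT JOIN disease_relevance AS d ON d.protein_id = p.protein_id
    """).fetchall()


conn = build_db()
rows = properties_for_spectra(conn)
```

Using the protein identifier as the shared key means a matched spectrum can be joined to any attribute table without duplicating protein records, which is the role the caption assigns to the protein identification list as a common index.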
Figure 2. Coverage of the Four Organellar Modules in the COPa Knowledgebase
A. Six biological replicates of human mitochondria preparations were analyzed to construct this spectral library module. A total of 1,398 proteins were identified and compiled; the cumulative proteome coverage reached the limits of the LC-MS/MS platform.
B. Five biological replicates were analyzed for the human proteasome module to reach a plateau in protein identification with 283 proteins.
C. Ten biological replicates were analyzed for the mouse mitochondria module to reach a plateau of 1,619 proteins.
D. Five biological replicates were analyzed for the mouse proteasome module to reach a plateau of 151 proteins.
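The plateau described in these panels is a cumulative count of distinct proteins as replicates are added. A minimal sketch of that bookkeeping follows; the function names and the toy protein labels are illustrative assumptions, not the authors' pipeline or data.

```python
def cumulative_coverage(replicates):
    """Return the running count of distinct proteins after each replicate."""
    seen = set()
    counts = []
    for proteins in replicates:
        seen.update(proteins)
        counts.append(len(seen))
    return counts


def new_per_replicate(counts):
    """Return how many previously unseen proteins each replicate contributed."""
    return [after - before for before, after in zip([0] + counts, counts)]


# Toy replicates with placeholder protein labels.
toy_replicates = [{"A", "B", "C"}, {"B", "C", "D"}, {"A", "D"}, {"C", "D"}]
toy_counts = cumulative_coverage(toy_replicates)
toy_new = new_per_replicate(toy_counts)
```

When the per-replicate gain falls to zero or near zero, further replicates no longer extend coverage, which is one simple way to read "reached a plateau".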
Figure 3. Large-scale Spectral Analysis over the Web
COPaKB Client orchestrates segmentation (1) and submission (2) of mass spectral data through the Internet to the COPaKB server. Data packets (3) received by the COPaKB server are analyzed using the spectral library as a reference (4). For matched spectra (5), the properties of their corresponding proteins are retrieved from COPaKB automatically (6) and returned to COPaKB Client (7). At the end of the analysis, COPaKB Client presents a consolidated report (8) outlining the proteome properties encoded in the raw spectral files.
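The eight steps above can be sketched in a few lines of Python. This is only an illustration of the segment, match, annotate, and report flow: the caption does not say how spectra are scored, so the cosine similarity on binned peaks, the 0.7 threshold, the packet size, and all names and data below are assumptions, and the network transfer between client and server is replaced by plain function calls.

```python
import math


def segment(spectra, packet_size):
    """Steps 1-3: split the list of spectra into fixed-size packets."""
    return [spectra[i:i + packet_size] for i in range(0, len(spectra), packet_size)]


def cosine(a, b):
    """Cosine similarity of two spectra given as {m/z bin: intensity} dicts."""
    dot = sum(v * b.get(k, 0.0) for k, v in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def match_packet(packet, library, threshold=0.7):
    """Steps 4-5: keep the best library hit per spectrum if it clears the threshold."""
    hits = []
    for spectrum_id, peaks in packet:
        best_name, best_score = None, 0.0
        for name, reference in library.items():
            score = cosine(peaks, reference)
            if score > best_score:
                best_name, best_score = name, score
        if best_name is not None and best_score >= threshold:
            hits.append((spectrum_id, best_name))
    return hits


def analyze(spectra, library, annotations, packet_size=2):
    """Steps 6-8: attach stored properties to each hit and merge into one report."""
    report = {}
    for packet in segment(spectra, packet_size):
        for spectrum_id, protein in match_packet(packet, library):
            report[spectrum_id] = (protein, annotations.get(protein))
    return report


# Placeholder library, annotations, and query spectra.
library = {"protA": {100: 1.0, 200: 0.5}, "protB": {150: 1.0, 300: 1.0}}
annotations = {"protA": "mitochondria", "protB": "proteasome"}
spectra = [
    ("s1", {100: 2.0, 200: 1.0}),
    ("s2", {150: 1.0, 300: 0.9}),
    ("s3", {400: 1.0}),
]
report = analyze(spectra, library, annotations)
```

Splitting the input into packets keeps each request small, and because every packet is matched independently the per-packet results can be merged in any order into the final report.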
Figure 4. Sensitive Protein Identification via the COPaKB-mediated Web service
The test dataset of the murine mitochondrial proteome containing 111 raw data files was downloaded from the Peptide Atlas Repository (PAe000353).
A. Compared to a database search workflow (SEQUEST), the COPaKB web service covered 144 shared proteins (78.7%) and an additional 117 identifications (63.9%). The 117 proteins uniquely identified by the COPaKB workflow were categorized according to their Gene Ontology annotations. Among them, 92 proteins had a subcellular location annotation of the mitochondrion with functional implications in metabolism (74), apoptosis (4), transport (3) and unknown (11). Among the 39 proteins identified uniquely with the sequence database search engine, only 7 proteins had a subcellular location annotation of the mitochondrion.
B. The mass spectrum corresponding to peptide LFQADNDLPVHLK was assigned to cytochrome c oxidase 7a1 of the mitochondrial electron transport chain complex IV, a protein identified by the COPaKB web service.
C. The expression profile of cytochrome b-c1 complex subunit 1 (Q9CZ13) in human myocardium was probed by its specific antibody (ID: HPA002815). This image was automatically retrieved by the COPaKB web service from the Human Protein Atlas after its identification.
D. An immunofluorescence image of this protein with the same antibody provided organellar resolution of protein expression. With organellar markers as a reference, Q9CZ13 was found to be expressed in mitochondria.
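The percentages in panel A are consistent with using the SEQUEST total as the denominator, that is, 144 shared plus 39 SEQUEST-only proteins. That denominator is inferred from the reported numbers rather than stated in the caption; the short check below reproduces the arithmetic under that assumption.

```python
# Counts taken from the caption.
shared = 144        # proteins found by both workflows
copakb_only = 117   # proteins found only by the COPaKB web service
sequest_only = 39   # proteins found only by the SEQUEST database search

# Assumption: percentages are relative to all SEQUEST identifications.
sequest_total = shared + sequest_only
copakb_total = shared + copakb_only

pct_shared = round(100 * shared / sequest_total, 1)
pct_additional = round(100 * copakb_only / sequest_total, 1)

# Functional categories of the mitochondrion-annotated COPaKB-only proteins.
mito_categories = {"metabolism": 74, "apoptosis": 4, "transport": 3, "unknown": 11}
mito_total = sum(mito_categories.values())

print(sequest_total, copakb_total, pct_shared, pct_additional, mito_total)
```

Under this reading, SEQUEST identified 183 proteins and the COPaKB workflow 261, and the four functional categories sum to the 92 mitochondrion-annotated proteins reported.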
Figure 5. Integration of Discoveries from Multiple Analyses
A. Via COPaKB, discoveries from multiple proteomic investigations by separate research groups can be aligned by specifying the task ID of each analysis. Alternatively, the expression of selected biomarkers in these studies can be probed.
B. There were 452 shared protein identities among tasks 13855, 13878 and 14003. Meanwhile, 627 proteins were detected only in task 13855, 4 proteins only in task 14003 and 17 proteins only in task 13878.
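The comparison in panel B amounts to set operations over the protein lists of each task. The sketch below shows one way to compute the shared and task-specific identities; the function name is hypothetical, the protein labels are placeholders, and only the task IDs are borrowed from the caption as dictionary keys.

```python
def compare_tasks(tasks):
    """Return (proteins shared by all tasks, proteins unique to each task).

    `tasks` maps a task ID to the set of protein identifiers it reported
    and must contain at least one task.
    """
    ids = list(tasks)
    shared = set.intersection(*(tasks[t] for t in ids))
    unique = {}
    for t in ids:
        others = set().union(*(tasks[o] for o in ids if o != t))
        unique[t] = tasks[t] - others
    return shared, unique


# Placeholder protein labels; not the real results of these tasks.
toy_tasks = {
    "13855": {"P1", "P2", "P3"},
    "13878": {"P1", "P4"},
    "14003": {"P1", "P2"},
}
toy_shared, toy_unique = compare_tasks(toy_tasks)
```

Proteins found in exactly two of the three tasks (here "P2") fall into neither result, which matches the caption reporting only the all-shared and single-task counts.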
