BMC Med Inform Decis Mak. 2006 Jan 6;6:2.
doi: 10.1186/1472-6947-6-2.

The caCORE Software Development Kit: Streamlining Construction of Interoperable Biomedical Information Services

Free PMC article

Joshua Phillips et al. BMC Med Inform Decis Mak. 2006.


Background: Robust, programmatically accessible biomedical information services that syntactically and semantically interoperate with other resources are challenging to construct. Such systems require the adoption of common information models, data representations and terminology standards as well as documented application programming interfaces (APIs). The National Cancer Institute (NCI) developed the cancer common ontologic representation environment (caCORE) to provide the infrastructure necessary to achieve interoperability across the systems it develops or sponsors. The caCORE Software Development Kit (SDK) was designed to provide developers both within and outside the NCI with the tools needed to construct such interoperable software systems.

Results: The caCORE SDK workflow begins in a Unified Modeling Language (UML) tool with the construction of a domain information model in the form of a UML Class Diagram. Models are annotated with concepts and definitions from a description logic terminology source using the Semantic Connector component. The annotated model is registered in the Cancer Data Standards Repository (caDSR) using the UML Loader component. System software is automatically generated using the Codegen component, which produces middleware that runs on an application server. The caCORE SDK was initially tested and validated using a seven-class UML model, and has been used to generate the caCORE production system, which includes models with dozens of classes. The deployed system supports access through object-oriented APIs with consistent syntax for retrieval of any type of data object across all classes in the original UML model. The caCORE SDK is currently being used by several development teams, including participants in the cancer biomedical informatics grid (caBIG) program, to create compatible data services. caBIG compatibility standards are based upon caCORE resources, and thus the caCORE SDK has emerged as a key enabling technology for caBIG.
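
To make "consistent syntax for retrieval of any type of data object" concrete, the following is a minimal sketch of a query-by-example call that has the same shape for every domain class. It is an illustration only, not the SDK's generated API (which is Java): the class `InMemoryService`, its `search` method, the two domain classes and their attribute values are all hypothetical.

```python
from dataclasses import dataclass, fields
from typing import List, Optional, Type, TypeVar

T = TypeVar("T")


# Hypothetical domain classes standing in for classes from a UML model.
@dataclass
class Gene:
    symbol: Optional[str] = None
    taxon: Optional[str] = None


@dataclass
class Chromosome:
    number: Optional[str] = None
    taxon: Optional[str] = None


class InMemoryService:
    """Hypothetical stand-in for a generated data service (assumption)."""

    def __init__(self, objects):
        self._objects = list(objects)

    def search(self, target: Type[T], example: object) -> List[T]:
        """Return stored objects of type `target` that match every
        attribute of `example` that is not None."""
        criteria = {
            f.name: getattr(example, f.name)
            for f in fields(example)
            if getattr(example, f.name) is not None
        }
        return [
            obj
            for obj in self._objects
            if isinstance(obj, target)
            and all(getattr(obj, name, None) == value
                    for name, value in criteria.items())
        ]


# Placeholder data, not taken from any real database.
service = InMemoryService([
    Gene(symbol="GENE_A", taxon="human"),
    Gene(symbol="GENE_B", taxon="mouse"),
    Chromosome(number="17", taxon="human"),
])

# The call has the same form regardless of the class being retrieved.
human_genes = service.search(Gene, Gene(taxon="human"))
human_chromosomes = service.search(Chromosome, Chromosome(taxon="human"))
```

The point of the sketch is that a client learns one call pattern and reuses it for every class in the model, rather than learning a separate accessor per class.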

Conclusion: The caCORE SDK substantially lowers the barrier to implementing systems that are syntactically and semantically interoperable by providing workflow and automation tools that standardize and expedite modeling, development, and deployment. It has gained acceptance among developers in the caBIG program, and is expected to provide a common mechanism for creating data service nodes on the data grid that is under development.


Figure 1
Example UML domain model, shown as a class diagram. The model shown here, a subset of the larger caBIO model used in the production caCORE system, includes several classes related to the class Gene. This model was used for initial testing and validation of kit components. After the kit was validated, the caCORE SDK was used with all the full-sized caCORE domain models (not shown).
Figure 2
caCORE SDK Workflow. A UML object model and a description-logic terminology source (NCI Thesaurus in the present work) are the inputs into the workflow. The model is exported from the format native to the tool it was developed in to the standard XMI representation. The XMI file is then annotated with terminology concepts using Semantic Connector. The annotated model is used as input into Codegen, which generates the software for a caBIG-compatible data system with object-oriented APIs. The annotated model is loaded as metadata into the caDSR using UML Loader. Model metadata is reviewed and completed by a curator using caDSR utilities, and then becomes available from the caDSR APIs and web applications.
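
The data flow in this workflow can be sketched as a small pipeline. This is a toy model under stated assumptions, not the behaviour of the actual tools: every function name, the dictionary representation of the model, the `CONCEPT-…` codes and the status strings are hypothetical placeholders.

```python
from typing import Dict, List


def export_to_xmi(class_names: List[str]) -> Dict[str, dict]:
    """Stand-in for the XMI export: one empty record per UML class."""
    return {name: {} for name in class_names}


def annotate(model: Dict[str, dict], terminology: Dict[str, str]) -> Dict[str, dict]:
    """Stand-in for Semantic Connector: attach a concept code to each
    class, or None when the terminology has no match."""
    return {
        name: {**record, "concept": terminology.get(name.lower())}
        for name, record in model.items()
    }


def generate_code(annotated: Dict[str, dict]) -> List[str]:
    """Stand-in for Codegen: one artifact name per annotated class."""
    return sorted(f"{name}.generated" for name in annotated)


def load_metadata(annotated: Dict[str, dict], registry: Dict[str, dict]) -> None:
    """Stand-in for UML Loader: register each class pending review."""
    for name, record in annotated.items():
        registry[name] = {"concept": record["concept"], "status": "pending review"}


def curate(registry: Dict[str, dict], fixes: Dict[str, str]) -> None:
    """Stand-in for curation: fill in missing concepts, then mark
    every entry that has a concept as available."""
    for name, entry in registry.items():
        if entry["concept"] is None:
            entry["concept"] = fixes.get(name)
        if entry["concept"] is not None:
            entry["status"] = "available"


# Placeholder terminology; the codes are invented for illustration.
terminology = {"gene": "CONCEPT-0001", "protein": "CONCEPT-0002"}

model = export_to_xmi(["Gene", "Protein", "Pathway"])
annotated = annotate(model, terminology)

# The annotated model feeds two independent consumers.
artifacts = generate_code(annotated)
registry: Dict[str, dict] = {}
load_metadata(annotated, registry)
curate(registry, {"Pathway": "CONCEPT-0003"})
```

Note that the annotated model is the single shared input to both code generation and metadata registration, which is what keeps the generated system and its registered metadata describing the same model.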
Figure 3
Architecture of caCORE SDK-generated system. Systems developed using caCORE SDK are deployed as a Web Application Archive (WAR) to a J2EE application server such as JBoss or a web application server such as Tomcat. Contents of the WAR file can be logically grouped into three categories. Libraries and Utilities are typically packaged as JAR files, and include AspectJ for auditing, Log4J for logging, and Hibernate for object-relational mapping. Framework, Domain Objects and Deployment Descriptors include caCORE SDK classes needed at runtime, domain objects with corresponding hibernate mapping files generated by caCORE SDK, and property files containing configuration parameters. API Services function as a façade or an entry point to the system. Client requests are processed by the interface proxy in the application server and mapped to the appropriate data source by the delegation service. In version 1.0.2 of caCORE SDK, described in this article, only Java APIs are generated; in subsequent versions Web Services APIs can also be generated, with Apache Axis providing SOAP message support.
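
The request path described here (an entry point, an interface proxy, and a delegation service that selects a data source) can be sketched as follows. This is a simplified illustration, not the generated Java middleware: the class names mirror the caption's terms, but their methods, the dictionary-shaped request and the list-backed data source are assumptions.

```python
from typing import Callable, Dict, List

Row = Dict[str, str]
Source = Callable[[Row], List[Row]]


def make_list_source(rows: List[Row]) -> Source:
    """Build a hypothetical data source backed by an in-memory list."""
    def source(criteria: Row) -> List[Row]:
        return [
            row for row in rows
            if all(row.get(key) == value for key, value in criteria.items())
        ]
    return source


class DelegationService:
    """Maps a requested domain class name to its data source."""

    def __init__(self) -> None:
        self._sources: Dict[str, Source] = {}

    def register(self, class_name: str, source: Source) -> None:
        self._sources[class_name] = source

    def dispatch(self, class_name: str, criteria: Row) -> List[Row]:
        try:
            source = self._sources[class_name]
        except KeyError:
            raise LookupError(f"no data source registered for {class_name}") from None
        return source(criteria)


class InterfaceProxy:
    """Server-side entry point that forwards client requests."""

    def __init__(self, delegate: DelegationService) -> None:
        self._delegate = delegate

    def handle(self, request: dict) -> List[Row]:
        return self._delegate.dispatch(request["target"], request.get("criteria", {}))


# Placeholder rows, invented for illustration.
delegate = DelegationService()
delegate.register("Gene", make_list_source([
    {"symbol": "GENE_A", "taxon": "human"},
    {"symbol": "GENE_B", "taxon": "mouse"},
]))
proxy = InterfaceProxy(delegate)

result = proxy.handle({"target": "Gene", "criteria": {"taxon": "mouse"}})
```

Keeping the routing table in the delegation layer means the client-facing entry point does not change when a class is backed by a different data source.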
