Background: SharePoint® collaboration software has become a critical part of content management in today's drug discovery process. SharePoint 2010 provides a foundation that enables researchers to collaborate on and search a variety of content. The data generated as a single compound moves from preclinical discovery to commercialization can easily reach terabytes, so there is growing demand for a chemically aware search algorithm that supplements SharePoint and lets researchers query for information more intuitively and effectively. Supplementing SharePoint with Chemically Aware™ features therefore provides great value to pharmaceutical and biotech companies and makes drug discovery more efficient. Using several tools, we have integrated SharePoint with chemical, compound, and reaction databases, thereby improving on traditional search engine capability and enhancing the user experience.
Results: This paper describes the implementation of a Chemically Aware™ system that supplements SharePoint. Chemically Aware SharePoint (CASP) allows users to tag documents by drawing a structure and associating it with the related content. It also allows users to search SharePoint content and internal or external databases by substructure, similarity, SMILES, and IUPAC name. Building on traditional search, CASP takes SharePoint one step further by giving researchers an intuitive GUI that lets them base a search on their knowledge of chemistry rather than on text alone. CASP also provides a way to integrate with other systems; for example, a researcher can perform a substructure search on PDF documents with embedded molecular entities.
Conclusion: A Chemically Aware™ system supplementing SharePoint is a step towards making the drug discovery process more efficient, and it helps researchers search for information in a more intuitive way. It also helps researchers find information that was once difficult to locate, by allowing them to tag documents with molecular entities and by integrating with image recognition software to extract information from PDF documents.