Small-molecule discovery typically involves large-scale screening campaigns that span multiple compound collections. However, such campaigns can be cost- or time-prohibitive, especially when they rely on complex assay systems, which limits the number of compounds that can be tested. Further, low hit rates can make the process inefficient. Sparse coverage of chemical structure or biological activity space can lead to limited success in a primary screen and represents a missed opportunity caused by selecting the "wrong" compounds to test. Thus, the choice of screening collection is of paramount importance. In this perspective, we discuss the utility of generating "informer sets" for small-molecule discovery, and how this strategy can be leveraged to prioritize probe candidates. While many researchers may assume that informer sets are focused on particular targets (e.g., kinases) or processes (e.g., autophagy), efforts to assemble informer sets based on historical bioactivity or successful human exposure (e.g., repurposing collections) have shown promise as well. Informer sets can also be assembled on the basis of chemical structure, which is particularly useful when the compounds have unknown activities and targets. We describe our efforts to screen an informer set representing a collection of 100,000 small molecules synthesized through diversity-oriented synthesis (DOS). This process enables researchers to identify activity early and then screen only a few chemical scaffolds more extensively, rather than the entire collection. This elegant and economical outcome is a goal of the informer set approach. Here, we aim not only to shed light on this process, but also to promote the wider use of informer sets in small-molecule discovery projects.
Keywords: chemoinformatics; compound repositories; general pharmaceutical process; high-content screening.