By processing and abstracting diverse omics datasets into associations between genes and their attributes, the Harmonizome database enables researchers to explore and integrate knowledge about human genes from many central omics resources. Here, we introduce Harmonizome 3.0, a significant upgrade to the original Harmonizome database. The upgrade adds 26 datasets that contribute nearly 12 million associations between genes and various attribute types such as cells and tissues, diseases, and pathways. It also introduces a dataset crossing feature that identifies gene modules shared across datasets. For dataset pairs whose gene sets overlap significantly, a large language model (LLM) composes a paragraph that speculates about the reasons for the overlap. The upgrade also adds more data formats and visualization options: datasets can be downloaded as knowledge graph (KG) assertions and visualized as Uniform Manifold Approximation and Projection (UMAP) plots. The KG assertions can be explored through a user interface that displays gene-attribute associations as ball-and-stick diagrams. Overall, Harmonizome 3.0 is a rich resource of processed omics datasets provided in several AI-ready formats. Harmonizome 3.0 is available at https://maayanlab.cloud/Harmonizome/.
© The Author(s) 2024. Published by Oxford University Press on behalf of Nucleic Acids Research.
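
To make the idea of crossing two datasets concrete, the following minimal sketch scores the overlap between every pair of gene sets drawn from two datasets. It is an illustration only: the abstract does not state which statistical test Harmonizome 3.0 uses, so the right-tailed hypergeometric test, the function names `overlap_p_value` and `cross_datasets`, the toy gene sets, and the background size of 20,000 genes are all assumptions made here.

```python
from itertools import product
from math import comb


def overlap_p_value(set_a, set_b, background_size):
    """Return (overlap size, right-tailed hypergeometric p-value).

    Assumption: both gene sets are drawn from a shared background of
    `background_size` genes. This is an illustrative test, not
    necessarily the one used by Harmonizome.
    """
    a, b = set(set_a), set(set_b)
    if len(a | b) > background_size:
        raise ValueError("background_size is smaller than the union of the sets")
    k, n_a, n_b = len(a & b), len(a), len(b)
    total = comb(background_size, n_b)
    # Probability of observing an overlap of k or more genes by chance.
    tail = sum(
        comb(n_a, i) * comb(background_size - n_a, n_b - i)
        for i in range(k, min(n_a, n_b) + 1)
    )
    return k, tail / total


def cross_datasets(dataset_a, dataset_b, background_size):
    """Score every pair of gene sets across two datasets, smallest p-value first."""
    results = []
    for (attr_a, genes_a), (attr_b, genes_b) in product(
        dataset_a.items(), dataset_b.items()
    ):
        k, p = overlap_p_value(genes_a, genes_b, background_size)
        results.append((attr_a, attr_b, k, p))
    return sorted(results, key=lambda row: row[3])


# Toy data for illustration only; attribute names and memberships are invented.
tissue_sets = {
    "tissue_X": {"G1", "G2", "G3", "G4"},
    "tissue_Y": {"G5", "G6"},
}
disease_sets = {
    "disease_P": {"G1", "G2", "G3", "G9"},
    "disease_Q": {"G7", "G8"},
}

for attr_a, attr_b, k, p in cross_datasets(tissue_sets, disease_sets, 20000):
    print(f"{attr_a} x {attr_b}: overlap={k}, p={p:.3g}")
```

In practice, p-values from many pairwise comparisons would also need a multiple-testing correction before any pair is called significant.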