BioLLM: A standardized framework for integrating and benchmarking single-cell foundation models

Patterns (N Y). 2025 Jul 30;6(8):101326. doi: 10.1016/j.patter.2025.101326. eCollection 2025 Aug 8.

Abstract

The application and evaluation of single-cell foundation models (scFMs) present significant challenges due to heterogeneous architectures and coding standards. To address this, we introduce BioLLM (biological large language model), a unified framework for integrating and applying scFMs to single-cell RNA sequencing analysis. BioLLM provides a unified interface that integrates diverse scFMs, eliminating architectural and coding inconsistencies to enable streamlined model access. With standardized APIs and comprehensive documentation, BioLLM supports streamlined model switching and consistent benchmarking. Our comprehensive evaluation of scFMs revealed distinct strengths and limitations, highlighting scGPT's robust performance across all tasks, including zero shot and fine-tuning. Geneformer and scFoundation demonstrated strong capabilities in gene-level tasks, benefiting from effective pretraining strategies. In contrast, scBERT lagged behind, likely due to its smaller model size and limited training data. Ultimately, BioLLM aims to empower the scientific community to leverage the full potential of foundational models, advancing our understanding of complex biological systems through enhanced single-cell analysis.

Keywords: model benchmarking; single-cell foundation models; unified framework; zero shot and fine-tuning.