Assessing the proficiency of large language models on funduscopic disease knowledge.
Wu JY, Zeng YM, Qian XZ, Hong Q, Hu JY, Wei H, Zou J, Chen C, Wang XY, Chen X, Shao Y.
Int J Ophthalmol. 2025 Jul 18;18(7):1205-1213. doi: 10.18240/ijo.2025.07.03. eCollection 2025.
PMID: 40688789
Free PMC article.
AIM: To assess the performance of five distinct large language models (LLMs; ChatGPT-3.5, ChatGPT-4, PaLM2, Claude 2, and SenseNova) in comparison to two human cohorts (a group of funduscopic disease experts and a group of ophthalmologists) on the specialized subject of …