Proof of concept for voice based MRI scanner control using large language models in real time guided interventions

Sci Rep. 2025 Aug 25;15(1):31206. doi: 10.1038/s41598-025-11290-6.

Abstract

In clinical MRI-guided interventions, the lack of high-quality peripheral equipment and specialized interventional MRI systems often necessitates delegating real-time control of MRI scanners to an assistant. We proposed a voice-based interaction system powered by large language models that enabled hands-free natural language control of MRI scanners. The system leveraged multi-agent collaboration driven by large language models to execute scanner functionalities, including sequence execution, parameter adjustment, and scanner table positioning. In 90 hands-free tests for 18 predefined tasks performed within a real MRI scanning room, the system achieved an overall task completion rate of 93.3% (95% CI 86.2-96.9%). On a consumer laptop without GPU support, the response time for control commands was approximately 5-10.5 s. Our study demonstrates the feasibility of using large language models for voice-based interaction with MRI scanners during interventions, eliminating the need for additional assistants and allowing human-like communication.
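
The reported completion rate and its confidence interval can be checked with a short calculation. The sketch below rests on two assumptions that the abstract does not state. First, 93.3% of 90 tests corresponds to 84 completed tests, since 84 is the only integer count that rounds to that percentage. Second, the interval is a Wilson score interval; this choice is a guess that happens to reproduce the reported bounds, not a confirmed description of the authors' method. The function name `wilson_interval` is illustrative.

```python
import math


def wilson_interval(successes: int, trials: int, z: float = 1.959964) -> tuple[float, float]:
    """Return the Wilson score interval for a binomial proportion.

    z defaults to the two-sided 95% normal quantile.
    """
    if trials <= 0:
        raise ValueError("trials must be positive")
    if not 0 <= successes <= trials:
        raise ValueError("successes must lie between 0 and trials")

    p_hat = successes / trials
    z2 = z * z
    denom = 1 + z2 / trials
    center = (p_hat + z2 / (2 * trials)) / denom
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / trials + z2 / (4 * trials**2)) / denom
    return center - half_width, center + half_width


# Assumed count: 84 of 90 tests completed (inferred from the reported 93.3%).
low, high = wilson_interval(84, 90)
print(f"{84 / 90:.1%} (95% CI {low:.1%}-{high:.1%})")
# prints "93.3% (95% CI 86.2%-96.9%)"
```

Under these assumptions the output agrees with the figures in the abstract. A normal-approximation (Wald) interval for the same counts would give a different, symmetric range, which is why the Wilson form is the one assumed here.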

MeSH terms

  • Humans
  • Language
  • Large Language Models
  • Magnetic Resonance Imaging* / instrumentation
  • Magnetic Resonance Imaging* / methods
  • Proof of Concept Study
  • Voice*