In clinical MRI-guided interventions, the lack of high-quality peripheral equipment and specialized interventional MRI systems often necessitates delegating real-time control of the MRI scanner to an assistant. We propose a voice-based interaction system, powered by large language models (LLMs), that enables hands-free natural-language control of MRI scanners. The system uses LLM-driven multi-agent collaboration to execute scanner functions, including sequence execution, parameter adjustment, and scanner table positioning. In 90 hands-free tests covering 18 predefined tasks, performed in a real MRI scanning room, the system achieved an overall task completion rate of 93.3% (95% CI 86.2-96.9%). On a consumer laptop without GPU support, the response time for control commands was approximately 5-10.5 s. Our study demonstrates the feasibility of using LLMs for voice-based interaction with MRI scanners during interventions, eliminating the need for additional assistants and allowing human-like communication.
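The reported completion rate and confidence interval can be checked with a short calculation. The sketch below rests on two assumptions that the abstract does not state: that 93.3% of 90 tests corresponds to 84 successful tests (the only whole number that rounds to 93.3%), and that the interval is a Wilson score interval. The function name `wilson_interval` is illustrative only. Under these assumptions the calculation reproduces the reported bounds, which suggests, but does not confirm, that this is the method the authors used.

```python
import math


def wilson_interval(successes: int, trials: int, z: float = 1.959964) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (z = 1.959964 gives 95%)."""
    p = successes / trials
    denom = 1 + z**2 / trials
    center = (p + z**2 / (2 * trials)) / denom
    half_width = z * math.sqrt(p * (1 - p) / trials + z**2 / (4 * trials**2)) / denom
    return center - half_width, center + half_width


# Assumption: 84 of 90 tests succeeded (84/90 = 93.3%).
successes, trials = 84, 90
low, high = wilson_interval(successes, trials)
print(f"{successes / trials:.1%} (95% CI {low:.1%}-{high:.1%})")
# prints: 93.3% (95% CI 86.2%-96.9%)
```

The Wilson interval is a common choice for proportions close to 0 or 1 with moderate sample sizes, because its bounds stay inside the range from 0 to 1, which the simple normal approximation does not guarantee.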
© 2025. The Author(s).