Data-Centric Machine Learning in Nursing: A Concept Clarification

Comput Inform Nurs. 2024 Jan 19. doi: 10.1097/CIN.0000000000001102. Online ahead of print.


The ubiquity of electronic health records and health information exchanges has generated abundant administrative and clinical healthcare data. The sheer volume of these data presents an opportunity for emerging technologies (eg, artificial intelligence and machine learning) to assist clinicians and healthcare administrators with decision-making, predictive analytics, and more. Multiple studies have cited various applications for artificial intelligence and machine learning in nursing. What remains largely unknown in the nursing discipline, however, is that although more than 90% of machine-learning implementations use a model-centric strategy, a fundamental change is occurring: because of the limitations of that approach, the industry is beginning to pivot toward data-centric artificial intelligence. Nurses should be aware of the differences between the two approaches, including how each affects their engagement in designing technologies that emulate human intelligence and their use of data, especially electronic health record data. Using the Norris Concept Clarification method, this article elucidates the concept of data-centric machine learning for nursing. It does so by (1) exploring the concept's origins in the data and computer science disciplines; (2) differentiating data-centric from model-centric machine learning approaches, including an introduction to the machine-learning operations life cycle and process; and (3) explaining the advantages of the data-centric approach, especially for nurses' engagement in technological design and proper data usage.