Background: Skeletal muscle mass (SMM) and fat mass (FM) are essential to health and quality of life in older adults.
Objective: To build the best-performing SMM and FM prediction models by applying machine learning to socioeconomic, lifestyle, and biochemical parameters from the urban hospital-based Ansan/Ansung cohort, and to determine how SMM and FM relate to metabolic syndrome and its components in this cohort.
Methods: SMM and FM data measured with an Inbody 4.0 unit in 90% of Ansan/Ansung cohort participants were used to train seven machine learning algorithms. The ten most important predictors among 1411 variables were selected by (1) manually filtering out 48 variables, (2) generating the best models by random grid mode in a training set, and (3) comparing the accuracy of the models in a test set. The accuracy of the seven trained models was evaluated using mean squared error (MSE), mean absolute error (MAE), and R² values in the 10% test set. SMM and FM of the 31,025 participants in the Ansan/Ansung cohort were then predicted using the best prediction models (XGBoost for SMM and an artificial neural network for FM). Metabolic syndrome and its components were compared among four groups defined by the 50th percentiles of predicted SMM and FM values in the cohort.
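The three accuracy measures named above can be made concrete with a minimal sketch. It is an illustration under stated assumptions, not the authors' pipeline: the function name `regression_metrics` and the toy values are hypothetical, and R² is computed with the usual coefficient-of-determination formula (1 minus residual sum of squares over total sum of squares), which the abstract does not spell out.

```python
import math
from typing import Dict, Sequence


def regression_metrics(y_true: Sequence[float], y_pred: Sequence[float]) -> Dict[str, float]:
    """Return MSE, MAE, and R^2 for measured versus predicted values.

    Hypothetical helper for illustration; not taken from the study's code.
    """
    n = len(y_true)
    if n == 0 or n != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and of equal length")

    residuals = [t - p for t, p in zip(y_true, y_pred)]

    # Mean squared error: average of squared residuals.
    mse = sum(r * r for r in residuals) / n
    # Mean absolute error: average of absolute residuals.
    mae = sum(abs(r) for r in residuals) / n

    # R^2 = 1 - SS_res / SS_tot (assumed definition).
    mean_true = sum(y_true) / n
    ss_res = sum(r * r for r in residuals)
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else math.nan

    return {"mse": mse, "mae": mae, "r2": r2}


# Toy example with made-up values (not study data).
measured = [1.0, 2.0, 3.0, 4.0]
predicted = [1.0, 2.0, 3.0, 5.0]
metrics = regression_metrics(measured, predicted)
```

In a held-out evaluation such as the one described, `measured` would be the Inbody values for test-set participants and `predicted` the model outputs for the same participants.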
Results: The best prediction models for SMM and FM were constructed using the XGBoost (R² = 0.82) and artificial neural network (ANN; R² = 0.89) algorithms, respectively; both models had a low MSE. Serum platelet concentrations and glomerular filtration rate (GFR) were identified as new biomarkers of SMM, and serum platelet and bilirubin concentrations were found to predict FM. Predicted SMM and FM values were significantly and positively correlated with grip strength (r = 0.726) and BMI (r = 0.915, p < 0.05), respectively. Grip strength was significantly higher in the high-SMM groups than in the low-SMM groups in both genders (p < 0.05), and blood glucose and hemoglobin A1c were higher in the high-FM groups than in the low-FM groups in both genders (p < 0.05).
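The high and low SMM and FM groups compared here come from the 50th-percentile split described in the Methods. A minimal sketch of such a split follows. Several details are assumptions rather than facts from the abstract: the function name `median_split_groups` is hypothetical, values equal to the median are assigned to the high group, and a single cohort-wide cutoff is used, whereas the study may have used gender-specific cutoffs.

```python
from statistics import median
from typing import List, Sequence


def median_split_groups(smm: Sequence[float], fm: Sequence[float]) -> List[str]:
    """Label each participant by position relative to the SMM and FM medians.

    Hypothetical helper; ties at the median go to the 'high' group (assumption).
    """
    if len(smm) != len(fm) or len(smm) == 0:
        raise ValueError("smm and fm must be non-empty and of equal length")

    smm_cut = median(smm)  # 50th percentile of predicted SMM
    fm_cut = median(fm)    # 50th percentile of predicted FM

    labels = []
    for s, f in zip(smm, fm):
        smm_label = "high-SMM" if s >= smm_cut else "low-SMM"
        fm_label = "high-FM" if f >= fm_cut else "low-FM"
        labels.append(f"{smm_label}/{fm_label}")
    return labels


# Toy example with made-up values (not study data).
groups = median_split_groups([20.0, 25.0, 30.0, 35.0], [10.0, 20.0, 15.0, 25.0])
```

Crossing the two binary labels yields the four groups; outcomes such as grip strength or blood glucose can then be compared across them.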
Conclusion: The models generated by the XGBoost and ANN algorithms showed good accuracy for estimating SMM and FM, respectively. The prediction models are suited to clinical use because they require only a small number of features, all of which can be obtained in outpatients. SMM and FM predicted by the two models adequately represented the risk of low SMM and high FM in a clinical setting.
Keywords: C-reactive protein; fat mass; grip strength; machine learning; platelet; prediction model; skeletal muscle mass.