[A heart sound classification method based on joint decision of extreme gradient boosting and deep neural network]

Sheng Wu Yi Xue Gong Cheng Xue Za Zhi. 2021 Feb 25;38(1):10-20. doi: 10.7507/1001-5515.202006025.
[Article in Chinese]

Abstract

Heart sound is one of the signals commonly used to diagnose cardiovascular diseases. This paper studies the binary classification of heart sounds as normal or abnormal and proposes a classification algorithm based on the joint decision of extreme gradient boosting (XGBoost) and a deep neural network, improving both feature selection and model accuracy. First, the preprocessed heart sound recordings are segmented into four states, and five categories of features are extracted from the segmented signals. The first four categories are reduced by recursive feature elimination, and the selected features are used as the input of the XGBoost classifier. The fifth category, the Mel-frequency cepstral coefficients (MFCC), is used as the input of a long short-term memory (LSTM) network. Because the data set is imbalanced, both classifiers are modified with weighting. Finally, a heterogeneous ensemble decision method combines the two classifiers to produce the prediction. The algorithm was evaluated on the open heart sound database of the 2016 PhysioNet Computing in Cardiology (CINC) Challenge, measuring sensitivity, specificity, modified accuracy and F score; the results were 93%, 89.4%, 91.2% and 91.3%, respectively. Compared with results reported by other researchers using machine learning, convolutional neural networks (CNN) and other methods, accuracy and sensitivity are clearly improved. This indicates that the proposed method can effectively improve the accuracy of heart sound classification and has considerable potential for clinical auxiliary diagnosis of some cardiovascular diseases.
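
The final decision and evaluation steps can be illustrated with a minimal Python sketch. The abstract does not state the fusion rule, so the weighted average of the two classifiers' abnormal-class probabilities used here is an assumption, as are the function names, the weight `w_xgb` and the 0.5 threshold. Modified accuracy is taken as the mean of sensitivity and specificity, following the challenge's scoring convention; the official score also weights noisy recordings, which this sketch ignores.

```python
from typing import List, Sequence, Tuple


def fuse_probabilities(p_xgb: Sequence[float], p_lstm: Sequence[float],
                       w_xgb: float = 0.5, threshold: float = 0.5) -> List[int]:
    """Weighted soft vote over two abnormal-class probability lists.

    Hypothetical fusion rule for illustration only; the paper's actual
    heterogeneous ensemble decision is not specified in the abstract.
    Returns 1 for abnormal, 0 for normal.
    """
    if len(p_xgb) != len(p_lstm):
        raise ValueError("probability lists must have the same length")
    fused = [w_xgb * a + (1.0 - w_xgb) * b for a, b in zip(p_xgb, p_lstm)]
    return [1 if p >= threshold else 0 for p in fused]


def sensitivity_specificity_macc(y_true: Sequence[int],
                                 y_pred: Sequence[int]) -> Tuple[float, float, float]:
    """Return (sensitivity, specificity, modified accuracy) for 0/1 labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")
    se = tp / (tp + fn)  # fraction of abnormal recordings detected
    sp = tn / (tn + fp)  # fraction of normal recordings correctly passed
    return se, sp, (se + sp) / 2.0  # unweighted mean as modified accuracy


if __name__ == "__main__":
    # Toy probabilities, not data from the paper.
    y_true = [1, 1, 0, 0]
    y_pred = fuse_probabilities([0.9, 0.4, 0.2, 0.6], [0.7, 0.8, 0.1, 0.3])
    print(sensitivity_specificity_macc(y_true, y_pred))
```

Under this definition the reported figures are mutually consistent: the mean of 93% sensitivity and 89.4% specificity is 91.2%, the reported modified accuracy.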

Keywords: extreme gradient boosting; feature extraction; heart sound classification; long short-term memory network.

MeSH terms

  • Algorithms
  • Databases, Factual
  • Heart Sounds*
  • Neural Networks, Computer

Grants and funding

National Key Research and Development Program of China (SQ2018YFB130700)