Prediction of the development of islet autoantibodies through integration of environmental, genetic, and metabolic markers

J Diabetes. 2021 Feb;13(2):143-153. doi: 10.1111/1753-0407.13093. Epub 2020 Aug 16.

Abstract

Background: The Environmental Determinants of Diabetes in the Young (TEDDY) study has prospectively followed, from birth, children at increased genetic risk of type 1 diabetes. TEDDY has collected heterogeneous data longitudinally to gain insights into the environmental and biological mechanisms driving the progression to persistent islet autoantibodies.

Methods: We developed a machine learning model to predict imminent transition to persistent islet autoantibodies based on time-varying metabolomics data integrated with time-invariant risk factors (eg, gestational age). Model development began with 221 candidate features (85 genetic, 5 environmental, 131 metabolomic), and an ensemble-based feature evaluation was used to identify a small set of predictive features that can be interrogated to better understand the pathogenesis leading up to persistent islet autoimmunity.
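The abstract does not specify how the ensemble-based feature evaluation was implemented. As a rough illustration of the general idea only, the sketch below scores features on many bootstrap resamples and keeps those that rank highly in most of them. The scoring rule (standardized difference in means), the function names, the thresholds, and the synthetic data are all assumptions made for this example; none of it is the authors' method or TEDDY data.

```python
import random
from collections import Counter
from statistics import mean, pstdev


def univariate_score(values, labels):
    """Absolute difference in class means, scaled by the overall spread."""
    cases = [v for v, y in zip(values, labels) if y == 1]
    controls = [v for v, y in zip(values, labels) if y == 0]
    if not cases or not controls:
        return 0.0
    spread = pstdev(values)
    if spread == 0:
        return 0.0
    return abs(mean(cases) - mean(controls)) / spread


def ensemble_feature_selection(X, y, names, n_rounds=200, top_k=2, seed=0):
    """Return, per feature, the fraction of bootstrap rounds in which it ranked in the top_k."""
    rng = random.Random(seed)
    n = len(y)
    votes = Counter()
    for _ in range(n_rounds):
        idx = [rng.randrange(n) for _ in range(n)]  # bootstrap sample of rows
        y_boot = [y[i] for i in idx]
        scores = {
            name: univariate_score([X[i][j] for i in idx], y_boot)
            for j, name in enumerate(names)
        }
        for name in sorted(scores, key=scores.get, reverse=True)[:top_k]:
            votes[name] += 1
    return {name: votes[name] / n_rounds for name in names}


# Synthetic toy data (hypothetical): two informative features and two pure-noise features.
data_rng = random.Random(42)
names = ["informative_1", "noise_1", "informative_2", "noise_2"]
y = [i % 2 for i in range(200)]
X = [
    [
        data_rng.gauss(1.0 * label, 1.0),  # mean shifts with the label
        data_rng.gauss(0.0, 1.0),          # unrelated to the label
        data_rng.gauss(1.5 * label, 1.0),  # mean shifts with the label
        data_rng.gauss(0.0, 1.0),          # unrelated to the label
    ]
    for label in y
]

frequencies = ensemble_feature_selection(X, y, names)
# Keep features chosen in at least 80% of rounds (an arbitrary illustrative cutoff).
selected = [name for name, freq in frequencies.items() if freq >= 0.8]
```

Aggregating over resamples favors features whose signal is stable rather than features that look strong in a single split of the data, which matters when the candidate set (here 221 features) is large relative to the number of cases.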

Results: The final integrative machine learning model included 42 disparate features, returning a cross-validated receiver operating characteristic area under the curve (AUC) of 0.74 and an AUC of ~0.65 on an independent validation dataset. The model identified a principal set of 20 time-invariant markers, including 18 genetic markers (16 single nucleotide polymorphisms [SNPs] and two HLA-DR genotypes) and two demographic markers (gestational age and exposure to a prebiotic formula). Integration with the metabolome identified 22 supplemental metabolites and lipids, including adipic acid and ceramide d42:0, that predicted development of islet autoantibodies.
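For readers less familiar with the metric, the AUC values reported above can be read as the probability that the model assigns a higher risk score to a randomly chosen child who develops persistent islet autoantibodies than to a randomly chosen child who does not; 0.5 corresponds to chance and 1.0 to perfect ranking. The sketch below computes AUC from that pairwise definition. The labels and scores are made-up toy values, not study data, and `roc_auc` is a hypothetical helper written for this example.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of case/control pairs ranked correctly (ties count as half)."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for case_score in cases:
        for control_score in controls:
            if case_score > control_score:
                wins += 1.0
            elif case_score == control_score:
                wins += 0.5
    return wins / (len(cases) * len(controls))


# Toy example (hypothetical): 3 cases and 4 controls with predicted risk scores.
labels = [1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.6, 0.4, 0.7, 0.5, 0.3, 0.2]
auc = roc_auc(labels, scores)  # 9 of 12 pairs ranked correctly
```

The gap between a cross-validated AUC and an AUC on an independent dataset, as reported here, is a common pattern: cross-validation reuses the development cohort, whereas the independent set tests how well the ranking transfers to new subjects.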

Conclusions: The majority (86%) of metabolites that predicted development of islet autoantibodies belonged to three pathways: lipid oxidation, phospholipase A2 signaling, and pentose phosphate, suggesting that these metabolic processes may play a role in triggering islet autoimmunity.


Keywords: autoimmunity; genetics; machine learning; metabolomics.

MeSH terms

  • Autoantibodies*
  • Autoimmunity / genetics
  • Autoimmunity / immunology*
  • Child, Preschool
  • Diabetes Mellitus, Type 1 / genetics
  • Diabetes Mellitus, Type 1 / immunology*
  • Female
  • Genetic Predisposition to Disease*
  • Genotype
  • Gestational Age
  • Humans
  • Infant
  • Islets of Langerhans / immunology*
  • Male
  • Polymorphism, Single Nucleotide
  • Prospective Studies
  • Risk Factors

Substances

  • Autoantibodies