
Chinese Journal of Obstetrics & Gynecology and Pediatrics (Electronic Edition), 2024, Vol. 20, Issue 01: 105-113. doi: 10.3877/cma.j.issn.1673-5250.2024.01.014

Original Article

妊娠期糖尿病早孕期相关影响因素及基于早孕期孕妇糖脂相关生化指标与人口学资料的4种机器学习算法构建妊娠期糖尿病预测模型的临床价值
李莉1, 马梅2, 黄欣欣3, 杨丹林1, 潘勉1
    1. 福建省妇幼保健院·福建医科大学妇儿临床医学院产科,福州 350001
    2. 福建省妇幼保健院·福建医科大学妇儿临床医学院检验科,福州 350001
    3. 福建省妇幼保健院·福建医科大学妇儿临床医学院保健部,福州 350001
  • 收稿日期:2023-11-11 修回日期:2024-01-06 出版日期:2024-02-01
  • 通信作者: 潘勉

Analysis of early pregnancy-related influencing factors of gestational diabetes mellitus, and clinical value of building gestational diabetes mellitus prediction model based on four machine learning algorithms of glycolipid-related biochemical indexes and demographic information of pregnant women in early pregnancy

Li Li1, Mei Ma2, Xinxin Huang3, Danlin Yang1, Mian Pan1

    1. Department of Obstetrics, Fujian Maternity and Child Health Hospital, College of Clinical Medicine for Obstetrics & Gynecology and Pediatrics, Fujian Medical University, Fuzhou 350001, Fujian Province, China
    2. Department of Laboratory Medicine, Fujian Maternity and Child Health Hospital, College of Clinical Medicine for Obstetrics & Gynecology and Pediatrics, Fujian Medical University, Fuzhou 350001, Fujian Province, China
    3. Department of Healthcare, Fujian Maternity and Child Health Hospital, College of Clinical Medicine for Obstetrics & Gynecology and Pediatrics, Fujian Medical University, Fuzhou 350001, Fujian Province, China
  • Received:2023-11-11 Revised:2024-01-06 Published:2024-02-01
  • Corresponding author: Mian Pan
  • Supported by:
    Natural Science Foundation of Department of Science and Technology in Fujian Province(2021J01406)
Cite this article:

李莉, 马梅, 黄欣欣, 杨丹林, 潘勉. 妊娠期糖尿病早孕期相关影响因素及基于早孕期孕妇糖脂相关生化指标与人口学资料的4种机器学习算法构建妊娠期糖尿病预测模型的临床价值[J]. 中华妇幼临床医学杂志(电子版), 2024, 20(01): 105-113.

Li Li, Mei Ma, Xinxin Huang, Danlin Yang, Mian Pan. Analysis of early pregnancy-related influencing factors of gestational diabetes mellitus, and clinical value of building gestational diabetes mellitus prediction model based on four machine learning algorithms of glycolipid-related biochemical indexes and demographic information of pregnant women in early pregnancy[J]. Chinese Journal of Obstetrics & Gynecology and Pediatrics(Electronic Edition), 2024, 20(01): 105-113.

目的

探讨妊娠期糖尿病(GDM)的早孕期相关影响因素,以及基于早孕期孕妇糖脂相关生化指标及人口学资料,采用4种机器学习算法构建GDM预测模型的临床价值。

方法

选择2021年12月至2022年12月在福建省妇幼保健院首次进行产前检查的6 257例孕龄为10~13+6孕周孕妇为研究对象。采取回顾性分析法,根据孕妇24~27+6孕周(中孕期)时是否被诊断为GDM,将其分为GDM组(n=1 499,GDM孕妇)和非GDM组(n=4 758,非GDM孕妇)。采用多因素非条件logistic回归分析法,对孕妇发生GDM的早孕期相关影响因素进行分析。基于早孕期孕妇糖脂相关生化指标和人口学资料(8个变量),采用决策树(DT)、逻辑回归(LR)、随机森林(RF)及极致梯度提升(XGB) 4种机器学习算法构建GDM预测模型,并且采用十折交叉验证,评估4种模型的GDM预测性能;并对4种算法构建GDM预测模型的受试者工作特征(ROC)曲线的曲线下面积(AUC)进行比较。本研究经福建省妇幼保健院伦理委员会批准(审批文号:2021KRD018)。所有孕妇签署临床研究知情同意书。

结果

①多因素非条件logistic回归分析结果显示,孕妇高龄(分娩年龄≥35岁)(OR=1.95,95%CI:1.70~2.24,P<0.001),孕前人体质量指数(BMI)≥18.5~24.0 kg/m2(OR=1.32,95%CI:1.11~1.58,P=0.002),孕前BMI≥24.0~28.0 kg/m2(OR=2.17,95%CI:1.73~2.73,P<0.001),孕前BMI≥28.0 kg/m2(OR=2.53,95%CI:1.70~3.78,P<0.001),早孕期血清载脂蛋白(Apo)B水平升高(OR=3.06,95%CI:2.14~4.37,P<0.001),早孕期血清空腹血糖(FPG)浓度增加(OR=2.08,95%CI:1.79~2.41,P<0.001),均为孕妇发生GDM的早孕期相关独立危险因素。②根据4种分类器的分类结果中特征值大小,采用孕妇年龄、受教育程度及孕前BMI,早孕期血清总胆固醇(TC)、甘油三酯(TG)、ApoA1、ApoB及FPG水平8个变量进行GDM预测模型构建的结果显示:DT、LR、RF、XGB 4种算法建立的GDM预测模型的AUC分别为0.645(95%CI:0.591~0.698)、0.699(95%CI:0.641~0.749)、0.672(95%CI:0.621~0.772)、0.597(95%CI:0.553~0.663),LR算法的AUC大于XGB算法,并且差异有统计学意义(Z=2.38、P=0.017),其余各算法的AUC分别两两比较,差异均无统计学意义(P>0.05)。③十折交叉验证结果显示,DT、LR、RF、XGB 4种算法构建GDM预测模型的平均AUC分别为0.586±0.025、0.661±0.020、0.632±0.023、0.576±0.019。

结论

基于早孕期糖脂相关生化指标及人口学资料,采用LR和RF算法构建GDM预测模型,对GDM具有一定预测价值,有助于GDM高危人群早期筛查,必要时对其进行临床干预,可降低母儿GDM相关不良妊娠结局。

Objective

To investigate early pregnancy-related factors influencing gestational diabetes mellitus (GDM), and the clinical value of GDM prediction models built with four machine learning algorithms from glucose- and lipid-related biochemical indices in early pregnancy and demographic information.

Methods

This retrospective study included 6 257 pregnant women at 10 to 13+6 gestational weeks who had their first prenatal examination at Fujian Maternity and Child Health Hospital from December 2021 to December 2022. According to whether they were diagnosed with GDM at 24 to 27+6 gestational weeks (second trimester), the women were divided into a GDM group (n=1 499) and a non-GDM group (n=4 758). Early pregnancy-related factors influencing the development of GDM were analyzed by multivariate unconditional logistic regression. Based on glucose- and lipid-related biochemical indices in early pregnancy and demographic information (8 variables), four machine learning algorithms, namely decision tree (DT), logistic regression (LR), random forest (RF), and extreme gradient boosting (XGB), were used to build GDM prediction models. Ten-fold cross-validation was used to assess the performance of each model, and the areas under the receiver operating characteristic (ROC) curve (AUC) of the four models were compared. The study was approved by the Ethics Committee of Fujian Maternity and Child Health Hospital (Approval No. 2021KRD018). All pregnant women signed informed consent forms for clinical research.
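The evaluation protocol described above (ten-fold cross-validation scored by AUC) can be illustrated with a minimal, standard-library-only sketch. Everything in it is an assumption made for illustration: the data are synthetic (about 24% positives, similar to the 1 499/6 257 split), a single hypothetical predictor stands in for the 8 variables, and a hand-written one-variable logistic regression stands in for the study's models. It does not reproduce the authors' pipeline or results, and it assumes the "mean ± value" reporting refers to the standard deviation across folds.

```python
import math
import random
import statistics


def auc(scores, labels):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def fit_logistic(xs, ys, lr=0.5, epochs=200):
    """One-predictor logistic regression fitted by batch gradient descent."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in zip(xs, ys):
            p = 1.0 / (1.0 + math.exp(-(w * x + b)))
            grad_w += (p - y) * x
            grad_b += p - y
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w, b


def stratified_folds(labels, k, rng):
    """Split indices into k folds, dealing each class out round-robin after shuffling."""
    folds = [[] for _ in range(k)]
    for cls in (0, 1):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        for j, i in enumerate(idx):
            folds[j % k].append(i)
    return folds


# Synthetic cohort (assumption): ~24% positives, one predictor shifted upward in positives.
rng = random.Random(0)
labels = [1 if rng.random() < 0.24 else 0 for _ in range(600)]
xs = [rng.gauss(0.8 if y == 1 else 0.0, 1.0) for y in labels]

fold_aucs = []
for test_idx in stratified_folds(labels, 10, rng):
    held_out = set(test_idx)
    train_idx = [i for i in range(len(labels)) if i not in held_out]
    # Refit on the nine training folds, then score only the held-out fold.
    w, b = fit_logistic([xs[i] for i in train_idx], [labels[i] for i in train_idx])
    scores = [w * xs[i] + b for i in test_idx]
    fold_aucs.append(auc(scores, [labels[i] for i in test_idx]))

print(f"mean AUC = {statistics.mean(fold_aucs):.3f} +/- {statistics.stdev(fold_aucs):.3f}")
```

The key point of the design is that the model is refitted inside every fold, so each held-out AUC reflects data the model did not see during fitting.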

Results

① Multivariate unconditional logistic regression analysis showed that advanced maternal age (age at delivery ≥35 years) (OR=1.95, 95%CI: 1.70-2.24, P<0.001), pre-pregnancy body mass index (BMI) of 18.5 to <24.0 kg/m² (OR=1.32, 95%CI: 1.11-1.58, P=0.002), pre-pregnancy BMI of 24.0 to <28.0 kg/m² (OR=2.17, 95%CI: 1.73-2.73, P<0.001), pre-pregnancy BMI ≥28.0 kg/m² (OR=2.53, 95%CI: 1.70-3.78, P<0.001), elevated serum apolipoprotein (Apo) B level in early pregnancy (OR=3.06, 95%CI: 2.14-4.37, P<0.001), and increased fasting plasma glucose (FPG) concentration in early pregnancy (OR=2.08, 95%CI: 1.79-2.41, P<0.001) were all independent early pregnancy-related risk factors for GDM. ② Based on the feature importance values from the 4 classifiers, 8 variables were used to build the GDM prediction models: maternal age, education level, pre-pregnancy BMI, and early-pregnancy serum levels of total cholesterol (TC), triglyceride (TG), ApoA1, ApoB, and FPG. The AUCs of the models built with DT, LR, RF, and XGB were 0.645 (95%CI: 0.591-0.698), 0.699 (95%CI: 0.641-0.749), 0.672 (95%CI: 0.621-0.772), and 0.597 (95%CI: 0.553-0.663), respectively. The AUC of the LR model was greater than that of the XGB model, and the difference was statistically significant (Z=2.38, P=0.017); no other pairwise comparison of AUCs was statistically significant (P>0.05). ③ Ten-fold cross-validation showed that the mean AUCs of the models built with DT, LR, RF, and XGB were 0.586±0.025, 0.661±0.020, 0.632±0.023, and 0.576±0.019, respectively.
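To make the reported odds ratios easier to read, the following sketch shows how an OR and its 95% CI relate to a logistic regression coefficient. It assumes the intervals are Wald intervals on the log-odds scale (OR = exp(β), CI = exp(β ± 1.96·SE)), which the abstract does not state; the function names are hypothetical. It back-calculates β and SE from the values reported for advanced maternal age and checks that they round-trip.

```python
import math


def log_odds_summary(odds_ratio, ci_low, ci_high, z=1.96):
    """Recover the coefficient and its standard error from an OR and its Wald CI."""
    beta = math.log(odds_ratio)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z)
    return beta, se


def odds_ratio_ci(beta, se, z=1.96):
    """Convert a coefficient and standard error back to an OR with its Wald CI."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


# Values reported for advanced maternal age: OR=1.95, 95%CI 1.70-2.24.
beta, se = log_odds_summary(1.95, 1.70, 2.24)
or_hat, low, high = odds_ratio_ci(beta, se)
wald_z = beta / se  # a large |z| corresponds to a small P value

print(f"beta={beta:.3f}, SE={se:.3f}, Wald z={wald_z:.2f}")
print(f"OR={or_hat:.2f}, 95%CI: {low:.2f}-{high:.2f}")
```

Under this assumption the reported interval is nearly symmetric around the OR on the log scale, which is what a Wald interval would produce.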

Conclusions

GDM prediction models built with the LR and RF algorithms from glucose- and lipid-related biochemical indices in early pregnancy and demographic data have some predictive value for GDM. They can help screen women at high risk of GDM at an early stage and, when necessary, guide clinical intervention, which may reduce GDM-related adverse pregnancy outcomes for mother and fetus.

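Table 1 Comparison of general clinical data and glucose- and lipid-related biochemical indices in early pregnancy between the 2 groups
Table 2 Multivariate unconditional logistic regression analysis of factors influencing the development of GDM in pregnant women
Figure 1 Comparison of ROC-AUC of the GDM prediction models built with the 4 machine learning algorithms. Note: GDM, gestational diabetes mellitus; ROC curve, receiver operating characteristic curve; AUC, area under the curve; DT, decision tree; LR, logistic regression; RF, random forest; XGB, extreme gradient boosting
Table 3 Comparison of 4 performance evaluation metrics of the GDM prediction models built with the 4 machine learning algorithms
Figure 2 Line chart of ten-fold cross-validation AUC results for the 4 machine learning algorithms predicting GDM in the training set. Note: GDM, gestational diabetes mellitus; AUC, area under the curve; DT, decision tree; LR, logistic regression; RF, random forest; XGB, extreme gradient boosting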