Comparison of small-sample multi-class machine learning models for plasma concentration prediction of valproic acid
TITLE: | Comparison of small-sample multi-class machine learning models for plasma concentration prediction of valproic acid |
ABSTRACT: | OBJECTIVE To construct three-class (insufficient, normal, excessive) and two-class (insufficient, normal) models for predicting the plasma concentration of valproic acid (VPA), and to compare the performance of the two models, so as to provide a reference for formulating clinical medication regimens. METHODS Clinical data were collected from 480 patients who received VPA treatment and underwent plasma concentration testing at Xi’an International Medical Center Hospital between November 2022 and September 2024 (695 records in total). Predictive models were constructed separately for the three-class and two-class target variables. Features were ranked and selected using XGBoost feature importance scores. Twelve machine learning algorithms were trained and validated, and model performance was evaluated with three metrics: accuracy, F1 score, and area under the receiver operating characteristic curve (AUC). RESULTS In the three-class model, comorbid kidney disease and comorbid electrolyte disorders ranked high in XGBoost feature importance; in the two-class model, which omits the excessive class, the ranking of these features dropped markedly, suggesting that they are closely associated with excessive VPA plasma concentration. In the three-class model, Random Forest performed best, but its test-set F1 score reached only 0.7040 and its AUC only 0.5193; in the two-class model, CatBoost performed best, with a test-set F1 score of 0.7857 and an AUC of 0.8195. CONCLUSIONS The three-class model can predict excessive VPA plasma concentration, but its predictive performance and generalization ability are poor; the two-class model can only distinguish insufficient from normal plasma concentration, but its predictive performance is stronger. |
JOURNAL ISSUE: | 2025, Vol. 36, No. 11 |
AUTHORS: | CHEN Xi, YUAN Shen’ao, YUAN Hailing, ZHAO Jie, CHEN Peng, TIAN Chunyan, SU Yi, ZHANG Yunsong, ZHANG Yu |
KEYWORDS: | valproic acid; machine learning; plasma concentration prediction; small-sample dataset; model comparison |
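The abstract evaluates every model with accuracy, F1 score, and AUC. As a reading aid, the following is a minimal pure-Python sketch of how these three metrics can be computed for a two-class problem. It is not the authors' code: the function names, the 0.5 decision threshold, and the toy labels and scores are all hypothetical, and the abstract does not state how F1 or AUC were averaged in the three-class case, so only the binary case is shown.

```python
from typing import Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of samples whose predicted class equals the true class."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_binary(y_true: Sequence[int], y_pred: Sequence[int], positive: int = 1) -> float:
    """Harmonic mean of precision and recall for the chosen positive class."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    if tp == 0:
        return 0.0  # no true positives: define F1 as 0 to avoid dividing by zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def roc_auc(y_true: Sequence[int], scores: Sequence[float], positive: int = 1) -> float:
    """AUC as the probability that a random positive sample receives a higher
    score than a random negative sample (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data, not taken from the study:
# 0 = insufficient concentration, 1 = normal concentration.
y_true = [0, 0, 1, 1, 1, 0]
scores = [0.2, 0.6, 0.7, 0.4, 0.9, 0.1]      # made-up predicted probability of class 1
y_pred = [int(s >= 0.5) for s in scores]      # assumed threshold of 0.5

print(f"accuracy = {accuracy(y_true, y_pred):.4f}")
print(f"F1       = {f1_binary(y_true, y_pred):.4f}")
print(f"AUC      = {roc_auc(y_true, scores):.4f}")
```

Under the pairwise definition used in `roc_auc`, a model that ranks samples at random scores about 0.5. This gives context for the abstract's results: the three-class AUC of 0.5193 is close to that chance level, whereas the two-class AUC of 0.8195 is well above it.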