Objective: Presentation delay prevents cancer patients from receiving timely diagnosis and treatment, leading to poor prognosis. Predicting the risk of presentation delay is therefore crucial for improving treatment outcomes. This study aimed to develop and validate models that predict the risk of presentation delay in gastric cancer patients using several machine learning algorithms.
Methods: A total of 875 gastric cancer patients admitted to a tertiary oncology hospital from July 2023 to June 2024 formed the derivation cohort, and 200 gastric cancer patients admitted to four other tertiary hospitals formed the external validation cohort. After data collection, statistical analysis was performed to identify variables that discriminate between patients with and without presentation delay, and 13 statistically significant variables were selected to develop the machine learning models. The derivation cohort was randomly split into a training set and an internal validation set at a ratio of 7:3. Prediction models were developed with six machine learning algorithms: logistic regression (LR), support vector machine (SVM), random forest (RF), gradient boosted decision trees (GBDT), extreme gradient boosting (XGBoost) and multi-layer perceptron (MLP). Discrimination was assessed with accuracy, sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), F1 score and area under the curve (AUC); calibration was assessed with calibration curves and Brier scores. The best model was selected by comparing these metrics, and the impact of each feature on its predictions was analyzed with the permutation feature importance method.
Results: The incidence of presentation delay among gastric cancer patients was 39.3%. In the internal validation set, the developed models achieved an AUC of 0.893-0.925, accuracy of 0.817-0.847, sensitivity of 0.857-0.905, specificity of 0.783-0.854, PPV of 0.728-0.798, NPV of 0.897-0.927, F1 score of 0.791-0.826 and Brier score of 0.107-0.138, indicating good discrimination and calibration for predicting presentation delay in gastric cancer patients. The RF-based model was selected as the best model because it achieved good discrimination and calibration on both the internal and external validation sets. Feature ranking indicated that both subjective and objective factors have a significant impact on the occurrence of presentation delay in gastric cancer patients.
Conclusion: This study demonstrated that the RF-based model performs favorably in predicting presentation delay in gastric cancer patients. It can help medical staff identify gastric cancer patients at high risk of presentation delay and take appropriate, targeted interventions to reduce that risk.
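The threshold-based metrics and the Brier score reported above can be computed directly from true labels and predicted probabilities. The following is a minimal sketch, not the authors' code: the function names `classification_metrics` and `brier_score`, the 0.5 decision threshold, and the toy data are all assumptions made for illustration. Label 1 stands for "presentation delay".

```python
import math
from typing import Dict, Sequence


def _safe_div(num: float, den: float) -> float:
    """Return num / den, or NaN when the denominator is zero."""
    return num / den if den else math.nan


def classification_metrics(
    y_true: Sequence[int], y_prob: Sequence[float], threshold: float = 0.5
) -> Dict[str, float]:
    """Compute accuracy, sensitivity, specificity, PPV, NPV and F1.

    The threshold (0.5 by default) is an assumed cut-off for turning
    predicted probabilities into class labels.
    """
    y_pred = [1 if p >= threshold else 0 for p in y_prob]
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    sensitivity = _safe_div(tp, tp + fn)  # recall for the positive class
    ppv = _safe_div(tp, tp + fp)          # precision for the positive class
    return {
        "accuracy": _safe_div(tp + tn, tp + tn + fp + fn),
        "sensitivity": sensitivity,
        "specificity": _safe_div(tn, tn + fp),
        "ppv": ppv,
        "npv": _safe_div(tn, tn + fn),
        "f1": _safe_div(2 * ppv * sensitivity, ppv + sensitivity),
    }


def brier_score(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - t) ** 2 for t, p in zip(y_true, y_prob)) / len(y_true)


# Toy data, invented for illustration only (not taken from the study).
toy_labels = [1, 1, 1, 0, 0, 0, 0, 1]
toy_probs = [0.9, 0.8, 0.4, 0.2, 0.6, 0.1, 0.3, 0.7]
toy_metrics = classification_metrics(toy_labels, toy_probs)
toy_brier = brier_score(toy_labels, toy_probs)
```

A lower Brier score means the predicted probabilities lie closer to the observed outcomes, which is why it complements the threshold-dependent metrics as a calibration measure.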
Funding:
National Natural Science Foundation of China [72304060]
First author affiliations: [1] Chengdu Med Coll, Sch Nursing, Chengdu, Peoples R China; [2] Univ Elect Sci & Technol China, Sichuan Canc Hosp & Inst, Sichuan Clin Res Ctr Canc, Affiliated Canc Hosp, Sichuan Canc Ctr, Dept Gastr Surg, Chengdu, Peoples R China
Corresponding author:
Corresponding author affiliations: [1] Chengdu Med Coll, Sch Nursing, Chengdu, Peoples R China; [7] Univ Elect Sci & Technol China, Sichuan Canc Hosp & Inst, Sichuan Clin Res Ctr Canc, Affiliated Canc Hosp, Sichuan Canc Ctr, Nursing Dept, Chengdu, Peoples R China
Recommended citation (GB/T 7714):
Zhou Huali, Gu Qiong, Bao Rong, et al. Machine learning based models for predicting presentation delay risk among gastric cancer patients[J]. FRONTIERS IN ONCOLOGY. 2025, 14: doi:10.3389/fonc.2024.1503047.
APA:
Zhou, Huali, Gu, Qiong, Bao, Rong, Qiu, Liping, Zhang, Yuhan ... & Yang, Qing. (2025). Machine learning based models for predicting presentation delay risk among gastric cancer patients. FRONTIERS IN ONCOLOGY, 14,
MLA:
Zhou, Huali, et al. "Machine learning based models for predicting presentation delay risk among gastric cancer patients". FRONTIERS IN ONCOLOGY 14. (2025)