高级检索
当前位置: 首页 > 详情页

A Comparative Study of Five Large Language Models' Response for Liver Cancer Comprehensive Treatment

文献详情

资源类型:
WOS体系:
Pubmed体系:

收录情况: ◇ SCIE

机构: [1]Department of Liver Transplantation Center and HBP Surgery, Sichuan Clinical Research Center for Cancer, Sichuan Cancer Hospital & Institute, Sichuan Cancer Center, School of Medicine, University of Electronic Science and Technology of China, Chengdu, People’s Republic of China
出处:
ISSN:

关键词: large language models liver cancer clinical applicability ChatGPT medical chatbot

摘要:
Large language models (LLMs) are increasingly used in healthcare, yet their reliability in specialized clinical fields remains uncertain. Liver cancer, as a complex and high-burden disease, poses unique challenges for AI-based tools. This study aimed to evaluate the comprehensibility and clinical applicability of five mainstream LLMs in addressing liver cancer-related clinical questions.We developed 90 standardized questions covering multiple aspects of liver cancer management. Five LLMs-GPT-4, Gemini, Copilot, Kimi, and Ernie Bot-were evaluated in a blinded fashion by three independent hepatobiliary experts. Responses were scored using predefined criteria for comprehensibility and clinical applicability. Overall group comparisons were conducted using the Fisher-Freeman-Halton test (for categorical data) and the Kruskal-Wallis test (for ordinal scores), followed by Dunn's post-hoc test or Fisher's exact test with Bonferroni correction. Inter-rater reliability was assessed using Fleiss' kappa.Kimi and GPT-4 achieved the highest proportions of fully applicable responses (68% and 62%, respectively), while Ernie Bot and Copilot showed the lowest. Comprehensibility was generally high, with Kimi and Ernie Bot scoring over 98%. However, none of the LLMs consistently provided guideline-concordant answers to all questions. Performance on professional-level questions was significantly lower than on common-sense ones, highlighting deficiencies in complex clinical reasoning.LLMs demonstrate varied performance in liver cancer-related queries. While GPT-4 and Kimi show promise in clinical applicability, limitations in accuracy and consistency-particularly for complex medical decisions-underscore the need for domain-specific optimization before clinical integration.Not applicable.© 2025 Zhong et al.

语种:
WOS:
PubmedID:
中科院(CAS)分区:
出版当年[2025]版:
大类 | 3 区 医学
小类 | 3 区 肿瘤学
最新[2025]版:
大类 | 3 区 医学
小类 | 3 区 肿瘤学
JCR分区:
出版当年[2024]版:
Q2 ONCOLOGY
最新[2024]版:
Q2 ONCOLOGY

影响因子: 最新[2024版] 最新五年平均 出版当年[2024版] 出版当年五年平均 出版前一年[2024版]

第一作者:
第一作者机构: [1]Department of Liver Transplantation Center and HBP Surgery, Sichuan Clinical Research Center for Cancer, Sichuan Cancer Hospital & Institute, Sichuan Cancer Center, School of Medicine, University of Electronic Science and Technology of China, Chengdu, People’s Republic of China
通讯作者:
推荐引用方式(GB/T 7714):
APA:
MLA:

资源点击量:65780 今日访问量:0 总访问量:5151 更新日期:2025-12-01 建议使用谷歌、火狐浏览器 常见问题

版权所有©2020 四川省肿瘤医院 技术支持:重庆聚合科技有限公司 地址:成都市人民南路四段55号