Background and objective: Gene expression analysis plays a critical role in lung cancer research, offering molecular feature-based diagnostic insights that are particularly effective in distinguishing lung cancer subtypes. However, the high dimensionality and inherent imbalance of gene expression data create significant challenges for accurate diagnosis. This study aims to address these challenges by proposing an innovative deep learning-based method for predicting lung cancer subtypes.
Methods: We propose a method called Exo-LCClassifier, which integrates feature selection, one-dimensional convolutional neural networks (1D CNN), and an improved Wasserstein Generative Adversarial Network (WGAN). First, differential gene expression analysis was performed using DESeq2 to identify significantly expressed genes from both normal and tumor tissues. Next, the enhanced WGAN was applied to augment the dataset, addressing the issue of sample imbalance and increasing the diversity of effective samples. Finally, a 1D CNN was used to classify the balanced dataset, thereby improving the model's diagnostic accuracy.
Results: The proposed method was evaluated using five-fold cross-validation, achieving an average accuracy of 0.9766 ± 0.0070, precision of 0.9762 ± 0.0101, recall of 0.9827 ± 0.0050, and F1-score of 0.9793 ± 0.0068. On an external GEO lung cancer dataset, it also showed strong performance with an accuracy of 0.9588, precision of 0.9558, recall of 0.9678, and F1-score of 0.9616.
Conclusion: This study addresses the critical challenge of imbalanced learning in lung cancer gene expression analysis through an innovative computational framework. Our solution integrates three advanced techniques: (1) DESeq2 for differential expression analysis, (2) WGAN for data augmentation, and (3) 1D CNN for feature learning and classification. The source codes are publicly available at: https://github.com/lanlinxxs/Exo-classifier.
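The evaluation protocol named in the Results (five-fold cross-validation reported as mean ± standard deviation of accuracy, precision, recall, and F1-score) can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the function names `stratified_folds`, `binary_metrics`, and `summarize` are hypothetical, the labels are assumed to be binary, and the spread is assumed to be the sample standard deviation, none of which the abstract confirms.

```python
import random
from statistics import mean, stdev


def stratified_folds(labels, k=5, seed=0):
    """Assign sample indices to k folds with similar class proportions."""
    rng = random.Random(seed)
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal indices round-robin so each fold gets a near-equal share of this class.
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)
    return folds


def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall, and F1 for one fold (binary labels assumed)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


def summarize(per_fold):
    """Return {metric: (mean, sample standard deviation)} across folds."""
    return {
        name: (mean(m[name] for m in per_fold), stdev(m[name] for m in per_fold))
        for name in per_fold[0]
    }
```

In use, each fold would serve once as the held-out test set while a classifier is trained on the remaining four; the per-fold dictionaries from `binary_metrics` are then passed to `summarize`. The classifier itself (the 1D CNN) is deliberately left out of this sketch.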
Funding:
Noncommunicable Chronic Diseases-National Science and Technology Major Project [2023ZD0506101/2023ZD0506100]; Medico-Engineering Cooperation Funds from University of Electronic Science and Technology of China [ZYGX2021YGLH211]; Sichuan Science and Technology Program [2022YFG0176]; National Science and Technology Innovation Fund 2030 [SQ2023AAA031457]
First author affiliations: [1] Univ Elect Sci & Technol China, Inst Intelligent Comp, Chengdu, Sichuan, Peoples R China; [2] Trusted Cloud Comp & Big Data Key Lab Sichuan Prov, Chengdu, Sichuan, Peoples R China
Corresponding author:
Recommended citation (GB/T 7714):
Zhan Siyu, Yu Hao, Liu Shuang, et al. High-precision lung cancer subtype diagnosis on imbalanced exosomal data via Exo-LCClassifier[J]. FRONTIERS IN GENETICS, 2025, 16: doi:10.3389/fgene.2025.1583081.
APA:
Zhan, Siyu, Yu, Hao, Liu, Shuang, Qin, Ke & Guo, Lu. (2025). High-precision lung cancer subtype diagnosis on imbalanced exosomal data via Exo-LCClassifier. FRONTIERS IN GENETICS, 16,
MLA:
Zhan, Siyu, et al. "High-precision lung cancer subtype diagnosis on imbalanced exosomal data via Exo-LCClassifier". FRONTIERS IN GENETICS 16 (2025)