Health Analytics
Diagnosis
Predictive Model
Clinical Attributes
Other: Computation Methods for Health
Lung Cancer Prediction using Electronic Claims Records: A Transformer-based Approach
Abstract
Electronic claims records (ECRs) are large-scale, longitudinal collections of individuals' medical service-seeking actions. Compared to in-hospital electronic medical records (EMRs), ECRs are more standardized and span multiple sites. Recently, studies have shown promising results on modeling claims data for a wide range of medical applications. However, few of them address the exclusion criteria used in cohort selection to extract new incidence without prior signs, and they often lack emphasis on predicting cancer in its early stages. In this work, we design a lung cancer prediction framework on ECRs that combines a rigorous exclusion design with a state-of-the-art sequence-based transformer. Furthermore, this work presents one of the first results from applying a disease prediction model to the entire population of Taiwan. The results show over 2.1 predictive power, 5 average positive predictive value (PPV), and 0.668 area under the curve (AUC) for all-stage lung cancer, and around 2.0 predictive power, 1 average PPV, and 0.645 AUC for early-stage lung cancer in our dataset. Sub-cohort analysis could funnel a high-precision selected group into prioritized clinical examination. Onset analysis validates the effect of our exclusion criteria. This work presents comprehensive analyses of lung cancer prediction, and the proposed approach can serve as a state-of-the-art disease risk prediction framework on claims data.
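For readers unfamiliar with the two metrics reported in the abstract, the following is a minimal sketch of how PPV and AUC are commonly computed from model risk scores. It is not the authors' evaluation code: the function names, the 0.5 threshold, and the toy labels and scores are all assumptions made for illustration and are unrelated to the paper's dataset.

```python
def ppv(labels, scores, threshold):
    """Positive predictive value: true positives / all flagged cases."""
    flagged = [y for y, s in zip(labels, scores) if s >= threshold]
    if not flagged:
        raise ValueError("no samples flagged at this threshold")
    return sum(flagged) / len(flagged)


def auc(labels, scores):
    """AUC as the probability that a random positive case receives a
    higher score than a random negative case (ties count as half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): 1 = later diagnosed, 0 = not diagnosed.
labels = [0, 0, 1, 0, 1, 0, 0, 1]
scores = [0.10, 0.40, 0.35, 0.20, 0.80, 0.70, 0.05, 0.60]

print(f"PPV at threshold 0.5: {ppv(labels, scores, 0.5):.3f}")
print(f"AUC: {auc(labels, scores):.3f}")
```

The pairwise AUC above is quadratic in the number of samples, which is fine for a toy example; at population scale a rank-based implementation would be the more practical choice.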
Figures
An overview of our framework, including data, model, and experiment design.
Keywords
Electronic Claims Records | Lung Cancer | Transformer | Deep Learning
Authors
Huan-Yu Chen | Chi-Chun Lee
Publication Date
2023/10/12
Journal
IEEE Journal of Biomedical and Health Informatics
DOI
10.1109/JBHI.2023.3324191
Publisher