Poster Session 1
Category: Prematurity
Elisa Gi Soo Um, MD (she/her/hers)
Department of Obstetrics and Gynecology, Seoul St. Mary's Hospital, College of Medicine, The Catholic University of Korea, Seoul, Korea
Seoul, Seoul-t'ukpyolsi, Republic of Korea
Hyun Sun Ko, MD, PhD
Department of Obstetrics and Gynecology, Seoul St. Mary's Hospital, College of Medicine, The Catholic University of Korea
Seoul, Seoul-t'ukpyolsi, Republic of Korea
Seon-Kwon Jang
Department of Life Science, Dongguk University
Seoul, Seoul-t'ukpyolsi, Republic of Korea
Min-Ho Lee
Department of Life Science, Dongguk University
Seoul, Seoul-t'ukpyolsi, Republic of Korea
Objective:
Preterm birth (PTB), defined as delivery between 20+0 and 36+6 weeks of gestation, is a major cause of neonatal morbidity and mortality worldwide. In South Korea, the PTB rate increased from 3.8% in 2000 to 9.2% in 2021 despite declining fertility rates. This study aimed to develop a machine learning (ML)-based model for PTB prediction using national health and screening data and to identify key determinants of PTB risk at the national level.
Study Design:
We conducted a retrospective cohort study using linked data from the Korean National Health Insurance Service, national health screenings, and infant records. Eligible participants were women who delivered between 2017 and 2020, underwent a health screening within 1.5 years before delivery, and had matched infant data. After merging seven datasets, 41 predictive features were retained following statistical filtering. To address class imbalance, fixed-size stratified splits were used: training (n=120,000; 20,000 PTB), validation (n=60,000; 10,000 PTB), and test (n=566,344; 10,686 PTB). Three ML models (Random Forest, XGBoost, LightGBM) were trained and evaluated by area under the receiver operating characteristic curve (AUC) and accuracy.
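The splitting and evaluation steps described above can be illustrated with a minimal sketch. It is not the authors' code: the abstract does not say which software was used, so the function names `fixed_size_split` and `auc`, the random seed, and the toy labels are all assumptions made for illustration. The split draws a fixed number of PTB and non-PTB cases for training and validation and leaves the remainder for testing; the AUC uses the standard rank-sum (Mann-Whitney) formulation.

```python
import random


def fixed_size_split(labels, n_train, pos_train, n_val, pos_val, seed=0):
    """Split indices into train/validation/test with fixed class counts.

    labels: sequence of 0/1 outcomes (1 = PTB). Training and validation
    receive exactly the requested numbers of positives and negatives;
    every remaining index goes to the test set.
    """
    rng = random.Random(seed)
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    rng.shuffle(pos)
    rng.shuffle(neg)

    neg_train, neg_val = n_train - pos_train, n_val - pos_val
    if pos_train + pos_val > len(pos) or neg_train + neg_val > len(neg):
        raise ValueError("not enough samples for the requested split sizes")

    train = pos[:pos_train] + neg[:neg_train]
    val = (pos[pos_train:pos_train + pos_val]
           + neg[neg_train:neg_train + neg_val])
    test = pos[pos_train + pos_val:] + neg[neg_train + neg_val:]
    return train, val, test


def auc(scores, labels):
    """Area under the ROC curve via the rank-sum (Mann-Whitney) formula."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        # Extend j over a run of tied scores so they share one average rank.
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # average of the 1-based ranks i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    rank_sum = sum(r for r, y in zip(ranks, labels) if y == 1)
    return (rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


# Toy data (hypothetical): 1,000 deliveries, 60 of them preterm.
toy_labels = [1] * 60 + [0] * 940
train_idx, val_idx, test_idx = fixed_size_split(
    toy_labels, n_train=120, pos_train=20, n_val=60, pos_val=10, seed=1
)
print(len(train_idx), len(val_idx), len(test_idx))
```

With this design the training and validation sets are enriched for PTB relative to the source population, while the test set keeps whatever cases remain, so its PTB proportion is lower than in the training data.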
Results:
Among 705,658 women with deliveries, 40,656 (5.8%) had a PTB. All three models demonstrated high performance, with an AUC of 0.72 and an accuracy of 98.1%. Feature importance analysis identified fetal number and maternal weight as the top predictors across all models, followed by socioeconomic factors such as occupation and insurance status. Among pre-pregnancy medication histories, use of sleeping pills and use of antianxiety medication were identified as important features.
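Because accuracy depends on class proportions, it can help to read the reported figures alongside the counts given in the abstract. The short sketch below uses only those counts. The "always predict term birth" classifier is a hypothetical reference point introduced here for illustration, not something the authors report.

```python
# Counts copied from the abstract.
total_deliveries = 705_658
total_ptb = 40_656
test_size = 566_344
test_ptb = 10_686

# Share of all deliveries that were preterm.
overall_ptb_rate = total_ptb / total_deliveries

# Share of the test split that was preterm.
test_ptb_rate = test_ptb / test_size

# Hypothetical reference classifier that labels every delivery as term:
# it is correct on every non-PTB case and wrong on every PTB case.
baseline_accuracy = (test_size - test_ptb) / test_size

print(f"Overall PTB rate:       {overall_ptb_rate:.1%}")   # 5.8%
print(f"Test-split PTB rate:    {test_ptb_rate:.1%}")      # 1.9%
print(f"Majority-class accuracy: {baseline_accuracy:.1%}")  # 98.1%
```

The AUC does not depend on class proportions in this way, which is one reason it is commonly reported next to accuracy for imbalanced outcomes.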
Conclusion:
ML models trained on large-scale health data showed effective performance in PTB prediction. These models provide interpretable risk stratification tools that may support early detection and enable targeted perinatal interventions to reduce PTB rates in South Korea. Efforts to lower the rate of multiple pregnancies, one of the major contributors to PTB, should also be considered as part of a broader prevention strategy.