Poster Session 3

Category: Digital Health Technologies (DHT)

(820) Assessing the Accuracy and Safety of ChatGPT Responses to Common Questions About Preeclampsia in Pregnancy

Thursday, February 12, 2026

10:30 AM - 12:00 PM

Submitting Author and Presenting Author(s)

Joe Haydamous, MD (he/him/his)

PGY1
Department of Obstetrics, Gynecology and Reproductive Sciences, McGovern Medical School at UTHealth Houston
Houston, Texas, United States

Coauthor(s)

Laura Diab, MD (she/her/hers)

Division of Maternal-Fetal Medicine, Department of Obstetrics, Gynecology and Reproductive Sciences, McGovern Medical School at UTHealth Houston
Houston, Texas, United States

Farah H. Amro, MD

Assistant Professor
Division of Maternal-Fetal Medicine, Department of Obstetrics, Gynecology and Reproductive Sciences, McGovern Medical School at UTHealth Houston
Houston, Texas, United States

Objective:

To evaluate the accuracy, completeness, and safety of ChatGPT-generated responses to common patient and provider questions about preeclampsia (preE) in pregnancy, assessing its potential role as an educational and triage-support tool in obstetric care.

Study Design:

Clinically relevant questions were developed from ACOG, SMFM, and WHO guidelines and from topics frequently discussed in patient forums and prenatal education resources. Questions were grouped into four categories: General Knowledge, Public Health/Prevention, Management & Treatment, and Diagnostic Interpretation. Two standardized prompts simulated patient-directed and provider-directed queries. Responses were generated using GPT-4 and independently evaluated by five maternal-fetal medicine (MFM) specialists, who rated each response on a 5-point Likert scale for accuracy, completeness, and safety.

Results:

ChatGPT responses were rated highly across all domains. Mean ratings (1–5 scale) were 4.60 for accuracy, 4.29 for completeness, and 4.43 for safety. The proportion of evaluations rated ≥4 was 92% for accuracy, 86% for completeness, and 85% for safety. Public Health/Prevention questions received the highest ratings across domains. Management & Treatment responses showed good alignment with guideline recommendations. Diagnostic Interpretation responses were generally accurate and safe but were rated slightly lower for completeness, reflecting less detail on complex clinical decision-making. No unsafe or potentially harmful content was identified.
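
For readers who want to reproduce this kind of summary, the two statistics reported for each domain (the mean Likert rating and the proportion of evaluations rated ≥4) could be computed roughly as in the sketch below. The `ratings` values are hypothetical placeholders, not study data, and the `summarize` helper and the pooling of scores across reviewers and questions are assumptions rather than the authors' described method.

```python
from statistics import mean

# Hypothetical placeholder ratings (NOT study data): one list of 1-5 Likert
# scores per domain, assumed to be pooled across reviewers and questions.
ratings = {
    "accuracy": [5, 5, 4, 4, 3],
    "completeness": [5, 4, 4, 3, 3],
    "safety": [5, 5, 5, 4, 2],
}


def summarize(scores, threshold=4):
    """Return (mean score, share of scores at or above the threshold)."""
    share = sum(s >= threshold for s in scores) / len(scores)
    return mean(scores), share


# Build a per-domain summary and print it in a readable form.
summary = {domain: summarize(scores) for domain, scores in ratings.items()}
for domain, (avg, share) in summary.items():
    print(f"{domain}: mean {avg:.2f}, rated >=4 in {share:.0%} of evaluations")
```

The mean and the ≥4 proportion answer different questions: the mean reflects the overall level of the ratings, while the proportion shows how many individual evaluations cleared a chosen bar, so a single very low score can pull one down more than the other.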

Conclusion:

ChatGPT generated safe, accurate, and generally complete responses to preE-related questions. Its high accuracy and safety, particularly for public health and patient education topics, suggest potential for integration into prenatal counseling and triage support. Opportunities remain to enhance completeness in complex diagnostic contexts while maintaining its strong safety profile.