Preparing for Pandemics with Large Language Models: An Evaluation of Sensitivity Across COVID-19, Zika, and Monkeypox Case Reports
Nguyen, Dan ; Rao, Arya S ; Mazumder, Aneesh ; Arraiza, Bianca ; Aldrich, Alex ; Marks, William ; Succi, Marc D
Abstract
Large language models (LLMs) have emerged as potential tools for early disease characterization and pandemic preparedness because of their ability to interpret complex textual data. This study evaluated the sensitivity of three LLMs (GPT-5, Claude Sonnet 4, and Gemini 2.5 Pro) on early case reports of COVID-19, Mpox, and Zika. Each case report was modified to remove explicit diagnostic terms, and the models were prompted to identify whether the presentation represented a disease of pandemic potential. Claude Sonnet 4 achieved the highest overall sensitivity across the three diseases. GPT-5 produced inconsistent results and performed poorly on Mpox. The findings highlight significant variability in diagnostic reliability across LLMs and emphasize the need for multimodal integration, dataset refinement, and ethical oversight. Limitations include the small sample size, the use of retrospective English-language case reports, text-only inputs, and evaluation on already known diseases.
Clinical Trial Number. Not applicable.
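The evaluation described in the abstract has two mechanical steps: masking explicit diagnostic terms in each case report, and computing sensitivity from the models' yes/no judgments. The following is a minimal sketch of those two steps, not the authors' pipeline. The term list, the placeholder string, and the function names `redact` and `sensitivity` are all assumptions made for illustration; the abstract does not state which terms were removed or how responses were scored. Because every case report describes a true case of the disease, sensitivity here reduces to the fraction of reports the model flagged.

```python
import re

# Hypothetical term list: the abstract does not specify which
# diagnostic terms were removed from the case reports.
DIAGNOSTIC_TERMS = ["COVID-19", "SARS-CoV-2", "Zika", "Mpox", "monkeypox"]


def redact(text, terms=DIAGNOSTIC_TERMS, placeholder="[REDACTED]"):
    """Replace each diagnostic term (case-insensitive) with a placeholder."""
    # Longer terms are tried first so a short term never clips a longer one.
    ordered = sorted(terms, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(t) for t in ordered), re.IGNORECASE)
    return pattern.sub(placeholder, text)


def sensitivity(flags):
    """Sensitivity = true positives / (true positives + false negatives).

    `flags` holds one boolean per case report, True when the model
    identified the presentation as a disease of pandemic potential.
    All reports are assumed to be true cases, so unflagged reports
    count as false negatives.
    """
    flags = list(flags)
    if not flags:
        raise ValueError("at least one case report is required")
    return sum(flags) / len(flags)
```

For example, `sensitivity([True, False, True, True])` gives 0.75 for an illustrative set of four reports; these values are placeholders and are not results from the study.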
Source
Nguyen D, Rao AS, Mazumder A, Arraiza B, Aldrich A, Marks W, Succi MD. Preparing for Pandemics with Large Language Models: An Evaluation of Sensitivity Across COVID-19, Zika, and Monkeypox Case Reports. J Med Syst. 2026 Mar 28;50(1):40. doi: 10.1007/s10916-026-02367-4. PMID: 41896422.