EpiGator: An Event-based Surveillance System for Infectious Disease Outbreaks

Proceedings of the Fifteenth Language Resources and Evaluation Conference (LREC 2026)

DOI:10.63317/5jrha624xs52

Abstract

We present EpiGator, a novel event-based system for global surveillance of infectious disease outbreaks. The system automatically processes streams of news articles and generates reports about the outbreaks, a capability that is crucial for medical authorities. The goal of our work is to combine our experience in outbreak surveillance with state-of-the-art large language models (LLMs), which allows us to reduce the overall cost of system development and maintenance. The EpiGator pipeline combines keyword filtering, relevance classification, event-based clustering, and multi-document summarization. A key novelty lies in using a fine-tuned LLM to identify articles relevant to ongoing outbreaks, followed by a zero-shot information extraction pipeline that normalizes the event features and clusters the related articles. For each cluster, we generate an outbreak summary using instruction-tuned LLMs. We evaluate EpiGator's output against disease outbreak reports written by medical specialists.
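
The four stages named in the abstract can be pictured with a minimal Python sketch. This is an illustration under stated assumptions, not the authors' implementation: every name here (`keyword_filter`, `is_relevant`, `extract_event`, `summarize`, `run_pipeline`, the keyword and country tables) is hypothetical, and the simple string rules are placeholders for the fine-tuned LLM classifier, the zero-shot extraction step, and the instruction-tuned summarizer.

```python
from collections import defaultdict
from dataclasses import dataclass

# Hypothetical lookup tables; a real system would use far richer resources.
KEYWORDS = {"outbreak", "cholera", "measles", "ebola"}
DISEASES = {"cholera": "cholera", "measles": "measles", "ebola": "ebola"}
COUNTRIES = {"sudan": "Sudan", "peru": "Peru"}


@dataclass(frozen=True)
class Event:
    """Normalized event features used as the clustering key (assumed schema)."""
    disease: str
    country: str


def keyword_filter(text):
    """Stage 1: cheap keyword filter over the raw article text."""
    lowered = text.lower()
    return any(keyword in lowered for keyword in KEYWORDS)


def is_relevant(text):
    """Stage 2: placeholder for the fine-tuned LLM relevance classifier."""
    lowered = text.lower()
    return "cases" in lowered or "deaths" in lowered


def extract_event(text):
    """Stage 3a: placeholder for zero-shot extraction plus normalization."""
    lowered = text.lower()
    disease = next((c for alias, c in DISEASES.items() if alias in lowered), None)
    country = next((c for alias, c in COUNTRIES.items() if alias in lowered), None)
    if disease is None or country is None:
        return None
    return Event(disease, country)


def summarize(event, articles):
    """Stage 4: placeholder for instruction-tuned LLM summarization."""
    return f"{event.disease} in {event.country}: {len(articles)} article(s)"


def run_pipeline(articles):
    """Run all stages and return one summary per event cluster."""
    clusters = defaultdict(list)
    for text in articles:
        if not keyword_filter(text) or not is_relevant(text):
            continue
        event = extract_event(text)
        if event is not None:
            # Stage 3b: articles sharing normalized features form one cluster.
            clusters[event].append(text)
    return {event: summarize(event, texts) for event, texts in clusters.items()}
```

In this sketch, clustering is reduced to grouping by an exact match on the normalized disease and country; the abstract does not specify which event features or clustering method the system uses.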

Details

Paper ID
lrec2026-main-614
Pages
pp. 7732-7743
BibKey
wu-etal-2026-epigator
Editor
N/A
Publisher
European Language Resources Association (ELRA)
ISSN
2522-2686
ISBN
978-2-493814-49-4
Conference
The Fifteenth Language Resources and Evaluation Conference (LREC 2026)
Location
Palma, Mallorca, Spain
Date
11 May 2026 to 16 May 2026

Authors

  • Yiheng Wu

  • Jue Hou

  • Trangcasanchai Sathianpong

  • Lidia Pivovarova

  • Roman Yangarber

Links