Enhancing and Evaluating Tabular Models on the Fly via Synthetic Question–Answer Generation
Proceedings of the Fifteenth Language Resources and Evaluation Conference (LREC 2026)
Abstract
Question Answering (QA) over tabular data has traditionally been a challenging task, but LLMs have recently shown the ability to answer questions about this type of structured data. However, current tabular QA datasets are skewed toward Wikipedia tables and SQL-style answers, and are composed of human-crafted question–answer pairs. This limits the evaluation of LLMs on this task to a narrow genre of data and language, while also requiring extensive human effort for dataset and benchmark creation. To address this, we introduce SynTabQA, a methodology for automatically generating synthetic question–answer pairs from any unannotated table. SynTabQA defines a detailed question typology, enabling fine-grained evaluation and facilitating the creation of diverse QA datasets. Our approach not only provides an automated test bed for any tabular dataset but can also be used in few-shot settings to supply LLMs with tailored examples, improving their focus and accuracy. We validate SynTabQA on two large, manually constructed tabular QA benchmarks of distinct natures.