Automated Extraction of Answer Candidates for Question Generation
Proceedings of the Fifteenth Language Resources and Evaluation Conference (LREC 2026)
Abstract
Answering questions based on a reference text is a frequently employed comprehension assessment method that enables teachers to evaluate students effectively and efficiently. Various tools and methods have been developed to tackle automated question generation; however, the preliminary step of selecting valid answer candidates has received far less attention. Thus, we introduce a solution built on top of FairytaleQA and tailored for training a DeBERTa-based model to classify how well each candidate can form a strong answer-question pair. First, we extract answer candidates by syntactically parsing the context (i.e., selecting text spans from the reference text based on the nodes in the constituency tree); then, questions are generated for the extracted candidates using an LLM pre-trained on this task. Next, we assess a candidate’s quality by relying on another fine-tuned model’s capability to answer the previously generated question for that candidate. This enables us to categorize answers using a four-class system: very good, good, average, and unusable. A significant advantage of our method is that the encoder classifier can score all potential answer candidates in a single inference step over the entire context. Using an Elo ranking system, we compare our selection against both the answers from explicit questions in the original dataset and a fine-tuned LLM for answer selection. In addition, we propose three strategies based on semantic similarity and text position to ensure the coverage and diversity of candidate selection.
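As a concrete illustration of the first pipeline step (extracting candidate spans from the nodes of a constituency tree), the following minimal sketch collects the text covered by selected constituent types. The abstract does not specify the parser or the node labels used, so the label set, the helper name extract_candidates, and the use of nltk.Tree over a hand-written parse are illustrative assumptions, not the paper's exact configuration.

```python
# Sketch of constituency-based answer-candidate extraction. The parse
# would normally come from an off-the-shelf constituency parser; here a
# hand-written tree is used so the example is self-contained.
from nltk import Tree

# Constituent labels treated as answer candidates (assumed set, not
# necessarily the one used in the paper).
CANDIDATE_LABELS = {"NP", "PP", "ADJP", "ADVP"}

def extract_candidates(parse: Tree) -> list[str]:
    """Return the surface text of every subtree whose label is in
    CANDIDATE_LABELS, deduplicated while preserving order."""
    seen, candidates = set(), []
    for subtree in parse.subtrees(lambda t: t.label() in CANDIDATE_LABELS):
        span = " ".join(subtree.leaves())
        if span not in seen:
            seen.add(span)
            candidates.append(span)
    return candidates

# Example parse of "The young princess found a golden key in the garden."
parse = Tree.fromstring(
    "(S (NP (DT The) (JJ young) (NN princess))"
    " (VP (VBD found)"
    "  (NP (DT a) (JJ golden) (NN key))"
    "  (PP (IN in) (NP (DT the) (NN garden)))))"
)
print(extract_candidates(parse))
# ['The young princess', 'a golden key', 'in the garden', 'the garden']
```

Each extracted span would then be paired with a generated question and scored by the QA-based classifier described above; note that overlapping spans (e.g., a PP and the NP nested inside it) are both kept as separate candidates in this sketch.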