Paper Information

lrec2026-main-089

Recovering Registers from Leveled Wordlists

Abstract

For vocabulary learning in language acquisition, it is desirable for learners to acquire words that they are likely to need in the language environments they will encounter. Such language environments are referred to as “registers” in general corpora, which are typically designed to include diverse registers. However, the proportion of registers included, that is, which registers are included and to what extent, is determined by the circumstances under which each general corpus was compiled and is not necessarily optimized for language learning. To bridge this gap, various leveled wordlists have been created in language education using linguistic resources other than word frequency, such as expert judgment and learner responses. However, it has not been quantitatively clear what gap in register proportions in general corpora these leveled wordlists were designed to fill. This study proposes a method that, given a leveled wordlist and a general corpus, estimates the register ratio that best aligns the frequency ordering of words across registers with the leveled wordlist. This makes it easier for learners and educators to interpret which wordlists are appropriate for particular learning goals. Our method is formulated as a linear programming problem and yields a globally optimal solution. Unlike neural networks, it is less susceptible to variation due to initial values or approximation and is therefore easier to interpret. We evaluated the proposed method on two languages, English and Japanese, through a range of experiments. We further show that it can also be used to evaluate vocabulary lists created for specific contexts, such as those generated by Large Language Models like ChatGPT.
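
The abstract does not spell out the linear program itself. As a rough illustration only, one way such a register-ratio estimate could be cast as a linear program is sketched below. All symbols are assumptions introduced here, not the authors' notation: $R$ is the number of registers, $f_{w,r}$ is the (possibly log-scaled) frequency of word $w$ in register $r$, $p_r$ is the unknown share of register $r$, $P$ is the set of word pairs $(u,v)$ where the wordlist places $u$ at an easier level than $v$, and $\xi_{uv}$ is a slack measuring how strongly the mixed frequencies contradict that ordering.

```latex
\begin{aligned}
\min_{p,\,\xi}\quad & \sum_{(u,v)\in P} \xi_{uv} \\
\text{s.t.}\quad
  & \sum_{r=1}^{R} p_r \,\bigl(f_{u,r} - f_{v,r}\bigr) + \xi_{uv} \;\ge\; 0
    && \forall (u,v)\in P, \\
  & \sum_{r=1}^{R} p_r = 1, \qquad p_r \ge 0 && \forall r, \\
  & \xi_{uv} \ge 0 && \forall (u,v)\in P .
\end{aligned}
```

Under these assumptions, both the objective and the constraints are linear in $(p, \xi)$, so any standard LP solver returns a global optimum, which is consistent with the property the abstract claims. The formulation in the paper itself may differ.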
