Recovering Registers from Leveled Wordlists
For vocabulary learning in language acquisition, it is desirable for learners to acquire the words they are likely to need in the language environments they will encounter. In general corpora, such language environments are referred to as “registers,” and these corpora are typically designed to cover diverse registers. However, the proportion of registers, that is, which registers are included and to what extent, is determined by the circumstances under which each general corpus was compiled and is not necessarily optimized for language learning. To bridge this gap, various leveled wordlists have been created in language education using linguistic resources other than word frequency, such as expert judgment and learner responses. However, it has not been quantitatively clear which gap in the register proportions of general corpora these leveled wordlists were designed to fill. This study proposes a method that, given a leveled wordlist and a general corpus, estimates the register ratio that best aligns the frequency ordering of words across registers with the leveled wordlist. This makes it easier for learners and educators to judge which wordlists are appropriate for particular learning goals. Our method is formulated as a linear programming problem and yields a globally optimal solution. Compared with neural networks, it is less susceptible to variation due to initial values or approximation and is therefore easier to interpret. We evaluated the proposed method on two languages, English and Japanese, through a range of experiments. We further show that it can also be used to evaluate vocabulary lists created for specific contexts, such as those generated by Large Language Models like ChatGPT.
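
To make the idea of a register ratio that "aligns the frequency ordering of words with the leveled wordlist" concrete, here is a minimal sketch. It is not the paper's linear program: the abstract does not give the formulation, so a coarse grid search over two registers stands in for the optimizer, and the objective (counting word pairs whose mixed-frequency order contradicts the wordlist) is an assumption chosen for illustration. The function names, the register labels, and the toy counts are all hypothetical.

```python
from itertools import combinations


def mixed_scores(freq, weights):
    """Weighted sum of per-register counts for each word."""
    return {w: sum(k * f for k, f in zip(weights, fs)) for w, fs in freq.items()}


def count_violations(freq, levels, weights):
    """Count word pairs whose mixed-frequency order contradicts the wordlist.

    A pair is violated when the word at the easier (lower) level is not
    strictly more frequent than the word at the harder level.
    """
    scores = mixed_scores(freq, weights)
    violations = 0
    for a, b in combinations(levels, 2):
        if levels[a] == levels[b]:
            continue  # words at the same level impose no ordering
        easy, hard = (a, b) if levels[a] < levels[b] else (b, a)
        if scores[easy] <= scores[hard]:
            violations += 1
    return violations


def grid_search_weights(freq, levels, steps=20):
    """Try integer weights (k, steps - k) for two registers; keep the best.

    Integer weights avoid floating-point ties; the ratio is k / steps.
    Ties between candidates are broken in favour of the smallest k.
    """
    candidates = [(k, steps - k) for k in range(steps + 1)]
    return min(candidates, key=lambda w: count_violations(freq, levels, w))


# Hypothetical toy data: counts per 1000 tokens in (spoken, academic) registers.
freq = {
    "eat": (50, 10),
    "travel": (30, 30),
    "hypothesis": (20, 60),
}
# Hypothetical wordlist levels: 1 is the easiest.
levels = {"eat": 1, "travel": 2, "hypothesis": 3}

best = grid_search_weights(freq, levels)
print(best, count_violations(freq, levels, best))
```

On this toy data the search settles on weights that favour the spoken register, since that is the register whose frequency order matches the levels. A grid search of this kind scales poorly beyond a few registers and gives no optimality guarantee, which is the sort of limitation a linear programming formulation avoids.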