Automated Lexical Coverage for Language Learning: From General to Specialized Word Lists

arXiv:2512.15552v2. Abstract: A General Service List (GSL) is a commonly used resource that helps language learners identify important English words. Traditional GSL creation is resource-intensive, relying on linguistic expertise and subjective input. We created our own GSL and evaluated its performance against the New General Service List (NGSL). We found that creating a Specialized Word List (SWL), tailored to a specific text, is a practical method for language learners. Because an SWL is derived from the target text itself, it reaches, by construction, the 95% coverage required for language comprehension, and it does so with substantially fewer words than a general list applied to the same text. Across nine texts spanning fiction, academic papers, and scripts, the NGSL covered 64-85% of each text, whereas a text-specific list reached 95% with far smaller vocabularies. Because the SWL process relies only on objective criteria, it can be automated, scaled, and tailored to the needs of language learners across the globe.
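The coverage idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names (`tokenize`, `coverage`, `build_swl`), the regex-based tokenizer, and the choice to add words in descending frequency order are all assumptions made for illustration. Coverage here means the fraction of running tokens in a text that appear in a given word list.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and extract word tokens (assumed rule: letters and apostrophes)."""
    return re.findall(r"[a-z']+", text.lower())


def coverage(tokens, word_list):
    """Return the fraction of running tokens that appear in word_list."""
    if not tokens:
        return 0.0
    known = set(word_list)
    return sum(1 for t in tokens if t in known) / len(tokens)


def build_swl(tokens, target=0.95):
    """Hypothetical SWL builder: add the text's own words, most frequent first,
    until the list covers at least `target` of the running tokens."""
    counts = Counter(tokens)
    total = len(tokens)
    swl, covered = [], 0
    for word, n in counts.most_common():
        if total and covered / total >= target:
            break
        swl.append(word)
        covered += n
    return swl


# Usage: compare a general list against a text-specific list on the same text.
sample = "The cat sat on the mat. The cat saw the other cat on the mat."
sample_tokens = tokenize(sample)
general_list = ["the", "on", "other"]        # stand-in for a general list, not the NGSL
specific_list = build_swl(sample_tokens)     # derived from the text itself
general_cov = coverage(sample_tokens, general_list)
specific_cov = coverage(sample_tokens, specific_list)
```

Because the text-specific list is built from the same tokens it is measured against, it meets the target by construction, which mirrors the abstract's argument; a real implementation would likely also need lemmatization and handling of proper nouns, which this sketch omits.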
Originally published by arXiv CS.