Reinforcement Learning Elicits Contextual Learning of Unseen Language Translation

arXiv CS Friday 05 June 2026, 04:00 UTC By Hanxu Hu, Zdeněk Šnajdr, Pinzhen Chen, Jannis Vamvas, Rico Sennrich


arXiv:2606.06428v1

Abstract: Prior work has shown that large language models (LLMs) can translate unseen or low-resource languages by undergoing continued training or even by encoding a grammar book in their context. However, both methods typically overfit specific languages, with limited zero-shot transfer at test time. To translate extremely low-resource languages at scale, we argue that LLMs must acquire the meta-skill of utilizing in-context linguistic knowledge rather than memorizing specific languages. In this paper, we propose a reinforcement learning (RL) approach to unseen language translation given rich linguistic context, using a surface-level translation metric (chrF) as the reward. Empirically, despite the lightweight reward, our RL-trained models effectively extract and apply relevant linguistic information from the provided context, leading to better translations on completely unseen languages than in-context learning or supervised fine-tuning. Our analyses suggest that outcome-based RL can extend beyond conventional reasoning tasks like math and coding to serve as a recipe for language learning from context.
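To make the idea of a surface-level reward concrete, here is a minimal Python sketch of a chrF-style score used as a scalar reward. It is a simplified illustration, not the authors' implementation: the function names `char_ngrams` and `chrf_reward` are hypothetical, whitespace is simply dropped, and the defaults (character n-grams up to order 6, beta of 2) follow the common chrF convention rather than anything stated in the abstract.

```python
from collections import Counter


def char_ngrams(text: str, n: int) -> Counter:
    """Count character n-grams, ignoring whitespace (a simplification)."""
    chars = "".join(text.split())
    return Counter(chars[i:i + n] for i in range(len(chars) - n + 1))


def chrf_reward(hypothesis: str, reference: str,
                max_n: int = 6, beta: float = 2.0) -> float:
    """Simplified chrF in [0, 1], usable as a scalar reward for one sample.

    Hypothetical helper: averages character n-gram precision and recall
    over orders 1..max_n, then combines them with an F-beta score.
    """
    precisions, recalls = [], []
    for n in range(1, max_n + 1):
        hyp = char_ngrams(hypothesis, n)
        ref = char_ngrams(reference, n)
        if not hyp or not ref:
            # Skip orders that one side is too short to produce.
            continue
        # Counter intersection keeps the clipped (minimum) counts.
        overlap = sum((hyp & ref).values())
        precisions.append(overlap / sum(hyp.values()))
        recalls.append(overlap / sum(ref.values()))
    if not precisions:
        return 0.0
    p = sum(precisions) / len(precisions)
    r = sum(recalls) / len(recalls)
    if p + r == 0:
        return 0.0
    return (1 + beta ** 2) * p * r / (beta ** 2 * p + r)
```

In an outcome-based RL loop, a score like this would be computed once per sampled translation against its reference and passed to the policy update as the reward. Because it depends only on character overlap, it needs no language-specific tooling, which is presumably what makes such a reward practical for languages the model has never seen.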


Originally published by arXiv CS.

