Home Knowledge Base DeMix Corpora

DeMix Corpora

No mentions found

This entity hasn't been tracked yet, or Iris is still building its knowledge base.

Related Articles from SNS

Decouple Searching from Training: Scaling Data Mixing via Model Merging for Large Language Model Pre-training

arXiv:2602.00747v3 Announce Type: replace Abstract: Determining an effective data mixture is a key factor in Large Language Model (LLM) pre-training, where models must balance general competence with proficiency on hard tasks such as math and code. However, identifying an optimal mixture remains an open challenge, as existing approaches either rely on unreliable tiny-scale proxy experiments or require prohibitively expensive large-scale exploration. To address this, we propose Decouple...

arXiv CS 9d ago