Home Knowledge Base PCGrad

PCGrad

No mentions found

This entity hasn't been tracked yet, or Iris is still building its knowledge base.

Related Articles from SNS

When Gradients Collide: Failure Modes of Multi-Objective Prompt Optimization for LLM Judges

Announce Type: replace Abstract: Customizing an LLM judge to a specific problem or domain often involves optimizing its prompt across multiple evaluation criteria simultaneously. Textual gradient methods automate this for a single judge criterion, however they produce natural-language critiques, not numerical vectors. Thus, the conflict-resolution toolkit of multi-task learning (PCGrad, MGDA) does not apply to this multi-objective textual gradient setting.

arXiv CS 5d ago

Cross-Environment Neural Reranking for Sample-Efficient Action Selection in Text-Based Agents

Announce Type: new Abstract: Large language model agents achieve strong performance on text-based benchmarks but incur prohibitive inference costs, motivating the use of compact neural rerankers for action selection. We investigate whether a single lightweight model can perform action selection across multiple diverse environments, a capability that would eliminate per-environment model maintenance. Training DeBERTa-v3 (184M-434M parameters) jointly on ALFWorld, WebShop, and ScienceWorld...

arXiv CS 8d ago