Mistral Small
Related Articles from SNS
GRPO Does Not Close the Multi-Agent Coordination Gap
We measure how well current large language models coordinate as multiple agents sharing a common resource, using the dining philosophers problem as a clean test bed. Across 630 episodes spanning seven models and three philosopher counts, four frontier closed-source systems reach mean reward 0.45 to 0.87 and Mistral-Small 24B reaches 0.83 to 0.99, while Qwen3-14B reaches 0.13 to 0.35. We then ask whether group relative policy optimization (GRPO) on rollouts from...
Prompt-Level Distillation: A Non-Parametric Alternative to Model Fine-Tuning for Efficient Reasoning
arXiv:2602.21103v2. Advanced reasoning typically requires Chain-of-Thought prompting, which is accurate but incurs prohibitive latency and substantial test-time inference costs. The standard alternative, fine-tuning smaller models, often sacrifices interpretability while introducing significant resource and operational overhead. To address these limitations, we introduce Prompt-Level Distillation (PLD).
Do LLMs Hold Their Values? MANTA: A Multi-Turn Adversarial Benchmark for Animal Welfare Reasoning
arXiv:2605.16301v2. Evaluating animal welfare reasoning in LLMs remains an open challenge despite rapid deployment in consumer and professional contexts where welfare considerations appear implicitly in everyday queries. Existing benchmarks such as AnimalHarmBench evaluate this through single-turn, explicitly framed questions, measuring whether models avoid harmful content when directly asked. This approach overlooks two failure modes: alignment degradation...
Von der Leyen’s AI pick triggers conflict-of-interest criticism
BRUSSELS — The appointment of Siemens’ chairman as a European Commission adviser on industrial AI is triggering a backlash in Brussels, weeks after the German engineering giant helped secure a rollback of the EU’s AI rules. “My first reaction was just: Wow,” said Kim van Sparrentak, a Dutch lawmaker who led the work on the AI file for the Greens in the European Parliament. “They fought hard against AI rules for themselves, they lobby against technological sovereignty,...