Socrates

Related Articles from SNS

Socratic-SWE: Self-Evolving Coding Agents via Trace-Derived Agent Skills

arXiv:2606.07412v1 Announce Type: new Abstract: LLM-driven software engineering agents have become a central testbed for real-world language-model capability, yet their training remains limited by the availability of high-quality SWE tasks. Existing synthetic data methods typically create tasks through fixed mutation or bug-injection procedures, making the resulting distributions largely independent of the agent's own weaknesses and training progress. We introduce Socratic-SWE, a closed-loop...

arXiv CS 2d ago

SoCRATES: Towards Reliable Automated Evaluation of Proactive LLM Mediation across Domains and Socio-cognitive Variations

arXiv:2606.05563v1 Announce Type: new Abstract: Evaluating LLM mediators remains challenging, as mediation unfolds as a real-time trajectory shaped by disputants' shifting emotions, intentions, and context. Existing testbeds rely on a few expert-authored domains, vary mainly strategic posture, and score every turn against every topic, introducing off-topic noise. We introduce SoCRATES, a benchmark for evaluating proactive LLM mediators in realistic, multi-domain testbeds.

arXiv CS 5d ago

ClinTutor-R1: Advancing Scalable and Robust One-to-Many Alignment in Clinical Socratic Education

arXiv:2512.05671v2 Announce Type: replace Abstract: While Large Language Models (LLMs) have achieved remarkable success in dyadic (one-on-one) instruction, they face significant challenges in One-to-Many alignment, such as clinical ward rounds, where an instructor must simultaneously guide a diverse group of trainees. Current models often suffer from context dilution and goal misalignment, failing to balance individual scaffolding with collective learning progress. To address this, we...

arXiv CS 8d ago

Elmes*: Automated Construction of Fine-Grained Evaluation Rubrics for Large Language Models in Long-Tail Educational Scenarios

Announce Type: new Abstract: Evaluating large language models (LLMs) for education requires measuring how models teach, not only what they know. Existing benchmarks emphasize domain-general correctness or depend on manually designed rubrics that scale poorly to long-tail pedagogical scenarios. We introduce Elmes*, an end-to-end framework for constructing, refining, and applying fine-grained scenario-specific rubrics.

arXiv CS 2d ago

Inside Google’s AI training for teachers

MOUNTAIN VIEW, Calif. — Sitting in an atrium on Google’s campus, a group of K-12 educators imagined the worst response they could receive when they tried to persuade their colleagues to use artificial intelligence. They pictured a veteran English teacher who was still upset that cursive is no longer taught.

NBC News 1d ago

Book Dedications

To my sister, Dr. Soma Mohammed Mohammed Baroud. I write your name in full, because that is how it appeared on the white body bag that held your remains soon after the bomb was dropped. Dedications A random assortment of book dedications.

Hacker News 8d ago

The criminal cartels cashing in on the World Cup – podcast

Football fans are celebrating the tournament coming to Guadalajara. But with a brutal crime syndicate holding sway there, what are the risks for fans – and the government? Excitement is mounting in Mexico as the World Cup opens in Mexico City, then heads to the city of Guadalajara. Mexican journalist Leon Krauze is a fan.

The Guardian World 1d ago

The Hardest Things to Say to One Another

This is an edition of The Wonder Reader, a newsletter in which our editors recommend a set of stories to spark your curiosity and fill you with delight. Sign up here to get it every Saturday morning. Recently, Russell Shaw realized that he had texted his kids the same two words—Too loud—133 times since 2020.

The Atlantic 10d ago

Has Trump Corrupted the Military?

On this week’s episode of The David Frum Show, The Atlantic’s David Frum opens with his thoughts about the recently reported peace talks between the United States and Iran. David argues that these reported talks indicate the United States is losing the war in Iran, and that the loss highlights what has always been true: The presidency is too big a job for Donald Trump. Then David is joined by Representative Jason Crow of Colorado to discuss...

The Atlantic 13d ago