Elo

No mentions found

This entity hasn't been tracked yet, or Iris is still building its knowledge base.

Related Articles from SNS

ViKing of Norway: Pragg wins crown that eluded even Vishy

Vishy Anand has been there and done that. The first Indian chessman to build an incredible résumé with his global achievements, he saw D Gukesh (Candidates and World title), Arjun Erigaisi (breaking into the Elo 2800 club) and K Humpy (World Rapid crown) follow in his footsteps. But late on Friday in Oslo, R Praggnanandhaa scaled a peak that even Anand could not conquer in his several forays at Norway Chess.

Times of India 3d ago

ChessMimic: Per-Rating Transformer Models for Human Move, Clock, and Outcome Prediction in Online Blitz Chess

arXiv:2606.04473v1 Announce Type: new Abstract: We present ChessMimic, a system of three small encoder-only transformers - for move, thinking-time, and outcome prediction - conditioned on the position, recent move history, player rating, and clock state. We fit a separate instance of each model per 100-Elo rating band, trading parameter efficiency for sharper per-skill calibration. On a held-out month-wide slice of Lichess Rated Blitz games ChessMimic's human move prediction accuracy...

arXiv CS 6d ago

Correct Looks Better: Pairwise Comparisons Reveal Accuracy Rankings

arXiv:2606.09409v1 Announce Type: new Abstract: Pairwise comparisons combined with aggregation methods like Elo have become central to evaluating generative models, yet concerns remain that they reward superficial stylistic cues or display judge biases. In a more positive turn, we show that model rankings from pairwise comparisons strongly agree with ground-truth-based accuracy rankings when such ground truth is available for comparison. By converting five well-known benchmarks into...

arXiv CS 1d ago

Predicting every game of the entire World Cup: All...

Everyone is using artificial intelligence to do, well, everything. With the World Cup starting on June 11, you can't scroll for more than a couple of minutes without hitting another post or video or reel of someone telling you how they used AI to predict the World Cup. So, I decided to use my own supercomputer to predict every game of the 2026 World Cup -- the supercomputer is called "my brain."

ESPN 5d ago

Ranked: The final 48 World Cup rosters are in! Whi...

Finally, we have arrived. The World Cup starts in a little over a week, and every team has finalized its 26-man squad. We know every country that will be participating in the World Cup, and we also know -- barring last-minute injuries -- every player who will be participating in the World Cup.

ESPN 8d ago

Why isn't the U.S. better at soccer?

Why isn't the U.S. better at soccer? Well, better at men's soccer. Can a World Cup at home finally be the breakthrough for the USMNT?

Hacker News 2d ago

From Player to Master: Enhancing Test-Time Learning of LLM Agents via Reinforcement Learning over Memory

Announce Type: new Abstract: Large language model (LLM) agents are increasingly deployed in long-running settings where improving through experience at test time becomes important. A common approach is to update an explicit memory after each interaction to guide future decisions. However, most existing methods rely on hand-designed prompting rules, making it difficult to align memory updates with downstream objectives over multi-step horizons consistently.

arXiv CS 1d ago

Variational Proximal Policy Optimization

Announce Type: cross Abstract: Reinforcement Learning from Human Feedback via Proximal Policy Optimization often suffers from policy mode collapse, brittle exploration loops, and distribution drift. This paper introduces Variational Proximal Policy Optimization (\(\textsc{VP}_2\textsc{O}\)), a particle-based variational inference framework that maps policy optimization to Stein Variational Gradient Descent within a Mixture-of-Experts architecture. By leveraging functional kernels over...

arXiv CS 1d ago

Frontier Lag: A Bibliometric Audit of Capability Misrepresentation in Academic AI Evaluation

arXiv:2605.04135v2 Announce Type: replace Abstract: Readers of applied-domain LLM capability evaluations want to know what AI systems can currently do. That literature answers a related, but consequentially different, question: what older, cheaper, less-elicited models could do months or years earlier (a 2026 paper evaluating GPT-3.5 or GPT-4 zero-shot, say, against a frontier of reasoning-capable, tool-using systems like GPT-5.5 Pro and Claude Opus 4.7), often reported with sparse...

arXiv CS 5d ago

Future Power Rankings: How all 68 Power 4 college football teams stack up

Projecting a college football program's future is harder than ever. Rosters and fortunes change dramatically and championship pathways are more open than ever. The assets that make a program great in 2026 might not be there in 2027.

ESPN 1d ago