Iterative Selection with Trade

No mentions found

This entity hasn't been tracked yet, or Iris is still building its knowledge base.

Related Articles from SNS

LISTEN to Your Preferences: An LLM Framework for Multi-Objective Selection

Announce Type: replace Abstract: Human experts often struggle to select the best option from a large set of items with multiple competing objectives, a process bottlenecked by the difficulty of formalizing complex, implicit preferences. To address this, we introduce LISTEN (LLM-based Iterative Selection with Trade-off Evaluation from Natural-language), an agentic LLM-based framework that treats the LLM as a decision-making agent capable of iteratively refining its internal preference model...

arXiv CS 8d ago

Noncooperative Coordination via a Trading-based Auction

arXiv:2502.03616v4 Announce Type: replace Abstract: Noncooperative multi-agent systems often face coordination challenges due to conflicting preferences among agents. In particular, when agents act in their own self-interest, they may prefer different choices among multiple feasible outcomes, leading to suboptimal outcomes or even safety concerns. We propose an algorithm named trading auction for consensus (TACo), a decentralized approach that enables noncooperative agents to reach consensus...

arXiv CS 1d ago

Before Parc Fermé: RL-Time Pruning for Efficient Embodied LLMs in Autonomous Driving

arXiv:2605.31256v1 Announce Type: new Abstract: Embodied Large Language Models (LLMs) are increasingly used as reasoning modules in robotic control pipelines to improve human-robot interaction, but their memory and generation latency make real-time deployment difficult. Pruning can reduce these costs, but for controllers that undergo multiple pre- and post-training phases, the crucial question is not only how much to prune, but when pruning should occur. In this work, we propose Before Parc...

arXiv CS 9d ago

RASER: Recoverability-Aware Selective Escalation Router for Multi-Hop Question Answering

arXiv:2606.02488v1 Announce Type: new Abstract: Multi-hop question-answering systems often use expensive retrieval on every question. They may decompose the question, run several retrieval rounds, or search through bridge entities before answering. All of these strategies rely on repeated LLM calls to rewrite or decompose the question, which increases token cost and is a poor fit when the LLM budget is tight.

arXiv CS 8d ago

The Thunder built a contender -- now they must dec...

AN HOUR AFTER the San Antonio Spurs dethroned his Oklahoma City Thunder in a dramatic Game 7, Shai Gilgeous-Alexander was asked a version of the question nearly every NBA star gets after his team falls short. How much input do you plan to have on the franchise's offseason maneuvering? "I will give zero input," Gilgeous-Alexander said.

ESPN 7d ago

DAPD: Dependency-Aware Parallel Decoding via Attention for Diffusion LLMs

arXiv:2603.12996v2 Announce Type: replace Abstract: Parallel decoding for Diffusion LLMs (dLLMs) is difficult because each denoising step provides only token-wise marginal distributions, while unmasking multiple tokens simultaneously requires accounting for inter-token dependencies. We propose Dependency-Aware Parallel Decoding (DAPD), a simple, training-free decoding method that uses self-attention to induce a conditional dependency graph over masked tokens. At each iteration, edges in this...

arXiv CS 8d ago

Crystal Nights by Greg Egan

Publication history: Interzone #215, April 2008; Free podcast at Transmissions From Beyond [Site no longer active]; Oceanic (collection, Orion)...

Hacker News 8d ago

RiskFlow: Fast and Faithful Safety-Critical Traffic Scenario Generation

arXiv:2606.06423v1 Announce Type: new Abstract: Safety-critical traffic scenario generation is essential for evaluating autonomous driving systems under rare but high-risk interactions. Existing diffusion-based methods offer strong controllability in closed-loop generation, but their iterative denoising process is computationally expensive and may accumulate sampling and guidance errors over long rollouts, causing unrealistic motion artifacts such as jitter, abnormal acceleration, and...

arXiv CS 5d ago

Autopilot-Preserving Residual Q-Learning with HJB-Inspired Finite-Action Risk Filtering for Fixed-Wing UAV Command Supervision

Announce Type: new Abstract: A fixed-wing UAV must hold airspeed, altitude, and heading references under wind, gusts, and turbulence, channels coupled so that correcting one can degrade another. Classical autopilots stabilize the airframe well but adapt poorly when a hard crosswind meets an aggressive turn, while reinforcement-learning (RL) policies acting directly on the surfaces concentrate exploration risk at the actuator interface. We place a learned supervisor above an unchanged...

arXiv CS 8d ago

Mixture of Horizons in Action Chunking

Announce Type: replace Abstract: Vision-language-action (VLA) models have shown remarkable capabilities in robotic manipulation, but their performance is sensitive to the action chunk length used during training, termed horizon. Our empirical study reveals an inherent trade-off: longer horizons provide stronger global foresight but degrade fine-grained accuracy, while shorter ones sharpen local control yet struggle on long-term tasks, implying fixed choice of single...

arXiv CS 9d ago