Better Decision-Making Agents

Related Articles from SNS

Post-Training LLMs as Better Decision-Making Agents: A Regret-Minimization Approach

Abstract: Large language models (LLMs) are increasingly deployed as "agents" for decision-making (DM) in interactive and dynamic environments. Yet, since they were not originally designed for DM, recent studies show that LLMs can struggle even in basic online DM problems, failing to achieve low regret or an effective exploration-exploitation tradeoff. To address this, we introduce Iterative Regret-Minimization Fine-Tuning (Iterative RMFT), a post-training procedure...

arXiv CS 9d ago

ClinicalMC: A Benchmark for Multi-Course Clinical Decision-Making with Large Language Models

Abstract: Large language models (LLMs) have been widely adopted in healthcare, yet they still encounter significant challenges in complex clinical decision-making scenarios. Existing benchmarks primarily assess LLM performance in single-course settings and lack systematic evaluation in multi-course scenarios, where a patient's condition evolves over time.

arXiv CS 7d ago

PSG-Nav: Probabilistic Scene Graph Navigation via Multiverse Decision Making

arXiv:2606.01313v1. Abstract: Open-vocabulary navigation requires embodied agents to manage significant perception uncertainty stemming from semantic ambiguity and model errors. However, most existing works settle for local optimal deterministic approaches, depriving complex navigation decision-making over multiple composite possibilities that are critical for globally better solutions. In this paper, we propose Probabilistic Scene Graph Navigation (PSG-Nav), which...

arXiv CS 8d ago

Physicist Richard Feynman's forgotten notes on 'the restaurant problem' finally deciphered after 50 years

Researchers cracked a 50-year-old math problem scribbled by Richard Feynman over lunch. The equations show that humans are better decision-makers than scientists once thought. It started with a plate of ginger chicken.

Live Science 1d ago

Path Planning Using Deep Deterministic Policy Gradient: A Reinforcement Learning Approach

arXiv:2606.07855v1. Abstract: Path-planning for autonomous vehicles in threat-laden environments is a fundamental challenge because the problem is nonlinear and nonconvex even in simplest scenarios. While traditional optimal control methods can be used to find ideal paths, the computational time is often too slow for real-time decision-making. To solve this challenge, we propose a method based on Deep Deterministic Policy Gradient (DDPG) and model the threat as possibly...

arXiv CS 1d ago

One Model, Multiple Goals: Adaptive Multi-Objective Learning for E-commerce Dialogue Systems

arXiv:2606.09293v1. Abstract: Dialogue systems in e-commerce scenarios often need to satisfy multiple objectives: accurately reasoning over user profiles (e.g., eligibility, credit limit) to ensure correct decision-making and user state interpretation, while also generating natural and faithful responses. These goals are complementary but not identical. In this work, we propose MORE, an adaptive Multi-Objective REinforcement learning framework that jointly optimizes...

arXiv CS 1d ago

Microsoft’s AI chief says superintelligence is near, but won’t take your job

Today I’m talking with Mustafa Suleyman, the CEO of Microsoft AI. And I’m actually going to keep today’s intro short — I’m working from my wife’s family farm this week, as you’ll see in the video, but also this is a real burner of an episode. We covered everything from Mustafa’s approach to training new models to his criticisms of Anthropic talking about Claude as though it is conscious.

The Verge 2d ago

The Epi-LLM Framework: probing LLM behavioral priors through epidemiological agent-based models

arXiv:2606.02867v1. Abstract: Human behaviour during epidemics affects infectious disease dynamics, but quantifying this remains deeply challenging. Here we introduce the Epi-LLM framework: a novel integration of agent-based modelling, real-life epigames, and large language models (LLMs) in which a synthetic society of agents reasons and adapts dynamically over an outbreak contact network. Comparing synthetic agent behaviour against a no-intervention SEIR baseline and human...

arXiv CS 7d ago

COMAP: Co-Evolving World Models and Agent Policies for LLM Agents

Abstract: Equipping language agents with world models enables them to anticipate environment dynamics and evaluate candidate actions before execution. However, existing textual world models are typically fixed after training, preventing them from adapting to the on-policy state-action distributions induced by an evolving agent. Meanwhile, agent-improvement methods often rely on external rewards or verifiers, limiting their applicability in realistic interactive environments.

arXiv CS 8d ago

The back-channel bid to go soft on Maduro

When Marco Rubio was named secretary of State, many in both South Florida Republican circles and the American energy industry exulted. But one man who bridged both worlds knew he had a problem. A longtime investor in Venezuela, the main source of crude oil needed to produce the asphalt that had made his family rich, Harry Sargeant III kept relations with top officials in Caracas even as they seized most foreign oil holdings.

Politico EU 2d ago