Multi-Objective Reinforcement Learning
Related Articles from SNS
Constrained Multi-Objective Reinforcement Learning with Max-Min Criterion
arXiv:2605.31388v1. Multi-Objective Reinforcement Learning (MORL) extends standard RL by optimizing policies with respect to multiple, often conflicting, objectives. While max-min MORL has emerged as an effective approach for promoting fairness, its applicability remains limited, particularly when constraints must be incorporated. In this paper, we propose a MORL framework that integrates the max-min criterion with explicit constraint satisfaction.
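To make the max-min criterion with a constraint concrete, here is a minimal sketch. It is not the paper's method: it only selects among a small, already-evaluated set of candidate policies, and the names (`max_min_choice`, `returns`, `costs`, `cost_limit`) and all numbers are hypothetical.

```python
from typing import Dict, List, Optional


def max_min_choice(
    returns: Dict[str, List[float]],
    costs: Dict[str, float],
    cost_limit: float,
) -> Optional[str]:
    """Return the feasible candidate whose worst objective value is largest."""
    # Keep only candidates that satisfy the (assumed) cost constraint.
    feasible = [name for name in returns if costs[name] <= cost_limit]
    if not feasible:
        return None
    # Max-min: score each candidate by its weakest objective, then maximize.
    return max(feasible, key=lambda name: min(returns[name]))


# Hypothetical per-objective returns and constraint costs for three policies.
returns = {"a": [9.0, 1.0], "b": [5.0, 4.0], "c": [6.0, 6.0]}
costs = {"a": 0.1, "b": 0.2, "c": 0.9}

choice = max_min_choice(returns, costs, cost_limit=0.5)
```

With these numbers, policy "c" has the best worst-case return but violates the cost limit, so the constrained max-min choice falls back to "b".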
VentAgent: When LLMs Learn to Breathe -- Multi-Objective Arbitration for ARDS Ventilation
Mechanical ventilation for Acute Respiratory Distress Syndrome (ARDS) requires balancing competing physiological goals, including oxygenation, lung protection, and acid-base homeostasis. However, current data-driven methods, especially those imitating retrospective Electronic Health Records (EHR), often suffer from imitation bias. They may capture superficial correlations from inconsistent clinical demonstrations, such as associating passive ventilator settings...
Multi-Objective Reinforcement Learning for Tactical Decision Making for Trucks in Highway Traffic
arXiv:2601.18783v2. Balancing safety, efficiency, and operational costs in highway driving poses a challenging decision-making problem for heavy-duty vehicles. A central difficulty is that conventional scalar reward formulations, obtained by aggregating these competing objectives, often obscure the structure of their trade-offs. We present a Proximal Policy Optimization based multi-objective reinforcement learning framework that learns a set of policies...
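A small illustration of how a scalar aggregate can hide a trade-off: two quite different behaviours receive the same weighted-sum score. This is only a sketch; the weights, the objective ordering, and the values are made up and do not come from the paper.

```python
from typing import Sequence


def weighted_sum(values: Sequence[float], weights: Sequence[float]) -> float:
    """Collapse a vector of per-objective returns into one scalar."""
    return sum(w * v for w, v in zip(weights, values))


# Hypothetical objective order: (safety, efficiency, operational cost score).
weights = [0.5, 0.25, 0.25]
cautious = [8.0, 2.0, 2.0]    # strong on safety, weak elsewhere
aggressive = [2.0, 8.0, 8.0]  # weak on safety, strong elsewhere

score_cautious = weighted_sum(cautious, weights)
score_aggressive = weighted_sum(aggressive, weights)
```

Both scores come out equal here, so the scalar alone cannot distinguish the two behaviours; keeping the return vectors preserves that information.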
One Model, Multiple Goals: Adaptive Multi-Objective Learning for E-commerce Dialogue Systems
arXiv:2606.09293v1. Dialogue systems in e-commerce scenarios often need to satisfy multiple objectives: accurately reasoning over user profiles (e.g., eligibility, credit limit) to ensure correct decision-making and user state interpretation, while also generating natural and faithful responses. These goals are complementary but not identical. In this work, we propose MORE, an adaptive Multi-Objective REinforcement learning framework that jointly optimizes...
From Global Policies to Local Strategies: Multi-Objective Optimization of Resource-Specific Handover Policies
arXiv:2606.01857v1. Efficient resource allocation is a key challenge in business process management, with direct implications for cost, throughput time, and utilization. While recent Reinforcement Learning (RL) approaches have shown promise in deriving adaptive allocation policies, they typically neglect inter-resource collaboration patterns that can strongly influence real-world task handovers. Recognizing this, this paper introduces the first approach for...
MoDex: A Diffusion Policy for Sequential Multi-Object Dexterous Grasping
arXiv:2606.05407v1. This work addresses sequentially grasping multiple objects with a single dexterous hand without releasing those already held. Most dexterous grasping methods commit all of the hand's degrees of freedom to a single object, underutilizing its dexterity and leaving no redundancy for subsequent grasps. The proposed solution, MoDex, is a diffusion policy that predicts the next gripper pose directly from observations, conditioned on an opposition...
Population-Free Pareto Tracking for Sample-Efficient Multi-Policy MORL
Multi-objective reinforcement learning (MORL) is a fundamental framework for real-world decision-making problems involving multiple conflicting criteria. Existing multi-policy (MP) methods typically rely on online evolutionary frameworks that maintain large policy populations, leading to high sample complexity and excessive agent-environment interactions. To mitigate these limitations, we present Multi-policy Pareto Front Tracking (MPFT), a framework without...
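For readers unfamiliar with the term, a Pareto front is the set of return vectors that no other vector dominates. The sketch below is a plain non-dominated filter over already-evaluated returns, assuming every objective is maximized. It is not MPFT, and the function names and sample points are hypothetical.

```python
from typing import List, Sequence, Tuple


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is at least as good as b everywhere and strictly better somewhere."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(points: List[Tuple[float, ...]]) -> List[Tuple[float, ...]]:
    """Keep the points that no other point dominates (input order preserved)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical two-objective returns for five policies.
points = [(1.0, 5.0), (2.0, 4.0), (2.0, 2.0), (4.0, 1.0), (0.5, 0.5)]
front = pareto_front(points)
```

The filter compares every pair, so it is quadratic in the number of points, which is fine for small candidate sets like this one.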
Adversarial Instance Generation and Robust Training for Neural Combinatorial Optimization with Multiple Objectives
arXiv:2601.01665v2. Deep reinforcement learning (DRL) has shown great promise in addressing multi-objective combinatorial optimization problems (MOCOPs). Nevertheless, the robustness of these learning-based solvers has remained insufficiently explored, especially across diverse and complex problem distributions. In this paper, we propose a unified robustness-oriented framework for preference-conditioned DRL solvers for MOCOPs.
Fine-Tuning Diffusion Models for Molecular Generation via Reinforcement Learning and Fast Sampling
arXiv:2606.01220v1. Generating molecules that simultaneously satisfy drug-like properties and conform to the 3D structure of a target protein is a core challenge in structure-based drug design (SBDD). Existing generative approaches, however, often rely on costly post-hoc processing during sampling or require carefully curated datasets during training, yet still achieve modest gains. These limitations are especially pronounced in multi-objective settings, where...
ScenePilot: Controllable Boundary-Driven Critical Scenario Generation for Autonomous Driving
Safety-critical scenarios are central to evaluating autonomous driving systems, yet their rarity in naturalistic logs makes simulation-based stress testing indispensable. Most scenario generation methods treat surrounding agents as adversaries, but they either (i) induce failures without explicitly modeling vehicle-road physical limits, yielding visually extreme yet physically unsolvable crashes, or (ii) enforce physical feasibility or policy feasibility in...