Learning of Robot Safety Policies
Related Articles from SNS
Learning of Robot Safety Policies via Adversarial Synthetic Scenarios
Announce Type: new Abstract: In this work, we propose an agentic gamification framework for hazard-informed learning of robot safety policies through synthetic scenarios. We model scenario generation as an adversarial game between two agents: a Red Team that explores the space of potential failures by constructing hazardous situations, and a Blue Team that incrementally refines safety policies to prevent them. This iterative process enables efficient discovery of high-risk edge cases that...
COP-Q: Safety-First Reinforcement Learning for Robot Control via Cholesky-Ordered Projection
arXiv:2606.04749v1 Announce Type: new Abstract: Safe robot control requires maximizing return while satisfying safety constraints. In off-policy safe reinforcement learning, reward and safety Q-values are commonly learned by separate critic ensembles, with uncertainty handled independently for each objective. This objective-wise treatment neglects inter-objective correlation and can lead to overly conservative value estimates, thereby reducing sample efficiency.
Same Weights, Different Robot: A Deployment Safety View of VLA Policies
Announce Type: new Abstract: Vision-language-action (VLA) policies are often treated as checkpoint-defined objects: if the weights, prompt, and benchmark suite match, the deployment is assumed to be the same policy. Robot execution breaks this assumption because the same normalized model output can become a different physical action after action unnormalization and controller conventions are applied. This creates a deployment-safety gap: safety review can certify the checkpoint while missing...
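The unnormalization effect described in this abstract can be illustrated with a small sketch. It is not taken from the paper: the `unnormalize` helper, the `[-1, 1]` action convention, and the per-robot bounds are all hypothetical, chosen only to show how one normalized output can map to different physical commands.

```python
import numpy as np

def unnormalize(a_norm, low, high):
    """Map a normalized action in [-1, 1] to physical units in [low, high].

    Hypothetical helper: real deployments may use other conventions
    (mean/std scaling, quantile bounds, delta vs. absolute targets).
    """
    a_norm = np.asarray(a_norm, dtype=float)
    return low + 0.5 * (a_norm + 1.0) * (high - low)

# One normalized model output, identical for both deployments.
a_norm = np.array([0.5, -0.2])

# Assumed per-deployment action bounds (metres per control step).
robot_a = unnormalize(a_norm, low=np.array([-0.05, -0.05]), high=np.array([0.05, 0.05]))
robot_b = unnormalize(a_norm, low=np.array([-0.20, -0.20]), high=np.array([0.20, 0.20]))

print(robot_a)  # physical command under the first set of bounds
print(robot_b)  # same weights and output, four times larger motion
```

Under these assumed bounds the second deployment moves four times as far for the same checkpoint output, which is the kind of difference a checkpoint-only safety review would not see.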
Shield-Loco: Shielding Locomotion Policies with Predictive Safety Filtering
arXiv:2606.07193v1 Announce Type: new Abstract: Reinforcement learning (RL) policies enable dynamic legged locomotion but lack mechanisms to avoid violations of safety constraints that are absent during training. Large-scale offline safe learning is impractical for covering all edge cases. Existing safety frameworks either rely on reduced-order models that cannot reason about whole-body behaviors or require conservative recovery controllers that degrade task performance.
Latent Activation Editing: Inference-Time Refinement of Learned Policies for Safer Multirobot Navigation
arXiv:2509.20623v2 Announce Type: replace Abstract: Reinforcement learning has enabled significant progress in complex domains such as coordinating and navigating multiple quadrotors. However, even well-trained policies remain vulnerable to collisions in obstacle-rich environments. Addressing these infrequent but critical safety failures through retraining or fine-tuning is costly and risks degrading previously learned skills.
Safe-RULE: Safe Reinforcement UnLEarning
arXiv:2606.09559v1 Announce Type: new Abstract: Offline safe reinforcement learning (Safe RL) enables policy learning without online interactions, making it suitable for safety-critical systems such as robotic systems. However, its reliance on static datasets exposes offline Safe RL to data poisoning attacks, where adversaries inject malicious samples that compromise safety and induce unsafe policy behavior. In this work, we propose a new learning paradigm, named safe reinforcement...
Robust Koopman Control Barrier Filters for Safe Actor-Critic Reinforcement Learning
arXiv:2605.26452v2 Announce Type: replace Abstract: Safe reinforcement learning (RL) for robotic systems requires policies that improve task performance while satisfying state and input constraints during both training and deployment. Control barrier functions (CBFs) provide a principled mechanism for enforcing forward invariance through minimally invasive safety filters, but their use in model-free RL is limited by the need for accurate dynamics and hand-designed barrier certificates. We...
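As background for the "minimally invasive safety filter" mentioned in this abstract, here is a minimal sketch of a textbook control barrier function filter. It is not the paper's Koopman-based method: it assumes known toy dynamics (a 1D single integrator, x' = u), a hand-chosen barrier h(x) = x_max - x, and hypothetical values for `x_max` and `alpha`. For this case the CBF condition h' >= -alpha * h reduces to u <= alpha * (x_max - x), so the filtering quadratic program has a closed-form solution.

```python
def cbf_filter(u_nom, x, x_max=1.0, alpha=2.0):
    """Return the input closest to u_nom that satisfies the CBF condition.

    Assumed system: x' = u, barrier h(x) = x_max - x (safe when h >= 0).
    The condition dh/dt >= -alpha * h becomes u <= alpha * (x_max - x).
    """
    u_bound = alpha * (x_max - x)
    return min(u_nom, u_bound)

def simulate(x0=0.0, u_nom=1.5, dt=0.01, steps=500):
    """Roll out the filtered system with forward Euler and return the final state."""
    x = x0
    for _ in range(steps):
        x += dt * cbf_filter(u_nom, x)
    return x

final_x = simulate()
print(final_x)  # approaches x_max = 1.0 from below without crossing it
```

Far from the boundary the filter passes the nominal input through unchanged; near the boundary it scales the input down so the state stays inside the safe set. The abstract's point is that this construction needs accurate dynamics and a hand-designed barrier, both of which are simply assumed here.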
PerceptTwin: Semantic Scene Reconstruction for Iterative LLM Planning and Verification
arXiv:2606.04226v1 Announce Type: new Abstract: Simulation environments are useful for both robot policy learning and planning verification and validation. Traditionally, the process of creating a simulation was onerous. Creating a bespoke simulation environment for each individual environment that a robot would operate in was simply infeasible.
GSAM: A Generalizable and Safe Robotic Framework for Articulated Object Manipulation
Announce Type: new Abstract: Articulated object manipulation is a unique challenge for service robots. Existing methods employ end-to-end policy learning, vision-motion planning, and large-language/visual-language models (LLM/VLM), but often overlook the diversity of articulated objects and the complexity of interactions between end-effector and handle, leading to limited generalization and destructive collisions.
VLA-Arena: An Open-Source Framework for Benchmarking Vision-Language-Action Models
Announce Type: replace Abstract: While Vision-Language-Action models (VLAs) are rapidly advancing towards generalist robot policies, it remains difficult to quantitatively understand their limits and failure modes. To address this, we introduce a comprehensive benchmark called VLA-Arena. We propose a novel structured task design framework to quantify difficulty across three orthogonal axes: (1) Task Structure, (2) Language Command, and (3) Visual Observation.