Clean Task Success Rate
Related Articles from SNS
Capability and Robustness Cannot Both Be Free: An Information-Theoretic Bound for Vision-Language-Action Models
arXiv:2605.25889v4 Announce Type: replace Abstract: Vision-Language-Action (VLA) models reach high success rates on clean inputs but collapse under small adversarial perturbations: a $16/255$ PGD attack drops OpenVLA-7B's LIBERO success from $95\%$ to under $5\%$. Whether this trade-off has a theoretical floor was open. We prove that it does. For any VLA policy, capability $I(A^\star;A_\pi)$ and robustness $I(A_\pi;\tilde{A}_\pi)-I(A_\pi;\delta)$ sum to at most $H(A^\star)+I(X;\tilde{X})$, the task...
SilentDrift: Exploiting Action Chunking for Stealthy Backdoor Attacks on Vision-Language-Action Models
arXiv:2601.14323v2 Announce Type: replace Abstract: Vision-Language-Action (VLA) models are increasingly deployed in safety-critical robotic applications, yet their security vulnerabilities remain underexplored. We identify a fundamental security flaw in modern VLA systems: the combination of action chunking and delta pose representations creates an intra-chunk visual open-loop. This mechanism forces the robot to execute K-step action sequences, allowing per-step perturbations to accumulate...
World-Language-Action Model for Unified World Modeling, Language Reasoning, and Action Synthesis
arXiv:2606.05979v1 Announce Type: new Abstract: We propose world-language-action (WLA) models as a new class of embodied foundation models. WLA takes textual instructions, images, and robot states as inputs to jointly predict textual subtasks, subgoal images, and robot actions, conjoining the \emph{world modeling interface} to learn from extensive egocentric videos as in the world-action model (WAM) and the \emph{language reasoning} capacities to solve complex long-horizon tasks as in...
POISE: Position-Aware Undetectable Skill Injection on LLM Agents
arXiv:2606.07943v1 Announce Type: new Abstract: Agent skills provide a lightweight mechanism for extending general-purpose agents, but their open format exposes them to skill-poisoning attacks. A practically dangerous injection must stay invisible: if executing the payload derails the user's legitimate task, the resulting failure signal invites inspection of the skill. We therefore evaluate attacks by Attack Success Rate, which requires the injected payload to execute and the user's task to...
When AI Builds Itself: Our progress toward recursive self-improvement
For most of AI’s history, humans drove every step in its development cycle. But at Anthropic, we are delegating a growing share of AI development to AI systems themselves, which is speeding up our work. Taken far enough, and given enough compute, that trend points to an AI system capable of fully autonomously designing and developing its own successor.
Ranking the 21 best U21 players at the 2026 World ...
The FIFA World Cup is the biggest tournament in sport, and it's a great place to show off your talent as a young player! The list of teenagers to have made their breakthrough at a World Cup is long, and includes the likes of Pelé, Kylian Mbappé, Michael Owen and Thomas Müller. But who might do so this summer?
RDGen: Demonstration Generation for High-Quality Robot Learning via Reinforcement Learning
arXiv:2605.30957v1 Announce Type: new Abstract: Vision-Language-Action (VLA) models have emerged as a promising paradigm for general-purpose robot control. However, their performance remains fundamentally constrained by the availability of high-quality robot trajectory data. In current robot learning practice, such data are primarily collected through human teleoperation, which is labor-intensive, costly, and difficult to scale.
State Backdoor: Towards Stealthy Real-world Poisoning Attack on Vision-Language-Action Model in State Space
arXiv:2601.04266v2 Announce Type: replace Abstract: Vision-Language-Action (VLA) models are widely deployed in safety-critical embodied AI applications such as robotics. However, their complex multimodal interactions also expose new security vulnerabilities. In this paper, we investigate a backdoor threat in VLA models, where malicious inputs cause targeted misbehavior while preserving performance on clean data.
How back is the U? Can Clemson rebound? Previewing...
Life in the ACC certainly isn't boring. In the past year alone, the conference has produced a long and awkward CFP rankings battle, an irate affiliate member, a thrilling national title game run, the strangest tiebreaker result imaginable, an out-of-nowhere 11-win season, the most disappointing team in the country, an epic pro-to-college face-plant, 18 of the 38 best games of the 2025 season, the No. 1 pick in the NFL draft (indirectly) and the most awkward possible move to nine-game...
Hybrid Adversarial Defence for Natural Language Understanding Tasks
Abstract: Large Language Models (LLMs) are vulnerable both to hallucination and adversarial manipulation. Although these problems are closely related, existing defences typically address them separately. We investigate a hybrid defence framework that combines entropy-based models, designed to reduce hallucinations, with uncertainty-based models and geometric-based models, designed to reduce vulnerability.