Clean Task Success Rate

No mentions found

This entity hasn't been tracked yet, or Iris is still building its knowledge base.

Related Articles from SNS

Capability and Robustness Cannot Both Be Free: An Information-Theoretic Bound for Vision-Language-Action Models

arXiv:2605.25889v4 Announce Type: replace Abstract: Vision-Language-Action (VLA) models reach high success rates on clean inputs but collapse under small adversarial perturbations: a $16/255$ PGD attack drops OpenVLA-7B's LIBERO success from $95\%$ to under $5\%$. Whether this trade-off has a theoretical floor was open. We prove that it does. For any VLA policy, capability $I(A^{\star};A_{\pi})$ and robustness $I(A_{\pi};\tilde{A}_{\pi})-I(A_{\pi};\delta)$ sum to at most $H(A^{\star})+I(X;\tilde{X})$, the task...

arXiv CS 8d ago

SilentDrift: Exploiting Action Chunking for Stealthy Backdoor Attacks on Vision-Language-Action Models

arXiv:2601.14323v2 Announce Type: replace Abstract: Vision-Language-Action (VLA) models are increasingly deployed in safety-critical robotic applications, yet their security vulnerabilities remain underexplored. We identify a fundamental security flaw in modern VLA systems: the combination of action chunking and delta pose representations creates an intra-chunk visual open-loop. This mechanism forces the robot to execute K-step action sequences, allowing per-step perturbations to accumulate...

arXiv CS 8d ago

World-Language-Action Model for Unified World Modeling, Language Reasoning, and Action Synthesis

arXiv:2606.05979v1 Announce Type: new Abstract: We propose world-language-action (WLA) models as a new class of embodied foundation models. WLA takes textual instructions, images, and robot states as inputs to jointly predict textual subtasks, subgoal images, and robot actions, conjoining the world modeling interface to learn from extensive egocentric videos as in the world-action model (WAM) and the language reasoning capacities to solve complex long-horizon tasks as in...

arXiv CS 5d ago

POISE: Position-Aware Undetectable Skill Injection on LLM Agents

arXiv:2606.07943v1 Announce Type: new Abstract: Agent skills provide a lightweight mechanism for extending general-purpose agents, but their open format exposes them to skill-poisoning attacks. A practically dangerous injection must stay invisible: if executing the payload derails the user's legitimate task, the resulting failure signal invites inspection of the skill. We therefore evaluate attacks by Attack Success Rate, which requires the injected payload to execute and the user's task to...

arXiv CS 1d ago

When AI Builds Itself: Our progress toward recursive self-improvement

For most of AI’s history, humans drove every step in its development cycle. But at Anthropic, we are delegating a growing share of AI development to AI systems themselves, which is speeding up our work. Taken far enough, and given enough compute, that trend points to an AI system capable of fully autonomously designing and developing its own successor.

Hacker News 6d ago

Ranking the 21 best U21 players at the 2026 World ...

The FIFA World Cup is the biggest tournament in sport, and it's a great place to show off your talent as a young player! The list of teenagers to have made their breakthrough at a World Cup is long, and includes the likes of Pelé, Kylian Mbappé, Michael Owen and Thomas Müller. But who might do so this summer?

ESPN 8d ago

RDGen: Demonstration Generation for High-Quality Robot Learning via Reinforcement Learning

arXiv:2605.30957v1 Announce Type: new Abstract: Vision-Language-Action (VLA) models have emerged as a promising paradigm for general-purpose robot control. However, their performance remains fundamentally constrained by the availability of high-quality robot trajectory data. In current robot learning practice, such data are primarily collected through human teleoperation, which is labor-intensive, costly, and difficult to scale.

arXiv CS 9d ago

State Backdoor: Towards Stealthy Real-world Poisoning Attack on Vision-Language-Action Model in State Space

arXiv:2601.04266v2 Announce Type: replace Abstract: Vision-Language-Action (VLA) models are widely deployed in safety-critical embodied AI applications such as robotics. However, their complex multimodal interactions also expose new security vulnerabilities. In this paper, we investigate a backdoor threat in VLA models, where malicious inputs cause targeted misbehavior while preserving performance on clean data.

arXiv CS 1d ago

How back is the U? Can Clemson rebound? Previewing...

Life in the ACC certainly isn't boring. In the past year alone, the conference has produced a long and awkward CFP rankings battle, an irate affiliate member, a thrilling national title game run, the strangest tiebreaker result imaginable, an out-of-nowhere 11-win season, the most disappointing team in the country, an epic pro-to-college face-plant, 18 of the 38 best games of the 2025 season, the No. 1 pick in the NFL draft (indirectly) and the most awkward possible move to nine-game...

ESPN 7d ago

Hybrid Adversarial Defence for Natural Language Understanding Tasks

Abstract: Large Language Models (LLMs) are vulnerable both to hallucination and adversarial manipulation. Although these problems are closely related, existing defences typically address them separately. We investigate a hybrid defence framework that combines entropy-based models, designed to reduce hallucinations, with uncertainty-based models and geometric-based models, designed to reduce vulnerability.

arXiv CS 6d ago