Demonstration

Related Articles from SNS

RPO-PDT: Demonstrating Role-Play-Based Knowledge Adaptation for Student Support Dialogue (Demonstration System)

arXiv:2606.09255v1 Announce Type: new Abstract: We present RPO-PDT: a retrieval-grounded, role-play-based dialogue system for adaptive student support in higher education. RPO-PDT is: (1) able to provide institution-specific Personal Development Tutor (PDT) guidance using structured knowledge sources; (2) constrained by explicit persona, boundary, confidentiality, and safety policies; and (3) designed around a reverse-roleplay loop where unresolved interactions are replayed from the student...

arXiv CS 1d ago

Inverse Reinforcement Learning without an Optimal Demonstrator: A Feasible Reward Set Approach

arXiv:2605.30903v1 Announce Type: new Abstract: Inverse reinforcement learning (IRL) typically assumes demonstrations from a single optimal demonstrator, but in many applications data come from multiple imperfect demonstrators with heterogeneous suboptimality levels. We study reward learning in this setting through a feasible-reward-set framework: for each demonstrator, we encode its declared suboptimality level as a linear constraint and intersect the resulting feasible sets across...

arXiv CS 9d ago

Auditing Demonstration Curation Metrics: Action-Only Scorers Fail on the Structural Defects That Degrade Imitation Policies

arXiv:2606.05588v1 Announce Type: new Abstract: Imitation-learning policies inherit the quality of the demonstrations they are trained on, and a growing set of curation metrics promise to score and filter low-quality demonstrations automatically. These metrics are each validated on different data with different protocols, so it is unclear which of them actually identify the demonstrations that harm a policy. We build a controlled testbed in which demonstration defects are injected with known...

arXiv CS 5d ago

ReGIL: Retrieval-Guided Imitation Learning from a Single Demonstration

arXiv:2606.09381v1 Announce Type: new Abstract: Learning robot manipulation policies with deep neural networks from a single demonstration remains highly challenging, as even small deviations from the demonstrated trajectory can quickly compound into failure, while collecting substantial online interaction data is costly. We propose ReGIL, a retrieval-guided imitation learning framework that treats a single demonstration as an external memory. ReGIL repeatedly queries this static memory...

arXiv CS 1d ago

LLM Trainer: Automated Robotic Data Generation via Demonstration Augmentation using LLMs

arXiv:2509.20070v2 Announce Type: replace Abstract: We present LLM Trainer, a fully automated pipeline that leverages the world knowledge of Large Language Models (LLMs) to transform a small number of human demonstrations (as few as one) into a large robot dataset for imitation learning. Our approach decomposes demonstration generation into two steps: (1) offline demonstration annotation that extracts keyframes, salient objects, and pose-object relations; and (2) online keypose retargeting...

arXiv CS 8d ago

VOLT: Vision and Language Trajectory Segmentation for Faster-than-Demonstration Policies

arXiv:2606.06323v1 Announce Type: new Abstract: Humans often take longer to demonstrate a task than a robot would need to execute it. Rather than learning to replicate the demonstration at the same pace, many industrial and practical applications require robots to perform tasks as quickly as possible. In this paper, we investigate several hypotheses for learning policies that operate faster-than-demonstrations.

arXiv CS 5d ago

Ultra-Orthodox Jewish demonstrators storm Israeli police station

Ultra-Orthodox Jewish demonstrators stormed a police station in Beit Shemesh, Israel, to protest the arrest of a man who abandoned military service. Israeli police used sound bombs and tear gas to disperse the crowd. Published On 1 Jun 2026

Al Jazeera 9d ago

EaDex: A Cross-Embodiment Dexterous Manipulation Framework from Low-Cost Demonstrations

arXiv:2606.03268v1 Announce Type: new Abstract: Dexterous manipulation learning has long been hindered by the high costs of data and training, as pure reinforcement learning typically requires large-scale interactive exploration and imitation learning depends on high-quality demonstrations that are expensive to collect. To address this problem, we propose EaDex, a multi-embodiment dexterous manipulation learning framework under low-cost demonstration conditions, which enables rapid...

arXiv CS 7d ago

Good Reasoning Makes Good Demonstrations: Implicit Reasoning Quality Supervision via In-Context Reinforcement Learning

arXiv:2603.09803v2 Announce Type: replace Abstract: Reinforcement Learning with Verifiable Rewards (RLVR) improves reasoning in large language models but treats all correct solutions equally, potentially reinforcing flawed traces that arrive at correct answers by chance. We observe that better reasoning makes better demonstrations: high-quality solutions serve as more effective in-context examples than low-quality ones. We term this teaching ability Demonstration Utility, and...

arXiv CS 6d ago

Escaping the Verifier: Learning to Reason via Demonstrations

arXiv:2511.21667v4 Announce Type: replace Abstract: Training Large Language Models (LLMs) to reason often relies on Reinforcement Learning (RL) with task-specific verifiers. However, many real-world reasoning-intensive tasks lack verifiers, despite offering abundant expert demonstrations that remain under-utilized for reasoning-focused training. We introduce RARO (Relativistic Adversarial Reasoning Optimization), which learns strong reasoning capabilities from expert demonstrations alone via...

arXiv CS 5d ago