Home › Knowledge Base › R1

R1

No mentions found

This entity hasn't been tracked yet, or Iris is still building its knowledge base.

Related Articles from SNS

Plan-R1: Safe and Feasible Trajectory Planning as Language Modeling

Announce Type: replace Abstract: Safe and feasible trajectory planning is critical for real-world autonomous driving systems. However, existing learning-based planners rely heavily on expert demonstrations, which not only lack explicit safety awareness but also risk inheriting undesirable behaviors such as speeding from suboptimal human driving data. Inspired by the success of large language models, we propose Plan-R1, a two-stage trajectory planning framework that decouples principle...

arXiv CS 8d ago

Graph-R1: Towards Agentic GraphRAG Framework via End-to-end Reinforcement Learning

arXiv:2507.21892v2 Announce Type: replace Abstract: Retrieval-Augmented Generation (RAG) mitigates hallucination in LLMs by incorporating external knowledge, but relies on chunk-based retrieval that lacks structural semantics. GraphRAG methods improve RAG by modeling knowledge as entity-relation graphs, but still face challenges in high construction cost, fixed one-time retrieval, and reliance on long-context reasoning and prompt design. To address these challenges, we propose Graph-R1, the...

arXiv CS 6d ago

Plan-R1: Safe and Feasible Trajectory Planning as Language Modeling

arXiv CS 7d ago

A Comprehensive Anatomy of Human and DeepSeek-R1 LLM Mathematical Reasoning

new Abstract: The emergence of "Aha moments" in large language models, particularly DeepSeek-R1-0120, has raised the question of whether these systems genuinely reason or merely imitate the appearance of reasoning. We conduct a comprehensive empirical comparison between model and human reasoning across all 30 problems from AIME 2025, exhaustively annotating 10,247 reasoning steps into five functional categories: Analysis, Inference, Branch, Backtrace, and Reflection. We find a clear...

arXiv CS 2d ago

QueryAgent-R1: Bridging Query Generation and Product Retrieval for E-Commerce Query Recommendation

arXiv:2606.05671v1 Announce Type: new Abstract: Query recommendation in e-commerce search aims to proactively suggest queries that match users' potential interests. However, existing methods mainly optimize query-level relevance, while neglecting whether the retrieved products align with users' downstream preferences. This mismatch often leads to high query click through rates (CTR) but low product conversion rates (CVR).

arXiv CS 5d ago

Claw-R1: A Step-Level Data Middleware System for Agentic Reinforcement Learning

arXiv:2606.09138v1 Announce Type: new Abstract: Agentic reinforcement learning (RL) has become an important post-training paradigm for turning LLMs from static chatbots into interactive agents, giving rise to representative applications such as OpenClaw. Existing work mainly focuses on policy optimization algorithms and training frameworks, but pays less attention to the full data lifecycle of agent-environment interactions, from data production to training consumption. To bridge this gap,...

arXiv CS 1d ago

Food-R1: A Unified Multi-Task Food Vision-Language Model with Reinforcement Learning

arXiv:2606.04986v1 Announce Type: new Abstract: Recent studies have explored Vision-Language Models (VLMs) for food analysis. However, most existing methods rely primarily on supervised fine-tuning (SFT), which often limits reasoning and generalization capabilities. Moreover, high-quality large-scale nutritional annotations remain scarce.

arXiv CS 6d ago

KBQA-R1: Reinforcing Large Language Models for Knowledge Base Question Answering

arXiv:2512.10999v3 Announce Type: replace Abstract: Knowledge Base Question Answering (KBQA) challenges models to bridge the gap between natural language and strict knowledge graph schemas by generating executable logical forms. While Large Language Models (LLMs) have advanced this field, current approaches often struggle with a dichotomy of failure: they either generate hallucinated queries without verifying schema existence or exhibit rigid, template-based reasoning that mimics synthesized...

arXiv CS 7d ago

ClinTutor-R1: Advancing Scalable and Robust One-to-Many Alignment in Clinical Socratic Education

arXiv:2512.05671v2 Announce Type: replace Abstract: While Large Language Models (LLMs) have achieved remarkable success in dyadic (one-on-one) instruction, they face significant challenges in One-to-Many alignment, such as clinical ward rounds, where an instructor must simultaneously guide a diverse group of trainees. Current models often suffer from context dilution and goal misalignment, failing to balance individual scaffolding with collective learning progress. To address this, we...

arXiv CS 8d ago

Facial-R1: Aligning Reasoning and Recognition for Facial Emotion Analysis

arXiv:2511.10254v2 Announce Type: replace Abstract: Facial Emotion Analysis (FEA) extends traditional facial emotion recognition by incorporating explainable, fine-grained reasoning. The task integrates three subtasks: emotion recognition, facial Action Unit (AU) recognition, and AU-based emotion reasoning to model affective states jointly. While recent approaches leverage Vision-Language Models (VLMs) and achieve promising results, they face two critical limitations: (1) hallucinated...

arXiv CS 5d ago