The Granularity Gap

Related Articles from SNS

The Granularity Gap: A Multi-Dimensional Longitudinal Audit of Sycophancy in Gemini Models

arXiv:2606.05183v1 Announce Type: new Abstract: Large language models are increasingly deployed as high-stakes advisors, yet standard alignment benchmarks treat sycophancy as a binary failure mode. We introduce the Granularity Gap: coarse binary metrics mask substantial social-compliance behaviors where models capitulate to user framing, validate questionable premises, or soften factual corrections without producing overtly false outputs. We evaluate six Gemini variants across generations...

arXiv CS 5d ago

The Impact of Temporal Granularity on Socio-Demographic Inference from Household Load Profiles

Announce Type: new Abstract: Smart meter data can reveal sensitive socio-demographic characteristics of households, raising privacy concerns. While this risk has been demonstrated at fixed granularities, the role of temporal resolution in shaping inference performance remains insufficiently explored. This paper addresses this gap by analyzing how load profiles with granularities from 15 minutes to 7 days affect the predictability of eight socio-demographic attributes in a dataset of 1,589...

arXiv CS 7d ago

Asymmetric Stream Allocation and Linear Decodability in MIMO Coded Caching

Announce Type: replace Abstract: Coded caching (CC) can transform cache memory at network devices into an active communication resource and significantly enhance the Degrees of Freedom (DoF) of multi-input multi-output (MIMO) systems by jointly exploiting global caching and spatial multiplexing gains. Existing linearly decodable MIMO-CC designs, however, largely rely on symmetric stream allocation, where all scheduled users receive the same number of streams, which induces coarse DoF...

arXiv CS 5d ago

Training One Model to Master Cross-Level Agentic Actions via Reinforcement Learning

Announce Type: replace Abstract: The paradigm of agentic AI is shifting from engineered complex workflows to post-training native models. However, existing agents are typically confined to static, predefined action spaces, such as exclusively using APIs, GUI events, or robotic commands. This rigidity limits their adaptability in dynamic environments where the optimal granularity of interaction varies contextually.

arXiv CS 5d ago

NextMotionQA: Benchmarking and Judging Human Motion Understanding with Vision-Language Models

arXiv:2606.04773v1 Announce Type: new Abstract: Reliable evaluation of human motion understanding is fundamental to advancing embodied AI, robotics, and animation. However, existing benchmarks suffer from coarse semantic granularity, undifferentiated difficulty, limited annotation quality, and pervasive answer ambiguity, leaving them unable to diagnose where current models fail. To bridge this gap, we introduce NextMotionQA, a comprehensive benchmark that leverages vision-language models...

arXiv CS 6d ago

AGENTSERVESIM: A Hardware-aware Simulator for Multi-Turn LLM Agent Serving

Announce Type: new Abstract: Multi-turn LLM agents interleave model calls with external tool invocations, shifting serving from stateless request processing to stateful program execution. Serving these workloads requires scheduling, KV-cache management, and routing policies that use program-level context, including turn dependencies, tool-induced gaps, and reusable KV state. Evaluating such policies directly on real systems is costly, since each design point may require dedicated accelerator...

arXiv CS 1d ago

SagaQA: A Multi-hop Reasoning Benchmark for Long-form Narrative Understanding in TV Series

Announce Type: new Abstract: We introduce SagaQA, a long-form video benchmark for multi-hop reasoning over full-length TV series. Existing video reasoning benchmarks often emphasize local understanding of adjacent frames or clips. SagaQA addresses this gap by requiring high-level comprehension of extended multimodal narratives in entire TV shows.

arXiv CS 7d ago

MAVIS: Multi-Agent Video Retrieval via Structured Video Understanding

arXiv:2606.09641v1 Announce Type: new Abstract: The dominant paradigm in video retrieval relies on embedding-based full-corpus scanning, which suffers from inherent computational inefficiency and the semantic asymmetry between information-dense videos and sparse textual queries. To bridge this gap, we introduce MAVIS, a novel multi-agent framework that rethinks retrieval as cooperative reasoning rather than brute-force search. MAVIS first bridges the granularity mismatch by parsing...

arXiv CS 1d ago

StepPO: Step-Aligned Policy Optimization for Agentic Reinforcement Learning

arXiv:2604.18401v3 Announce Type: replace Abstract: Agentic reinforcement learning (RL) is emerging as a critical post-training paradigm for improving LLM agent capabilities. Existing RL algorithms for LLMs largely follow the token-centric paradigm as in RLHF and RLVR, where tokens serve as the basic units for modeling and optimization. However, this paradigm introduces a granularity mismatch in agentic RL, as it optimizes token-level predictions while LLM agents make step-level decisions...

arXiv CS 2d ago
