Information Retention Score

No mentions found

This entity hasn't been tracked yet, or Iris is still building its knowledge base.

Related Articles from SNS

PersonaTree: Structured Lifecycle Memory for Person Understanding in LLM Agents

arXiv:2606.04780v1 Announce Type: new Abstract: Persistent LLM agents require memory representations that make the formation of person understanding explicit across long-term interaction. Existing agent memory methods emphasize information retention and retrieval, yet give limited account of how accumulated interaction evidence is abstracted into person understanding. We view this process as schema formation, where situated evidence is abstracted into reusable patterns and stable person...

arXiv CS 6d ago

NormEval: A Unified Multi-Metric Framework for Evaluating Semantic Fidelity in Text Normalization

arXiv:2511.20409v2 Announce Type: replace Abstract: Text normalization methods such as stemming and lemmatization are fundamental components of NLP pipelines. As new normalization tools are developed for diverse languages, evaluation methodologies remain fragmented, relying on Compression Ratio, downstream accuracy, or sequence-to-sequence prediction scores in isolation, failing to distinguish between beneficial vocabulary reduction and harmful semantic distortion.

arXiv CS 8d ago

When Meaning Travels: A Granular Lens on Hybrid-MoE's Role in Idiomatic Understanding for Language Models

arXiv:2606.01671v1 Announce Type: new Abstract: In the contemporary epoch of multilingual education, learning idioms provides a fascinating gateway towards creativity, cultural values, historical context, and diverse perspectives inherent to various linguistic traditions. This paper showcases the navigation of retaining figurative and cultural semantics in low-resource Southeast Asian languages such as Hindi, Bengali, and Thai, where culturally rich idioms pose significant obstacles for...

arXiv CS 8d ago

Any2Poster: Any-Source Poster Generation Across Modalities and Domains

Announce Type: new Abstract: Visual posters are a compact medium for communicating dense information, yet progress on automatic poster generation remains difficult to measure because existing evaluations are often restricted to paper-only inputs, narrow domains, or surface-level visual similarity. We introduce Any2Poster Bench, a benchmark for any-source poster generation that evaluates systems across eight input modalities--PDFs, URLs, PPTX, DOCX, Markdown, LaTeX, notebooks, and videos--and...

arXiv CS 7d ago

Anatomy of Unlearning: The Dual Impact of Fact Salience and Model Fine-Tuning

Announce Type: replace Abstract: Machine Unlearning (MU) enables Large Language Models (LLMs) to remove unsafe or outdated information. However, existing work assumes that all facts are equally forgettable and largely ignores whether the forgotten knowledge originates from pretraining or supervised fine-tuning (SFT). In this paper, we introduce DUET (Dual Unlearning Evaluation across Training Stages), a benchmark of 28.6k Wikidata-derived triplets annotated with fact popularity using...

arXiv CS 8d ago

The Evaluation Blind Spot: A Stereological Theory of Benchmark Coverage for Large Language Models

arXiv:2606.05169v1 Announce Type: new Abstract: We give a stereological theory of LLM benchmark coverage. For any suite with effective dimensionality d_eff, the visible Hausdorff distance between two convex capability profiles consistent with the same scores is bounded by epsilon + C R m^(-1/(d_eff-1)), with matching Lipschitz lower bound. Empirically, three independent leaderboards (Open LLM v2, an extended 12-benchmark suite, LiveBench) all have d_eff in [2.86, 4.80] on their competitive...

arXiv CS 5d ago

Mexico's Mora is poised to be World Cup's breakout...

Just days out from the FIFA World Cup, and surrounded by inquiring media at a crowded roundtable, an unexpected question was thrown at star Mexico midfielder Gilberto Mora. "If we make it to the final and we win, how are you going to celebrate? With an ice cream?" asked a journalist, clearly tongue-in-cheek.

ESPN 5h ago

TokenMizer: Graph-Structured Session Memory for Long-Horizon LLM Context Management

arXiv:2606.06337v1 Announce Type: new Abstract: Large language model (LLM) deployments for long-horizon tasks face a fundamental constraint: context windows are finite while productive work sessions are not. When history exceeds the Maximum Effective Context Window (MECW), critical structured information - architectural decisions, task transitions, file histories - is silently discarded. Existing mitigations treat history as flat text, destroying the relational structure that makes sessions...

arXiv CS 5d ago

Hallucinations as Orthogonal Noise: Inference-Time Manifold Alignment via Dynamic Contextual Orthogonalization

Announce Type: new Abstract: Hallucination in Large Language Models (LLMs), characterized by the generation of content inconsistent with contextual facts or logical constraints, remains a persistent challenge for reliable deployment. In this work, we address this issue through a geometric framework rooted in the linear representation hypothesis. We propose that hallucinations manifest as orthogonal noise relative to the semantic manifold of the residual stream.

arXiv CS 7d ago

Whole-genome duplication shaped cell-type evolution in the vertebrate brain

Abstract: The complex brains of vertebrates have more cell types than those of their closest relatives. Whole-genome duplications (WGDs) occurred during early vertebrate evolution [1], but it is unclear whether the duplicated genes (ohnologues) facilitated cell-type evolution. Here using brain single-cell transcriptomes from five chordates (human [2], mouse [3], lizard [4], lamprey [5] and amphioxus), we report that many cell-type families with conserved core transcription factors in vertebrates do not show...

Nature 21h ago