Related Articles from SNS
The Self-Correction Illusion: LLMs Correct Others but Not Themselves
arXiv:2606.05976v1 Announce Type: new Abstract: Recent work shows that LLM agents struggle to correct errors in their own reasoning traces yet show markedly higher correction rates when identical claims appear under external sources. We ask whether this asymmetry reflects a capability deficit or a role-label artifact: does an agent's willingness to correct a wrong claim depend causally on the chat-template role that carries it, rather than on the claim's content?
CECOR: Correction-oriented synthetic data construction for factual error correction
arXiv:2605.02277v2 Announce Type: replace Abstract: Factual Error Correction (FEC) aims to revise inaccurate text into statements that are factually consistent with external evidence. Although recent methods perform well on single-hop correction, they often treat claims as atomic units and struggle with multi-hop cases that require compositional reasoning across multiple evidence sources. This challenge is further amplified by limited paired data and difficulties in locating semantic errors...
Correction of the basis set error due to the absence of the electron-electron cusp in the wave function by using an adiabatic correction
Announce Type: new Abstract: This article proposes an analytical method to address the slow convergence of electronic structure calculations caused by the inability of finite one-particle basis sets to describe the electron-electron cusp. An equivalence is made between a calculation using a finite basis set with the physical Coulomb interaction and a calculation using a complete basis set with a model interaction (specifically, the error-function screened Coulomb potential characterized by a...
Learning Self-Correction in Vision-Language Models via Rollout Augmentation
arXiv:2602.08503v2 Announce Type: replace Abstract: Self-correction is essential for solving complex reasoning problems in vision-language models (VLMs). However, existing reinforcement learning (RL) methods struggle to learn it, as effective self-correction behaviors emerge only rarely, making learning signals extremely sparse. To address this challenge, we propose correction-specific rollouts (Octopus), an RL rollout augmentation framework that synthesizes dense self-correction examples by...
Upper Bounds on Multiple $b$-Burst Deletion-Correcting Codes
arXiv:2606.01245v1 Announce Type: new Abstract: Motivated by their applications in DNA-based storage systems, codes capable of correcting consecutive deletions have attracted significant attention. An important class of such codes consists of those that can correct multiple consecutive deletion errors, commonly referred to as multiple $b$-burst deletion-correcting codes. In this paper, we investigate the fundamental limits of multiple $b$-burst deletion-correcting codes.
CLFEC: A New Task for Unified Linguistic and Factual Error Correction in paragraph-level Chinese Professional Writing
arXiv:2602.23845v2 Announce Type: replace Abstract: Chinese text correction has traditionally focused on spelling and grammar, while factual error correction is usually treated separately. However, in paragraph-level Chinese professional writing, linguistic (word/grammar/punctuation) and factual errors frequently co-occur and interact, while many draft-level errors are sparsely observable in published texts after editorial review, making unified correction both necessary and controlled...
When Correct Decisions Hide Internal Stress: Decision-State Probing in Multimodal Language Models
Announce Type: new Abstract: Multimodal language models are typically evaluated through external behavior: selecting the correct image--text match, rejecting unsupported captions, or answering visual queries correctly. However, correct behavior alone does not show that the model's internal decision state remains stable under controlled semantic stress. We study this gap through S$^3$E (Structured Semantic Stress Evaluation), a framework for analyzing behavior-internal decoupling in...
Disentangling Visual and Factual Correctness in LVLMs' Visualization Literacy
arXiv:2606.03142v1 Announce Type: new Abstract: Large Vision-Language Models (LVLMs) show strong visualization interpretation, yet it is unclear whether their responses reflect genuine reasoning over visual evidence or factual priors learned during training. Current evaluations mix these two sources, obscuring when correct visual interpretation is overridden by memorized facts. We present a framework that isolates visual correctness from factual correctness, revealing validity limitations in...
AI paired with tiny optical device corrects distorted light for sharper imaging
Gaby Clark, Scientific Editor; Robert Egan, Associate Editor. Blurry light from lens imperfections is a problem everywhere, from microscopes to telescopes to smartphone cameras. Using a tiny yet carefully engineered optical element and artificial intelligence, University of California San Diego engineers have built a way to spot and correct those distortions from a single image, a step that could make advanced optical...
Can LLMs Write Correct TLA+ Specifications? Evaluating Natural-Language-to-TLA+ Generation
Announce Type: new Abstract: TLA+ has supported industrial verification at companies such as Amazon and Microsoft, yet writing correct TLA+ specifications from natural language still requires time and expertise, which limits adoption. LLMs show promise, but no prior study measures whether they produce semantically correct TLA+ specifications from natural language. This paper presents the first systematic evaluation of LLM-based TLA+ specification synthesis from natural language.