Boundary F1
Related Articles from SNS
MS-DKC: A Dataset Knowledge Card Framework for Designing and Adapting Medical Image Segmentation Models
arXiv:2606.06103v1 Announce Type: new Abstract: Medical image segmentation is often framed as a search for stronger architectures, but this can obscure a more fundamental question: what does the dataset require from the model? In medical imaging, this requirement is shaped by foreground occupancy, morphology, boundary ambiguity, topology sensitivity, annotation quality, acquisition variation, and operating point. This paper introduces the Medical Segmentation Dataset Knowledge Card (MS-DKC),...
Antonelli, Russell promise smarter racing but no backing off after Canada clash
MONACO, June 4: Championship leader Kimi Antonelli and Mercedes team mate George Russell said they will continue fighting wheel to wheel in this weekend's Monaco Grand Prix despite team boss Toto Wolff warning he might have to apply the handbrake. At last month's Canadian Grand Prix, 19-year-old Antonelli was left fuming after contact between the pair during the sprint race won by Russell. The following day in...
Answer Presence Drives RAG Rewriting Gains
arXiv:2606.05633v1 Announce Type: new Abstract: Retrieval-augmented QA pipelines often route retrieved passages through an LLM rewriter before a smaller reader, lifting F1 by tens of points on multi-hop benchmarks; this gain is typically credited to improved evidence quality. We ask whether that lift is causally driven by the gold answer string appearing in the rewritten context rather than by curation per se, using a controlled intervention audit. For each rewritten context we re-run...
SPM-Bench: Benchmarking Large Language Models for Scanning Probe Microscopy
arXiv:2602.22971v2 Announce Type: replace Abstract: As LLMs achieved breakthroughs in general reasoning, their proficiency in specialized scientific domains reveals pronounced gaps in existing benchmarks due to data contamination, insufficient complexity, and prohibitive human labor costs. Here we present SPM-Bench, an original, PhD-level multimodal benchmark specifically designed for scanning probe microscopy (SPM). We propose a fully automated data synthesis pipeline that ensures both high...
Cross Paraphrastic Invariance Learning for Hallucination Detection
arXiv:2606.08157v1 Announce Type: new Abstract: Large language models (LLMs) frequently generate hallucinations, which are unsupported by a source document. To avoid costly LLM-as-evaluator pipelines and the heavy annotation demands of existing classifiers, we propose CPIL (Cross Paraphrastic Invariance Learning), a two-stage Siamese framework that maximizes the utility of existing labeled data. Concretely, CPIL constructs informative training pairs by: (i) generating paraphrastic views of...
LastAct: Trajectory-Guided Latest-Activity Localization for Real-Time Smart-Home Activity Recognition
arXiv:2606.00260v2 Announce Type: replace Abstract: Human Activity Recognition (HAR) from ambient sensors enables smart-home applications such as health monitoring and assisted living. In realistic deployments, however, sensor events arrive as a continuous stream and activity boundaries are unknown. Sliding-window inference therefore produces many windows that straddle transitions and contain mixed activities, creating boundary contamination that violates the pre-segmented instance...
Efficient Punctuation Restoration via Weighted Lookahead Scoring Method for Streaming ASR Systems
arXiv:2606.05179v1 Announce Type: new Abstract: Punctuation restoration improves ASR (Automatic Speech Recognition) readability. However, streaming ASR requires online decisions with limited future context. In streaming ASR, the system predicts punctuation incrementally, which makes generation-based approaches prone to latency and alignment failures under boundary-wise evaluation.
A thalamus–brainstem attractor network drives history-biased decisions
Abstract: Natural environments often change gradually, making it adaptive to bias decisions on the basis of the recent past, a phenomenon known as serial dependence [1-3]. Large-scale recordings during behaviour have identified that serial dependence is a common motif for decision-making, with neural representations of past experiences found throughout the brain [4-11]. However, it remains unclear whether this bias arises from dedicated neural circuits with history-specific...
DiffuSent: Towards a Unified Diffusion Framework for Aspect-Based Sentiment Analysis
arXiv:2606.01323v1 Announce Type: new Abstract: Aspect-Based Sentiment Analysis (ABSA) encompasses seven distinct subtasks, each focusing on different extracted elements. Despite the proven success of generative models in unified aspect sentiment analysis, existing approaches often rely on auto-regressive token-by-token generation without grasping the whole information of the aspect and opinion terms, resulting in boundary insensitivity, particularly in context of multi-word aspect and...
AutoIQ: An Ensemble Framework for Automatic Assessment of Geometric Distortion in Prostate Diffusion-Weighted Imaging
Announce Type: cross Abstract: Geometric distortion in prostate diffusion-weighted imaging (DWI) can impair lesion localization and reduce the reliability of MRI-based clinical assessment. We propose AutoIQ, an ensemble machine learning framework for automatic quantification and classification of DWI geometric distortion severity. A total of 140 retrospective prostate biparametric MRI examinations were analyzed, including 33 scans with severe distortion requiring repeat acquisition and 107...