Home Knowledge Base Semantic Web

Semantic Web

No mentions found

This entity hasn't been tracked yet, or Iris is still building its knowledge base.

Related Articles from SNS

Web Agents Should Use Typed Actions Instead of Click-Based Browsing

arXiv:2602.17245v2 Announce Type: replace Abstract: This position paper argues that building a reliable agentic Web requires shifting from low-level interaction primitives to typed actions supported by a semantic layer. Today's web agents primarily operate through clicks, keystrokes, and DOM manipulation, which leads to brittle long-horizon behavior, high execution cost, and limited auditability. We propose web verbs as a concrete design for this layer.

arXiv CS 1d ago

Cohort-based Semantic Labeling: AI-Enabled Recovery of Visualization Semantics from Deployed SVGs

arXiv:2606.09782v1 Announce Type: new Abstract: Many web-based visualizations are deployed as Scalable Vector Graphics (SVG), a format that faithfully preserves visual appearance but typically omits the higher-level semantic structure needed for machine interpretation. Once rendered and published, information about a visualization's components, roles, and encodings is no longer explicitly available, limiting downstream operations such as querying, accessibility augmentation, explanation,...

arXiv CS 1d ago

MetaConfigurator: AI-Assisted RDF Authoring from JSON Data

arXiv:2606.07094v1 Announce Type: new Abstract: Scientific workflows increasingly generate structured JSON data that is easy to exchange but difficult to interpret consistently across systems due to lacking semantic interoperability. While JSON Schema ensures structural validation, it provides no native support for Linked Data semantics. This paper presents an RDF Authoring View extending the open-source JSON Schema editor MetaConfigurator, enabling researchers to transform existing JSON,...

arXiv CS 2d ago

pcbGPT: Automatic PCB Schematic Synthesis from Natural Language Requirements

arXiv:2606.01188v1 Announce Type: new Abstract: Translating natural-language hardware requirements into correct printed circuit board (PCB) schematics remains difficult in embedded, IoT, and wearable development. Designers must choose compatible components, interpret datasheets, add support circuitry, and expose correct interfaces before layout and prototyping can begin, while many such circuits cannot be validated through straightforward simulation.

arXiv CS 8d ago

Beyond Memorization: Assessing Semantic Generalization in Large Language Models Using Phrasal Constructions

arXiv:2501.04661v3 Announce Type: replace Abstract: The web-scale of pretraining data has created an important evaluation challenge: to disentangle linguistic competence on cases well-represented in pretraining data from generalization to out-of-domain language, specifically the dynamic, real-world instances less common in pretraining data. To this end, we construct a diagnostic evaluation to systematically assess natural language understanding in LLMs by leveraging Construction Grammar...

arXiv CS 9d ago

Towards Generalization of Block Attention via Automatic Segmentation and Block Distillation

arXiv:2605.15913v4 Announce Type: replace Abstract: Block attention, which processes the input as separate blocks that cannot attend to one another, offers significant potential to improve KV cache reuse in long-context scenarios such as Retrieval-Augmented Generation (RAG). However, its broader application is hindered by two key challenges: the difficulty of segmenting input text into meaningful, self-contained blocks, and the inefficiency of existing block fine-tuning methods that risk...

arXiv CS 5d ago

GEM: Geometric Entropy Mixing for Optimal LLM Data Curation

arXiv:2605.26121v2 Announce Type: replace Abstract: LLM pre-training efficacy increasingly depends on data composition rather than sheer volume. Yet, optimal mixing is hindered by categorization flaws: human taxonomies suffer from ontological misalignment, and Euclidean clustering fails to address embedding anisotropy. We introduce GEM (Geometric Entropy Mixing), a framework reformulating data curation as a variational problem on the hypersphere augmented with a mixing-balance regularizer.

arXiv CS 9d ago

Odysseus – self-hosted AI workspace

─────────────────────────────────────────────── ⊹ ࣪ ˖ ૮( ˶ᵔ ᵕ ᵔ˶ )っ Odysseus vers. 1.0 ─────────────────────────────────────────────── A self-hosted AI workspace -- meant to be the self-hosted version of the UI experience you get from ChatGPT and Claude. But with more jank and fun.

Hacker News 10d ago

Harness-1: Reinforcement Learning for Search Agents with State-Externalizing Harnesses

arXiv:2606.02373v1 Announce Type: new Abstract: Search agents are often trained as policies over growing transcripts: the model must decide how to search while also remembering what it has seen, which evidence is useful, which constraints remain open, and which claims have actually been checked. We argue that this formulation puts too much routine state management inside the policy: reinforcement learning is forced to optimize both semantic search decisions and recoverable bookkeeping that...

arXiv CS 8d ago

Building Reliable Long-Form Generation via Hallucination Rejection Sampling

arXiv:2606.03628v1 Announce Type: new Abstract: Large language models (LLMs) have achieved remarkable progress in open-ended text generation, yet they remain prone to hallucinating incorrect or unsupported content, which undermines their reliability. This issue is exacerbated in long-form generation due to hallucination snowballing, a phenomenon where early errors propagate and compound into subsequent outputs. To address this challenge, we propose a novel inference-time hallucination...

arXiv CS 7d ago