The Environment Semantics Gap
Related Articles from SNS
AutoSUT: The Environment Semantics Gap in Structured CTI for Adversary Emulation
Announce Type: new Abstract: Structured Cyber Threat Intelligence (CTI) is increasingly used for adversary emulation, detection evaluation, and cyber range design. However, these workflows still require a target System Under Test (SUT) whose environment is not fully described by public CTI. We measure how much of that environment can be derived from MITRE ATT&CK Structured Threat Information Expression (STIX) bundles.
Beyond Similarity: Trustworthy Memory Search for Personal AI Agents
Announce Type: new Abstract: Personal AI agents increasingly rely on long-term memory to provide persistent personalization across sessions. However, existing memory pipelines are largely driven by semantic similarity: memory data close to the current query is retrieved and injected into the model context. This creates a critical trustworthiness gap, since a semantically related memory may still be contextually inappropriate, leading to threats such as cross-domain leakage, sycophancy,...
TARIC: Memory-Augmented Traversability-Aware Outdoor VLN under Interrupted Semantic Cues
arXiv:2605.31121v1 Announce Type: new Abstract: Outdoor vision-language navigation (VLN) in long-range, open-world environments is frequently disrupted by semantic-cue interruptions, where informative goal cues become sparse, occluded, or leave the field of view. Once such cues disappear, agents enter a cue-free phase and often degrade into backtracking, oscillatory headings, or aimless exploration. While memory-based methods attempt to bridge these gaps, they often fail under...
STEPS: Semantic-Contract-Guided Scheduling for LLM-Assisted Natural-Language-Driven Edge AI Services
arXiv:2606.09537v1 Announce Type: new Abstract: Networked AI services are increasingly delivered through edge infrastructures to support latency-sensitive applications. Edge scheduling is critical for deciding where and how AI services are executed under limited communication and computing resources. Existing frameworks usually assume that requirements are given as numerical constraints, such as latency bounds, energy budgets, or cost limits.
CTIConnect: A Benchmark for Retrieval-Augmented LLMs over Heterogeneous Cyber Threat Intelligence
arXiv:2510.11974v2 Announce Type: replace Abstract: Cyber Threat Intelligence (CTI) is foundational to modern cybersecurity, enabling organizations to proactively defend against evolving threats. However, the sheer volume and heterogeneity of CTI data, spanning structured knowledge bases (CVE, CWE, CAPEC, MITRE ATT&CK) and unstructured threat reports, far exceed the capacity of manual analysis. The strong contextual understanding and reasoning of Large Language Models (LLMs) have driven...
Neo4j plots Palantir alternative with GraphAware acquisition
"The no-kill-switch kind of thing? It's increasingly becoming a requirement," says Neo4j CEO Emil Eifrem. This is one of the reasons behind the company's decision to buy GraphAware, an intelligence analysis software platform built on the graph database, which is positioning itself as an alternative to Palantir, the controversial US spy-tech biz.
Language-based Trial and Error Falls Behind in the Era of Experience
arXiv:2601.21754v3 Announce Type: replace Abstract: While Large Language Models (LLMs) excel in language-based agentic tasks, their applicability to unseen, nonlinguistic environments (e.g., symbolic or spatial tasks) remains limited. Previous work attributes this performance gap to the mismatch between the pretraining distribution and the testing distribution. In this work, we demonstrate the primary bottleneck is the prohibitive cost of exploration: mastering these tasks requires extensive...
From Agent Traces to Trust: Evidence Tracing and Execution Provenance in LLM Agents
Announce Type: new Abstract: Large language model (LLM)-based agents increasingly solve complex tasks by interacting with external tools, retrieval systems, memory modules, environments, and other agents. These capabilities expand agent autonomy, but also make agent behavior harder to verify, debug, and audit. Final-answer accuracy alone cannot explain how an output was produced, which evidence supported each claim, whether tool calls were justified, how memory influenced later decisions, or...
Decoding the Surgical Scene: A Scoping Review of Scene Graphs in Surgery
arXiv:2509.20941v2 Announce Type: replace Abstract: As surgical AI transitions from pixel-level detection to complex reasoning, Scene Graphs (SGs) offer the structured, relational representations necessary to decode dynamic surgical environments. This PRISMA-ScR-guided scoping review systematically maps the evolving landscape of SG research in surgery, analyzing 52 primary studies to chart applications and methodological shifts. Our analysis reveals rapid growth, yet uncovers a critical...
eMEM: A Hybrid Spatio-Temporal Memory System For Embodied Agents
Announce Type: new Abstract: We present eMEM (Embodied Memory), a hybrid graph-based memory system for embodied agents operating in physical environments. Current agent memory architectures, such as Generative Agents, MemGPT, and A-MEM, treat memory as text streams or knowledge graphs, but embodied agents require memory that is simultaneously searchable by meaning, space, and time. eMEM fills this gap with a multi-index architecture (SQLite for structured storage, hnswlib for approximate...