the Guidance Gate
No mentions found
This entity hasn't been tracked yet, or Iris is still building its knowledge base.
Related Articles from SNS
Debugging the Debuggers: Failure-Anchored Structured Recovery for Software Engineering Agents
arXiv:2605.08717v2. Abstract: Software engineering agents are increasingly deployed in evaluable engineering environments, yet post-failure recovery remains costly, manual, and ad hoc. Existing systems expose traces or generate follow-up feedback, but they do not convert heterogeneous runtime evidence into grounded, bounded recovery guidance for a subsequent attempt. We present PROBE, a failure-anchored framework for structured recovery in software engineering agents.
SKILL.nb: Selective Formalization and Gated Execution for Durable Agent Workflows
arXiv:2606.08049v1. Abstract: AI agents increasingly turn past experience into reusable artifacts such as code, workflows, and procedural memories. Reuse can improve efficiency, but it also creates a lifecycle reliability problem: artifacts that succeed once may fail under environment drift, underspecified tasks, or changing task distributions, especially in web automation. We introduce SKILL.nb, a framework for governing reusable agent workflows with evidence-calibrated...
LaGuardia Airport AI hologram answers traveler questions
Airports can feel like a maze when you are rushing to a gate, hunting for baggage claim or trying to find a lounge before boarding. Now, LaGuardia Airport's Terminal B wants to make that all feel a little less stressful with a life-sized AI hologram named Bridget. Bridget can hold a real conversation with you. She can answer questions about gates, shops, baggage claim and VIP lounges.
TARIC: Memory-Augmented Traversability-Aware Outdoor VLN under Interrupted Semantic Cues
arXiv:2605.31121v1. Abstract: Outdoor vision-language navigation (VLN) in long-range, open-world environments is frequently disrupted by semantic-cue interruptions, where informative goal cues become sparse, occluded, or leave the field of view. Once such cues disappear, agents enter a cue-free phase and often degrade into backtracking, oscillatory headings, or aimless exploration. While memory-based methods attempt to bridge these gaps, they often fail under...
LLM Agent-Assisted Reverse Engineering with Quantitative Readability Metrics
Abstract: Automatic decompilers produce functionally correct but often unreadable C code. This paper addresses one stage of the reverse engineering workflow: improving the readability of decompiled code using LLM agents guided by quantitative metrics. We present a three-phase research evolution.
What Structural Inductive Bias Helps Transformers Reason Over Knowledge Graphs? A Study with Tabula RASA
arXiv:2602.02834v4. Abstract: What structural inductive bias helps transformers reason over knowledge graphs? Through controlled ablations of a minimal transformer modification with four independently removable components (sparse adjacency masking, edge-type biases, query scaling, value gating), we isolate which structural signals drive multi-hop reasoning. Our finding is sharp: sparse adjacency masking alone accounts for the dominant share of improvement over unmasked...
United Airlines jet flew too low and too slow before striking light pole and truck near Newark airport: NTSB
A United Airlines flight was flying too low and too slow when it barreled into a light pole and a tractor-trailer last month just before landing at Newark Liberty International Airport, the National Transportation Safety Board said. Flight 169 was traveling from Venice, Italy, to New Jersey when it struck the pole and a Schmidt Bakery truck during its descent at around 2 p.m. on May 3. Dashcam video showed the moment the Boeing 767-400 passenger jet struck the bakery truck on the New Jersey...
Intrinsic Selection and Particle Resampling for Inference-Time Scaling Beyond Domain Verifiability
Abstract: Inference-Time Scaling (ITS) has largely succeeded in verifiable domains like math and coding, where cheap verification enables scalable output selection. However, extending ITS to tasks prone to systematic failure, driven by faulty initial assumptions or unmet multidimensional constraints, typically relies on costly external solvers or brittle, model-based verifiers. Our key insight is that the intrinsic statistics of parallel sample sets, specifically...
Adaptive Loss Balancing for Noise-Robust GRPO in Generative Recommendation
arXiv:2606.08480v1. Abstract: Reinforcement learning (RL) presents a promising avenue for enhancing generative recommendation beyond supervised imitation, leveraging reward signals to guide policy improvement. However, its efficacy is critically contingent on the trustworthiness of the reward model for the samples it evaluates. In practice, production rankers, the widely adopted reward models, are trained on exposure-biased logs, leading to sample-dependent inaccuracies...
Towards Persistent Case-Based Memory for Autonomous Data Science: A CBR-Augmented R&D-Agent with a Locally Deployable Small Language Model
Abstract: Most top-performing autonomous data-science agents rely on frontier cloud models and lack persistent, cross-session memory. This paper addresses two open gaps: (1) the underexplored use of formally structured, quality-controlled Case-Based Reasoning (CBR) case bases coupling symbolic case records with executable code artefacts; and (2) the untested viability of Small Language Models (SLMs) as locally deployable agent backbones. We present CBR-augmented...