the Activation Oracle

No mentions found

This entity hasn't been tracked yet, or Iris is still building its knowledge base.

Related Articles from SNS

Building Better Activation Oracles

Announce Type: new Abstract: Activation Oracles (AOs) are promising methods for interpreting residual stream activations. However, current AOs face important issues, such as hallucinations and vagueness. Additionally, text-inversion confounds make them hard to evaluate.

arXiv CS 7d ago

Building Better Activation Oracles

arXiv:2606.02609v2 Announce Type: replace Abstract: Activation Oracles (AOs) are promising methods for interpreting residual stream activations. However, current AOs face important issues, such as hallucinations and vagueness. Additionally, text-inversion confounds make them hard to evaluate.

arXiv CS 2d ago

AI Job Grief: The Unnamed Psychological Crisis Hitting Tech Workers

In the summer of 2025, an Epic Games layoff cut a worker who was a terminally ill father. According to the most-discussed account of the episode, his family lost his life insurance along with the job.

Hacker News 11d ago

The ACUTE Protocol: Operationalizing Language Model Activations for Better Calibration, Utility, and Trust

Announce Type: new Abstract: As language models improve and become increasingly deployed to solve a variety of tasks, trustworthiness becomes essential. Calibration is a good proxy for trust: well-calibrated confidence estimates help inform the risk versus reward tradeoff when trusting a specific model output. Unfortunately, even as models improve, they remain poorly calibrated, often biasing towards overconfidence.

arXiv CS 1d ago

Thresholded Local Hyper-Flow Diffusion

arXiv:2606.09340v1 Announce Type: new Abstract: Local Hyper-Flow Diffusion (HFD) gives an edge-size-independent Cheeger-type guarantee for seeded clustering in general submodular hypergraphs, but existing HFD solvers do not keep intermediate computation local at every iteration. We introduce Thresholded Local HFD (TL-HFD), a first-order method that maintains an active region around the seeds, performs projected subgradient updates on that region and its immediate boundary, and expands via...

arXiv CS 1d ago

Graph Neural Networks for Fast Operator Selection in Adaptive VQE

arXiv:2606.08794v1 Announce Type: cross Abstract: Adaptive variational quantum algorithms like ADAPT-VQE construct tailored ansätze by iteratively selecting operators from a pool using gradient-based criteria. While this avoids oversized parameter spaces, repeatedly scanning the full pool incurs a classical cost that scales linearly with pool size, a major bottleneck for systems with long-range interactions or large operator sets. Here, we reformulate adaptive operator selection as a...

arXiv Physics 1d ago

Remember with Confidence: Uncertainty Quantification for Spatio-temporal Memory with Probabilistic Guarantees

Announce Type: new Abstract: Long-horizon robot operation requires spatio-temporal memory to record the environment state and recall it for downstream reasoning. Scene graphs and retrieval-augmented systems ground VLM descriptions to persistent 3D entities with rich semantic descriptions. However, VLM captions are noisy and viewpoint-inconsistent, and existing systems treat them as an oracle with no mechanism to detect unreliable stored descriptions.

arXiv CS 1d ago

Your Model Already Knows: Attention-Guided Safety Filter for Vision-Language-Action Models

arXiv:2606.09749v1 Announce Type: new Abstract: Vision-Language-Action (VLA) models have demonstrated impressive end-to-end performance across a variety of robotic manipulation tasks. However, these policies offer no guarantees against collisions with task-irrelevant objects in the scene. Existing safety filters sidestep this problem by querying a vision-language model (VLM) to identify obstacles and their locations.

arXiv CS 1d ago

To Be Multimodal or Not to Be: Query-Adaptive Audio-Visual Person Retrieval via Active Modality Detection

arXiv:2606.05931v1 Announce Type: new Abstract: When retrieving a person from a video archive by voice and face, should the system be multimodal or not? In real-world broadcast archives, unlike curated benchmarks, a target may be heard but unseen, seen but unheard, or both. Fusing scores from an absent modality injects noise, degrading precision below the best unimodal system.

arXiv CS 5d ago

Iterative Thresholding Pursuit with Continuation for $\ell_{1-2}$-Regularized Sparse Recovery

Announce Type: new Abstract: Sparse recovery aims to reconstruct sparse signals from underdetermined and possibly noisy linear measurements. Existing $\ell_{1-2}$ iterative thresholding schemes are first-order methods. We propose an iterative thresholding pursuit method with continuation (ITP-C) for $\ell_{1-2}$-regularized sparse recovery.

arXiv CS 5d ago