Home Knowledge Base the Black Box

the Black Box

No mentions found

This entity hasn't been tracked yet, or Iris is still building its knowledge base.

Related Articles from SNS

EvoDefense: Co-Evolving Black-Box Defense with Large Language Models

arXiv:2605.31140v1 Announce Type: new Abstract: Large Language Models (LLMs) remain highly vulnerable to diverse attacks, particularly in black-box settings where the internals of target models are inaccessible. Existing black-box defenses typically rely on pre-defined filtering heuristics, which often fail to generalize to unseen attack types and target model architectures. We introduce EvoDefense, an experience-guided co-evolving black-box defense paradigm.

arXiv CS 9d ago

Agentic Monte Carlo: Simulating Reinforcement Learning for Black-Box Agents

arXiv:2606.05296v1 Announce Type: new Abstract: LLM agents operate in two distinct regimes: open-weight agents amenable to reinforcement learning (RL) and black-box agents whose behaviour must be controlled purely at test time. Although black-box agents are often backed by state-of-the-art proprietary LLMs, API-only access precludes parameter-level optimization, rendering most RL methods inapplicable. To address this limitation, we turn to a known equivalence between RL and Bayesian inference.

arXiv CS 5d ago

LLMs are not the black box you were promised

LLMs are not the "black box" you were promised. Mechanistic interpretability — peering into a neural network to reverse engineer its inner workings — has made major strides. Anthropic's On the Biology of a Large Language Model (2025) is a landmark in that effort.

Hacker News 7d ago

Training Diffusion Language Models for Black-Box Optimization

Announce Type: replace Abstract: We study offline black-box optimization (BBO), aiming to discover improved designs from an offline dataset of designs and labels, a problem common in robotics and DNA with limited labeled samples. While recent work applies autoregressive LLMs to BBO by formatting tasks as natural-language prompts, their left-to-right design generation struggles to capture the strong bidirectional dependencies inherent in design problems. To address this, we propose adapting...

arXiv CS 8d ago

Bounded Behavioral Indistinguishability for Black-Box LLM Distillation

arXiv:2605.30448v1 Announce Type: new Abstract: Black-box LLM distillation is usually evaluated as an output-matching problem: a student is considered successful when its responses are semantically similar to, or task-consistent with, those of a teacher. However, output similarity does not imply that the student is behaviorally indistinguishable from the model it imitates. We introduce bounded behavioral indistinguishability, formalized as $(\epsilon,q,t,\mathbb{A})$-behavioral...

arXiv CS 9d ago

Explaining Black-Box Language Models: Learning to Optimize Linguistically-Structured Word Subsets

arXiv:2606.08497v1 Announce Type: new Abstract: As deep language models (DLMs) are increasingly deployed in high-stakes domains such as healthcare, understanding their decision rationale becomes paramount for ensuring trust, safety, and accountability. However, achieving this vital level of interpretability is particularly challenging when these DLMs operate as black-box systems (e.g., via APIs), where access to internal model states (e.g., parameters, gradients) is restricted. Despite...

arXiv CS 1d ago

China launches AI framework to improve ‘black box’ transparency and raise standards

China launches AI framework to improve ‘black box’ transparency and raise standards The initiative underscores Beijing’s growing focus on AI governance, as concerns grow over algorithm bias and data security China has pledged to improve the accuracy, reliability and transparency of AI through a new national evaluation framework, as policymakers move to establish common standards for assessing the fast-evolving technology. New guidelines released by the central government said Beijing would...

South China Morning Post 11d ago

BAHSD: Bridging the Long-tail Gap via Adaptive Distillation in Black-box Sequential Recommendation

Announce Type: replace Abstract: Sequential recommendation systems are widely adopted but often deployed as black-box APIs, which has driven recent interest in model extraction to replicate their capabilities locally. However, the long-tail distribution induces severe signal heterogeneity: dense head sequences trigger the solidification of teacher preference, biasing extraction toward local patterns, while sparse tail sequences yield flat, noisy predictions. Existing one-size-fits-all...

arXiv CS 5d ago

BAHSD: Bridging the Long-tail Gap via Adaptive Distillation in Black-box Sequential Recommendation

arXiv:2606.03091v1 Announce Type: new Abstract: Sequential recommendation systems are widely adopted but often deployed as black-box APIs, which has driven recent interest in model extraction to replicate their capabilities locally. However, the long-tail distribution induces severe signal heterogeneity: dense head sequences trigger the solidification of teacher preference, biasing extraction toward local patterns, while sparse tail sequences yield flat, noisy predictions. Existing...

arXiv CS 7d ago

Randomized separations in black-box TFNP

arXiv:2606.04697v1 Announce Type: new Abstract: We study the relationship between deterministic and randomized black-box reducibility between problems in TFNP. Our main contribution is a general technique that establishes equivalence between these reducibility types from specific TFNP problems to any TFNP problem. In particular, we show that this equivalence holds for reductions from complete problems in PPP, PPAD, PPA, and $t$-PPP.

arXiv CS 6d ago