
Dual Steering

No mentions found

This entity hasn't been tracked yet, or Iris is still building its knowledge base.

Related Articles from SNS

SAILRec: Steering LLM Attention to Dual-Side Semantically Aligned Collaborative Embeddings for Recommendation

arXiv:2606.04514v1 Announce Type: new Abstract: Recent LLM-based recommenders enhance language models with collaborative embeddings from user-item interactions, but making such embeddings available does not ensure their proper use during inference. Through a diagnostic attention analysis, we find that the utilization of collaborative embeddings is depth-dependent and alignment-sensitive, suggesting that LLMs need to balance their internal semantic knowledge with external collaborative...

arXiv CS 6d ago

The Information Geometry of Softmax: Probing and Steering

arXiv:2602.15293v2 Announce Type: replace Abstract: This paper concerns the question of how AI systems encode semantic structure into the geometric structure of their representation spaces. The motivating observation is that the natural geometry of these representation spaces should reflect the way models use representations to produce behavior. We focus on the important special case of representations that define softmax distributions.

arXiv CS 9d ago

Latent Space Disentanglement via Activation Steering for Interpretable Attribute Control in Symbolic Music Generation

Announce Type: new Abstract: Transformer-based architectures have significantly advanced the generation of complex symbolic sequences, yet a significant gap remains in achieving fine-grained, interpretable control over discrete signal attributes. This paper investigates the mechanistic interpretability of the Multitrack Music Transformer (MMT) and proposes a framework for deterministic attribute modulation without retraining to bridge this gap via inference-time activation steering. Utilizing the...

arXiv CS 9d ago

The Smart Bird Feeders Everyone’s Talking About (and Actually Buying) (2026)

you’ve probably seen a smart bird feeder or know someone who has one. They’re easily recognizable with their clear housing, cameras, and solar panels. Perhaps a friend or family member has sent you a photo or video of a bright goldfinch or handsome woodpecker (guilty).

Wired 1d ago

Mental Damage: Caption Poisoning Attacks on Retrieval-Augmented Text-to-Music Generation

Announce Type: new Abstract: Retrieval-augmented text-to-music (TTM) systems augment underspecified user prompts using captions retrieved from a music caption dataset. This design introduces an integrity dependency on the music knowledge database. We show that an attacker can poison the database by injecting a small number of crafted music captions, causing the system to retrieve malicious captions that bias prompt augmentation and steer generation away from the user's intended function,...

arXiv CS 9d ago

Endogenous Resistance to Activation Steering in Language Models

arXiv:2602.06941v2 Announce Type: replace Abstract: Large language models can recover mid-generation from task-misaligned activation steering, producing explicit verbal restarts (e.g., "wait, that's not right") and continuing on-topic even while the steering perturbation remains active. We term this Endogenous Steering Resistance (ESR). Using sparse autoencoder (SAE) latents to steer model activations, we find that Llama-3.3-70B exhibits explicit ESR at \llamaseventyEsrRate\%, with smaller...

arXiv CS 2d ago

ZIPP: Zero-shot Image Personalization from Personas

arXiv:2606.08841v1 Announce Type: new Abstract: Text-to-image diffusion models are increasingly deployed in open-ended creative contexts, yet their outputs remain impersonal, optimized for aggregate aesthetics rather than individual taste. Human preferences are pluralistic: one user favoring muted, nostalgic portraits may prefer vibrant street photography, while another gravitates toward dreamy film aesthetics. Existing methods require dense interaction histories or per-user fine-tuning,...

arXiv CS 1d ago

Physics-Guided Geometric Diffusion for Macro Placement Generation

arXiv:2605.16451v2 Announce Type: replace Abstract: Macro placement is a pivotal stage in VLSI physical design, fundamentally determining the overall chip performance. Recent data-driven placement methods have demonstrated significant potential, yet they often struggle to handle sequential dependencies and to balance topological connectivity with physical constraints. To bridge this gap, we propose MacroDiff+, a physics-guided geometric diffusion framework.

arXiv CS 8d ago

Personalized 3D Myocardial Infarct Geometry Reconstruction from Cine MRI for Cardiac Digital Twins

arXiv:2606.01808v1 Announce Type: new Abstract: Accurate 3D geometric characterization of myocardial infarction (MI) is essential for building cardiac digital twins (CDTs) to precisely simulate infarct-related electrophysiology. Late gadolinium enhancement magnetic resonance imaging (LGE MRI) is the clinical reference for locating MI, yet its reliance on contrast agents restricts use in renally impaired patients and limits longitudinal follow-ups. As an alternative, contrast-free cine MRI...

arXiv CS 8d ago

SafeSteer: Localized On-Policy Distillation for Efficient Safety Alignment

Announce Type: new Abstract: Aligning Large Language Models (LLMs) with human values often degrades their general capabilities, termed the alignment tax. Existing methods mitigate this by balancing dual objectives, which heavily rely on massive general-purpose data or auxiliary reward models. In this paper, we argue that, because safety features are inherently sparse within the output distribution, alignment requires localized modifications rather than global trade-offs.

arXiv CS 8d ago