The Amplifying Mirror: Locating and Steering the Partisan Direction inside a Large Language Model
Abstract: Large language models are rapidly replacing search engines as the primary interface between people and information. Unlike search engines, which retrieve existing content, LLMs generate novel text shaped by internal representations learned during training. Here we show that partisan political identity is encoded as a direction in the model's activation space, and that this direction directly shapes generation.
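The abstract does not say how the direction is located or applied, so the following is only a minimal sketch of one commonly used approach: take the difference of mean activations between two labeled groups, then add a scaled copy of that vector to a hidden state. Everything here is an assumption for illustration. The function names (`find_direction`, `steer`, `make_toy_activations`), the 8-dimensional vectors, and the synthetic Gaussian data are hypothetical and do not come from the paper or from any real model.

```python
import math
import random


def mean_vector(rows):
    """Column-wise mean of a list of equal-length vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def unit(v):
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def find_direction(acts_a, acts_b):
    """Difference-of-means direction pointing from group B toward group A.

    Assumption: this is a stand-in technique, not the paper's stated method.
    """
    mu_a, mu_b = mean_vector(acts_a), mean_vector(acts_b)
    return unit([a - b for a, b in zip(mu_a, mu_b)])


def steer(activation, direction, alpha):
    """Shift one hidden state by alpha units along the direction."""
    return [h + alpha * d for h, d in zip(activation, direction)]


def make_toy_activations(n, dim, offset, rng):
    """Synthetic stand-in for model activations: unit Gaussian noise,
    with the group signal placed on coordinate 0 (hypothetical data)."""
    return [
        [rng.gauss(0, 1) + (offset if j == 0 else 0.0) for j in range(dim)]
        for _ in range(n)
    ]


rng = random.Random(0)
acts_a = make_toy_activations(200, 8, +2.0, rng)  # toy "group A" prompts
acts_b = make_toy_activations(200, 8, -2.0, rng)  # toy "group B" prompts

direction = find_direction(acts_a, acts_b)

h = [0.0] * 8                        # a neutral toy hidden state
steered = steer(h, direction, alpha=3.0)

# Projection onto the direction before and after steering.
print(dot(h, direction), dot(steered, direction))
```

In a real model the vectors would be residual-stream activations captured at a chosen layer, and the shift would be applied during the forward pass; which layer and what scale to use are choices this sketch leaves open.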