Pythia-160M to Pythia-410M

Related Articles from SNS

A Negative Result on Cross-Model Activation Transfer in a Pythia Multi-Hop Setting

Abstract: Recent work shows that language models can transmit behavioural traits through hidden signals in generated data during training. We ask whether a more direct and stricter channel is also viable: can one language model communicate useful intermediate reasoning state to another at inference time by translating and injecting hidden activations, rather than by passing natural-language text?

arXiv CS 7d ago

A Negative Result on Cross-Model Activation Transfer in a Pythia Multi-Hop Setting

arXiv:2606.03280v2 (replacement). Abstract: Recent work shows that language models can transmit behavioural traits through hidden signals in generated data during training. We ask whether a different activation-mediated channel is viable: can one language model communicate a useful intermediate reasoning state to another at inference time through a post-hoc linear activation bridge, rather than through a textual or structured-token relay?

arXiv CS 2d ago