Neuron-Level Interventions for Gendered

Related Articles from SNS

Neuron-Level Interventions for Gendered and Gender-Neutral Generation in Language Models

Abstract: Language models (LMs) can produce gendered language and stereotypes even when given neutral prompts. Most prior work on gender bias in LMs examines gender through a binary lens (feminine vs. masculine), with limited attention to gender-neutral forms such as they/them pronouns or neutrally phrased job titles. How gender-related signals are encoded in the internal representations of LMs remains an open question.

arXiv CS 9d ago