HARVE

No mentions found

This entity hasn't been tracked yet, or Iris is still building its knowledge base.

Related Articles from SNS

HARVE: Hacking-Aware Reward-Head Vector Editing for Robust Reward Models

Reward models are central to large language model (LLM) alignment, but they remain vulnerable to reward hacking. To evaluate reward-model robustness, we introduce RewardHackBench, which contains 13 reward-hacking patterns covering real-life high-stakes domains and general settings, and we find severe failures on specific subcategories across eight reward models. To mitigate these failures, we propose HARVE, a training-free reward-head editing method for scalar reward...

arXiv CS 7d ago

Trichome and inflorescence evolution in the paleopolyploid tree genus Greyia Hook. & Harv.

The southern African endemic tree genus Greyia Hook. & Harv. (Francoaceae, Geraniales) comprises three species distinguished by striking variation in trichome morphology and floral architecture, yet the genomic and regulatory basis of these traits has remained largely unexplored. Here, we present chromosome-scale genome assemblies for all recognized Greyia species, namely Greyia radlkoferi, Greyia sutherlandii, and Greyia flanaganii, providing the first high-quality genomic resources for...

bioRxiv 10d ago