Patcher

Related Articles from SNS

Patcher: Post-Hoc Patching of Backdoored Large Language Models

Large language models remain vulnerable to jailbreak backdoor attacks, where adversaries poison safety alignment data to embed hidden triggers that bypass safety mechanisms. Existing defenses often require comprehensive attack information or multiple triggered examples, making them impractical when defenders only observe a single reported failure case without knowing whether it stems from a backdoor attack or a natural alignment bug. This paper presents Patcher,...

arXiv CS 7d ago

Defending Against Malicious Finetuning by Scaling Train-time Adversarial Attacks

Current open-weight large language models (LLMs) are prone to malicious finetuning attacks, which could compromise the safety alignment of LLMs with only a few steps of supervised finetuning (SFT) on poisoned datasets. Existing alignment-stage defenses are primarily designed to defend against attacks that use parameter-efficient finetuning methods. However, they fail to defend against stronger attacks that use full-parameter finetuning.

arXiv CS 1d ago

macOS 27 requires Apple Silicon, as Apple draws down the Intel Mac era

As Apple announced last year, this year's macOS release will end support for Intel Macs. The macOS 27 Golden Gate release will require a Mac with an Apple Silicon chip inside, including the original M1 that launched in the MacBook Air, MacBook Pro, and Mac mini back in late 2020. Intel Macs running macOS 26 Tahoe can expect security and Safari patches for about two more years after the release of macOS 27 Golden Gate.

Ars Technica 1d ago