CVE-Bench: testing LLM agents on real-world vulnerability patches

Hacker News, Friday 29 May 2026, 19:28 UTC. By logickkk1.

Key Points

Summary: The article describes the development of CVE-Bench, a tool for testing large language model (LLM) agents on real-world vulnerability patches. It aims to evaluate how effectively these agents identify and mitigate vulnerabilities in software systems, and it argues that testing against real-world scenarios is needed to establish how reliably and accurately they detect and fix vulnerabilities.

Article URL: https://giovannigatti.github.io/cve-bench/

Comments URL: https://news.ycombinator.com/item?id=48328088

Points: 3

Comments: 1

Originally published by Hacker News.
