Show HN: Tiny-vLLM – high performance LLM inference engine in C++ and CUDA

Hacker News, Friday 29 May 2026, 19:38 UTC. By yu3zhou4.

Key Points

Summary: The post introduces Tiny-vLLM, a high-performance LLM inference engine written in C++ and CUDA. The engine is designed for efficient, scalable inference on large language models. Highlighted features include the ability to handle large models and compatibility with various hardware platforms.

Article URL: https://github.com/jmaczan/tiny-vllm

Comments URL: https://news.ycombinator.com/item?id=48328184

Points: 5

Comments: 0

Originally published by Hacker News.
