vllm-project/vllm
↓ decelerating · vllm-project/vllm
vLLM slowed to 15.9 stars per day from 23.1 (a change of -7.3), a deceleration attributed to v0.20.0's CUDA 13.0 default, which boosts GPU throughput but highlights compatibility gaps. The slowdown mirrors peers such as Ollama (-7.7), yet vLLM's 96 daily gains trail llama.cpp's 360. Investors eyeing inference should note the potential for a rebound via OpenAI API merges, which would differentiate vLLM from the cohort saturation in local runtimes.
From the briefing: 2026-04-27 · Inference runtimes decelerate amid platform acceleration