ggml-org/llama.cpp

↓ decelerating · ggml-org/llama.cpp

Velocity dropped from 47.4 to 26.0 per day, an acceleration of -21.4. The drop was driven by saturation after commits such as f84270e, which delivered speedups, and 0f1bb60, which fixed model scales: both resolved key bottlenecks but left less to fuel immediate hype. This is a post-merge lull, since benchmarks in the PR showed token-generation gains that met developer needs without prompting further rapid adoption. vLLM, by comparison, posted an acceleration of -7.3 on a velocity of 14.3. llama.cpp's sharper drop highlights how exposed its CPU focus is among GPU-centric peers, and it signals to investors that category evaluation should prioritize hardware-agnostic scalability to weather such cycles.
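The briefing does not state how acceleration is derived, but the llama.cpp figures are consistent with a simple difference between two consecutive velocity readings. The sketch below works through that arithmetic under that assumption. The function names are hypothetical, and the relative-change figure is an added illustration, not a number from the briefing.

```python
def acceleration(previous_velocity: float, current_velocity: float) -> float:
    """Assumed definition: change in velocity between two consecutive readings."""
    return round(current_velocity - previous_velocity, 1)


def relative_change(previous_velocity: float, current_velocity: float) -> float:
    """Fractional change relative to the earlier reading (illustrative only)."""
    return (current_velocity - previous_velocity) / previous_velocity


# llama.cpp readings quoted in the briefing (per day).
llama_previous, llama_current = 47.4, 26.0

llama_accel = acceleration(llama_previous, llama_current)
llama_rel = relative_change(llama_previous, llama_current)

print(f"acceleration: {llama_accel}")
print(f"relative change: {llama_rel:.1%}")
```

Under this reading, the computed acceleration matches the briefing's -21.4. The same comparison cannot be reproduced for vLLM from the text alone, because only one of its velocity readings is given.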


From the briefing: 2026-04-28 · AI OSS Momentum Decelerates Across Key Projects