ggml-org/llama.cpp
↓ decelerating · ggml-org/llama.cpp
llama.cpp decelerated sharply, from 45.6 to 31.0 stars per day (a change of -14.6), after commit f84270e's tile buffer optimizations delivered token generation speedups but failed to sustain momentum. The slowdown reflects post-GGUF saturation and leaves llama.cpp lagging vLLM's 15.9 despite its higher absolute velocity. Within the runtime cohort, this positions it as cooling after a breakout, so investors may want to evaluate forks or extensions for CPU-specific niches, where peers such as Ollama are also decelerating, at -7.7.
From the briefing: 2026-04-27 · Inference runtimes decelerate amid platform acceleration