ggml-org/llama.cpp
↓ decelerating · ggml-org/llama.cpp
llama.cpp slowed to 43.0 daily stars from 54.6, an acceleration of -11.6. The slowdown follows a release tag that finalized Qwen support without adding major new features, which points to post-peak consolidation in CPU inference. Prior velocity was high on the back of GGUF expansions, but the project now trails vLLM's +5.0 gain in similar workloads. Within its cohort it still leads Ollama's 36.4, though attention appears to be rotating toward specialized tools. Investors should read this as maturation in local runtimes, which suggests funding extensions such as enhanced MLX integrations rather than core forks.
From the briefing: 2026-04-13 · Inference Runtimes Drive OSS AI Momentum Surge