# eagle3
Here are 5 public repositories matching this topic...
AMD ROCm (gfx1030) inference fork with RotorQuant/TurboQuant KV compression, PHANTOM-X zero-copy draft speculation, EAGLE3 speculative decoding, 12 RDNA2 crash fixes, and PrismML Bonsai Q1_0_G128 1-bit GGUF support.
triton hip bonsai rocm amd-gpu gguf speculative-decoding sglang rdna2 eagle3 turboquant prismml gfx1030 p-eagle radix-cache
Updated Apr 16, 2026 - Python
🚀 Harness mini-SGLang to power efficient inference for Large Language Models with a lightweight, high-performance framework that prioritizes clarity and speed.
theme ruby-gem github-pages icon transformer moe attention mini diffusion ini-reader vlm ini-file ini-generator llm deepseek eagle3
Updated Apr 17, 2026 - Python