HIP/ROCm fork optimized for AMD RDNA2 (gfx1030) with PrismML Q1_0_G128 1-bit quant support, RotorQuant, TurboQuant, EAGLE3 and P-EAGLE speculative decoding, and full Wave32 kernel optimizations.
Updated Apr 16, 2026 · C++
AMD ROCm (gfx1030) inference fork with RotorQuant/TurboQuant KV compression, PHANTOM-X zero-copy draft speculation, EAGLE3 speculative decoding, 12 RDNA2 crash fixes, and PrismML Bonsai Q1_0_G128 1-bit GGUF support.
Custom SPIR-V kernel factory for PHANTOM speculative decoding: cross-target compilation from LLVM IR to GPU (SPIR-V/HIP) and CPU (native x86), pre-compiled kernels optimized for RDNA2/gfx1030, dynamic kernel swapping, and a zero-JIT inference pipeline.