(Experimental) A high-throughput and memory-efficient inference and serving engine for LLMs optimized for GB10 homelabs (Python; updated Apr 15, 2026)
Optimized vLLM deployment for NVIDIA Blackwell (RTX 5090) on Linux Kernel 6.14. Resolves SM_120 kernel incompatibilities, P2P deadlocks, and memory fragmentation for high-performance LLM inference.
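Entries like the one above revolve around building PyTorch/vLLM CUDA extensions for Blackwell's new compute capability 12.0. As a general pattern for source builds (not this repo's documented procedure — the exact steps may differ), the target architecture is typically pinned via PyTorch's `TORCH_CUDA_ARCH_LIST` environment variable before compiling:

```shell
# Pin PyTorch extension builds to Blackwell's compute capability 12.0
# (sm_120). Requires a CUDA toolkit recent enough to know this arch
# (CUDA 12.8+). General pattern only; not taken from the repo above.
export TORCH_CUDA_ARCH_LIST="12.0"
```

Without this, a prebuilt wheel or a default arch list that predates sm_120 can produce the "no kernel image is available for execution on the device" class of failures these repos describe.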
Rust-native MoE inference runtime with custom CUDA kernels for Blackwell GPUs. Includes DFlash speculative decoding, multi-tier Engram memory, and entropy-adaptive routing. Targets Qwen3.5-35B-A3B on a single RTX 5060 Ti 16GB.
Pre-built onnxruntime-gpu 1.24.1 with Blackwell sm_120 CUDA kernels (RTX 5090/5080/5070)
llama.cpp fork with additional state-of-the-art (SOTA) quantization types and improved performance
Complete installation guide for ComfyUI-Hunyuan3DWrapper on NVIDIA Blackwell GPUs (RTX 5070 Ti, 5080, 5090). Covers manual compilation of custom_rasterizer for the sm_120 / compute_120 architecture.
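The recurring theme across these projects is generating the right `nvcc -gencode` flags so kernels embed SASS for sm_120 (and PTX for forward compatibility). A minimal sketch of that flag construction — `gencode_flags` is a hypothetical helper for illustration, not part of any repo listed here:

```python
# Sketch: derive nvcc -gencode flags for a set of (major, minor)
# compute capabilities, including Blackwell's 12.0 (sm_120, CUDA 12.8+).
# gencode_flags() is a hypothetical helper, not from any repo above.

def gencode_flags(capabilities):
    """Return nvcc -gencode flags for the given (major, minor) pairs."""
    flags = []
    for major, minor in sorted(capabilities):
        arch = f"{major}{minor}"
        # Embed real machine code (SASS) for each exact architecture...
        flags.append(f"-gencode=arch=compute_{arch},code=sm_{arch}")
    # ...and PTX for the newest one, so future GPUs can JIT-compile it.
    major, minor = max(capabilities)
    arch = f"{major}{minor}"
    flags.append(f"-gencode=arch=compute_{arch},code=compute_{arch}")
    return flags

# Consumer Blackwell (RTX 5070/5080/5090) reports capability 12.0:
for flag in gencode_flags([(8, 9), (12, 0)]):
    print(flag)
```

A binary built this way runs natively on both Ada (sm_89) and Blackwell (sm_120), which is exactly the gap the prebuilt-wheel and fork projects above exist to close.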