quantizations

Here are 6 public repositories matching this topic...

pheonix-delta / axiom-voice-agent

Run a voice agent with <400 ms latency on just 4 GB of VRAM. Fully offline, no API keys required. Optimized for the GTX 1650 and edge robotics with zero-copy inference. (Apache 2.0)

Updated Feb 22, 2026
Python

jianhayes / NESTQUANT

NestQuant: Post-Training Integer-Nesting Quantization for On-Device DNN [IEEE TMC 2025]

machine-learning deep-neural-networks deep-learning quantizations

Updated Oct 10, 2025
Python

thnguyen996 / fault-injection

Implementation of "Low-Cost and Effective Fault-Tolerance Enhancement Techniques for Emerging Memories-Based Deep Neural Networks." 2021 58th ACM/IEEE Design Automation Conference (DAC).

deep-neural-networks memristor stuck-at-faults quantizations

Updated Feb 8, 2023
Python

ddickmann / voyager-index

High-performance late-interaction retrieval engine for on-prem AI. ColBERT/ColPali multi-vector search with Rust fused MaxSim, Triton GPU kernels, ROQ quantization, LEMUR routing, WAL-backed CRUD, and a FastAPI server. Runs on a single machine, CPU or GPU.

semantic-search-engine on-premise colbert-ai multivector quantizations retrieval-augmented-generation vector-databases rag-pipeline multimodal-rag colpali triton-kernels late-interaction multivector-search multivector-embeddings

Updated Apr 15, 2026
Python

serverdaun / rag-w-binary-quant

RAG with Binary Quantization for enhanced performance

gradio quantizations rag-chatbot

Updated Aug 9, 2025
Python

DeboJp / QLoRA-Fine-Tuning-FLAN-T5-Large-for-Stance-Classification-An-Exploration

Batched QLoRA fine-tuning of FLAN-T5-Large for three-way stance classification, with systematic evaluation of clustering, embedding probes, and full model inference

nlp deep-learning fine-tuning peft text-classificaiton huggingface-transformers machine-learnign llm quantizations flan-t5 qlora

Updated Aug 22, 2025
