Here are 17 public repositories matching this topic.
🦞 LLM Token Compression & Reduction Tool — Cut AI agent token costs by up to 97%. 6-layer deterministic context compression for AI agent workspaces. No LLM required. Prompt compression, context window optimization & cost reduction for any LLM pipeline.
Updated Mar 10, 2026 · Python
📚 Collection of token-level model compression resources.
The official code for the paper: LLaVA-Scissor: Token Compression with Semantic Connected Components for Video LLMs
Updated Jul 1, 2025 · Python
Token-Oriented Object Notation - A compact data format for reducing token consumption when sending structured data to LLMs (PHP implementation)
Official repository of the paper "A Glimpse to Compress: Dynamic Visual Token Pruning for Large Vision-Language Models"
Updated Feb 13, 2026 · Python
[CVPR 2026] FluxMem: Adaptive Hierarchical Memory for Streaming Video Understanding
Updated Mar 3, 2026 · Python
[ICLR 2026 Oral] FlashVID: Efficient Video Large Language Models via Training-free Tree-based Spatiotemporal Token Merging
Updated Feb 12, 2026 · Python
[ICLR 2026] Official code repository for "⚡️VisionTrim: Unified Vision Token Compression for Training-Free MLLM Acceleration"
Updated Feb 24, 2026 · Shell
😎 Awesome papers on token redundancy reduction
This repo integrates DyCoke's token compression method with VLMs such as Gemma3 and InternVL3
Updated Nov 11, 2025 · Python
Official implementation of TCSVT 2025 paper: DiViCo: Disentangled Visual Token Compression For Efficient Large Vision-Language Model
Updated May 13, 2025 · Python
[arXiv 2025 Preprint] HiPrune, a training-free visual token pruning method for VLM acceleration.
Updated Nov 10, 2025 · Jupyter Notebook
[ICLR 2026] Official code of PPE: Positional Preservation Embedding for Token Compression in Multimodal Large Language Models.
Updated Feb 12, 2026 · Python
🛠️ Implement TOON in PHP for efficient serialization of JSON-like data, optimizing parsing for Large Language Models while maintaining clarity and structure.
Compress React/Next.js files by ~40% for AI assistants. MCP server + encoder.
Updated Feb 25, 2026 · JavaScript
Maximum meaning, minimum tokens. Rust-based markdown compression for LLM workflows.
Updated Mar 5, 2026 · TypeScript