Here are 17 public repositories matching this topic.
🦞 LLM Token Compression & Reduction Tool — Cut AI agent token costs by up to 97%. 6-layer deterministic context compression for AI agent workspaces. No LLM required. Prompt compression, context window optimization & cost reduction for any LLM pipeline.
Updated Mar 10, 2026 · Python
📚 Collection of token-level model compression resources.
The official code for the paper: LLaVA-Scissor: Token Compression with Semantic Connected Components for Video LLMs
Updated Jul 1, 2025 · Python
Token-Oriented Object Notation - A compact data format for reducing token consumption when sending structured data to LLMs (PHP implementation)
Official repository of the paper "A Glimpse to Compress: Dynamic Visual Token Pruning for Large Vision-Language Models"
Updated Feb 13, 2026 · Python
[CVPR 2026] FluxMem: Adaptive Hierarchical Memory for Streaming Video Understanding
Updated Mar 3, 2026 · Python
[ICLR 2026 Oral] FlashVID: Efficient Video Large Language Models via Training-free Tree-based Spatiotemporal Token Merging
Updated Feb 12, 2026 · Python
[ICLR 2026] Official code repository for "⚡️VisionTrim: Unified Vision Token Compression for Training-Free MLLM Acceleration"
Updated Feb 24, 2026 · Shell
😎 Awesome papers on token redundancy reduction
This repo integrates DyCoke's token compression method with VLMs such as Gemma3 and InternVL3
Updated Nov 11, 2025 · Python
Official implementation of TCSVT 2025 paper: DiViCo: Disentangled Visual Token Compression For Efficient Large Vision-Language Model
Updated May 13, 2025 · Python
[arXiv 2025 Preprint] HiPrune, a training-free visual token pruning method for VLM acceleration.
Updated Nov 10, 2025 · Jupyter Notebook
[ICLR 2026] Official code of PPE: Positional Preservation Embedding for Token Compression in Multimodal Large Language Models.
Updated Feb 12, 2026 · Python
🛠️ Implement TOON in PHP for efficient serialization of JSON-like data, optimizing parsing for Large Language Models while maintaining clarity and structure.
Compress React/Next.js files by ~40% for AI assistants. MCP server + encoder.
Updated Feb 25, 2026 · JavaScript
Maximum meaning, minimum tokens. Rust-based markdown compression for LLM workflows.
Updated Mar 5, 2026 · TypeScript