Pull requests: vllm-project/tpu-inference

Pull requests list

Fix DP sampling padding bug by mapping correctly to JAX sharding slices (label: ready)
#2860 opened Jun 10, 2026 by huybro Collaborator
2 of 3 tasks
[Jax][Gemma4] Fuse gate_up_proj for Gemma-4 (label: ready)
#2859 opened Jun 9, 2026 by lk-chen Collaborator
[DSV4] Add KV compressor module
#2858 opened Jun 9, 2026 by a1yssan13 Collaborator
support selective MLCompass filtering and fix metric export crash (label: ready)
#2857 opened Jun 9, 2026 by ortibazar Collaborator
Fix kv_cache clearing when it's a tuple (label: ready)
#2856 opened Jun 9, 2026 by xuefgu Collaborator
Add async scheduling support for Continue Decode
#2855 opened Jun 9, 2026 by pv97 Collaborator
Implement mem_get_info for tpu_platform (label: ready)
#2854 opened Jun 9, 2026 by lxhfirenking Collaborator
Merge shared expert all-reduce with FusedMoE all-reduce (label: ready)
#2849 opened Jun 9, 2026 by lenscloth Collaborator
Lihao/mtp gemma4 4
#2846 opened Jun 8, 2026 by Lumosis Collaborator Draft
Fix fused MoE imports for vLLM FusedMoE/MoERunner inversion (label: ready)
#2844 opened Jun 8, 2026 by guowei-dev Collaborator
Customized a experimental RPA kernel for DCP usecase
#2842 opened Jun 8, 2026 by weiyu0824 Collaborator
Fix Eagle3 speculative decoding for models with use_aux_hidden_state=False (label: ready)
#2841 opened Jun 8, 2026 by huybro Collaborator
[KV Offload] Enable KV offloading for attention data parallelism
#2840 opened Jun 8, 2026 by amitkumar307d
3 tasks done
Get the right rope_scaling values from config. (label: ready)
#2838 opened Jun 7, 2026 by lc5211 Collaborator
Add SparseCore ragged_gather_v2 + ragged_gather_reduce_v2 MoE kernels (label: ready)
#2836 opened Jun 6, 2026 by guowei-dev Collaborator
Add Support for Kernel Tuner Specific Flags
#2834 opened Jun 6, 2026 by patrickji2014 Collaborator
Flush pending async TPU outputs on empty schedules
#2833 opened Jun 5, 2026 by YJYJLee
[MLA] Params mapping for more customized benchmarking (label: ready)
#2831 opened Jun 5, 2026 by BirdsOfAFthr Collaborator
[Multimodal] Add support for mm-encoder-tp-mode for ViT
#2830 opened Jun 5, 2026 by kwang3939 Collaborator
Fixes for TPU device topology (label: ready)
#2827 opened Jun 5, 2026 by dannawang0221 Collaborator
[Experiment] Patrickji.gemma4 batched rpa tuning exp
#2825 opened Jun 5, 2026 by patrickji2014 Collaborator
add short conv1d kernel (label: ready)
#2823 opened Jun 4, 2026 by yaochengji Collaborator
3 tasks done