Pull requests: vllm-project/tpu-inference
Fix DP sampling padding bug by mapping correctly to JAX sharding slices
ready
ONLY add when PR is ready to merge/full CI is needed
#2860
opened Jun 10, 2026 by
huybro
Collaborator
2 of 3 tasks
[Jax][Gemma4] Fuse gate_up_proj for Gemma-4
ready
#2859
opened Jun 9, 2026 by
lk-chen
Collaborator
support selective MLCompass filtering and fix metric export crash
ready
#2857
opened Jun 9, 2026 by
ortibazar
Collaborator
Fix kv_cache clearing when it's a tuple
ready
#2856
opened Jun 9, 2026 by
xuefgu
Collaborator
Add async scheduling support for Continue Decode
#2855
opened Jun 9, 2026 by
pv97
Collaborator
Implement mem_get_info for tpu_platform
ready
#2854
opened Jun 9, 2026 by
lxhfirenking
Collaborator
[Gemma4][KV Cache Host offload] Extend KV cache offload support for hybrid attention models
#2851
opened Jun 9, 2026 by
amanseervi
Contributor
Draft
Merge shared expert all-reduce with FusedMoE all-reduce
ready
#2849
opened Jun 9, 2026 by
lenscloth
Collaborator
Fix fused MoE imports for vLLM FusedMoE/MoERunner inversion
ready
#2844
opened Jun 8, 2026 by
guowei-dev
Collaborator
Customized an experimental RPA kernel for DCP use case
#2842
opened Jun 8, 2026 by
weiyu0824
Collaborator
Fix Eagle3 speculative decoding for models with use_aux_hidden_state=False
ready
#2841
opened Jun 8, 2026 by
huybro
Collaborator
[KV Offload] Enable KV offloading for attention data parallelism
#2840
opened Jun 8, 2026 by
amitkumar307d
3 tasks done
Get the right rope_scaling values from config.
ready
#2838
opened Jun 7, 2026 by
lc5211
Collaborator
Add SparseCore ragged_gather_v2 + ragged_gather_reduce_v2 MoE kernels
ready
#2836
opened Jun 6, 2026 by
guowei-dev
Collaborator
Add Support for Kernel Tuner Specific Flags
#2834
opened Jun 6, 2026 by
patrickji2014
Collaborator
[MLA] Params mapping for more customized benchmarking
ready
#2831
opened Jun 5, 2026 by
BirdsOfAFthr
Collaborator
[Multimodal] Add support for mm-encoder-tp-mode for ViT
#2830
opened Jun 5, 2026 by
kwang3939
Collaborator
Fixes for TPU device topology
ready
#2827
opened Jun 5, 2026 by
dannawang0221
Collaborator
[Experiment] Patrickji.gemma4 batched rpa tuning exp
#2825
opened Jun 5, 2026 by
patrickji2014
Collaborator
add short conv1d kernel
ready
#2823
opened Jun 4, 2026 by
yaochengji
Collaborator
3 tasks done