🐛 [Bug] dynamic_block_quantize converter received an input of (11520, 3, 2, 16, 16) shape. Supported shapes: 2D or 3D #4201

@zewenli98

Description

Bug Description

I got the following error when quantizing Alpamayo 1 with NVFP4:

2026-04-21 18:01:13,791 - torch_tensorrt.dynamo.conversion._TRTInterpreter - INFO - Converted node hidden_states [hidden_states] (Inputs: () | Outputs: (hidden_states: (11520, 1536)@torch.float16))
2026-04-21 18:01:13,791 - torch_tensorrt.dynamo.conversion._TRTInterpreter - INFO - Converted node visual.patch_embed/_reshape_copy [aten._reshape_copy.default] (Inputs: (hidden_states: (11520, 1536)@torch.float16, [-1, 3, 2, 16, 16]) | Outputs: (_reshape_copy: (11520, 3, 2, 16, 16)@torch.float16))
2026-04-21 18:01:13,792 - torch_tensorrt.dynamo.conversion._TRTInterpreter - INFO - Converted node visual_patch_embed_proj_input_quantizer__amax [visual.patch_embed.proj.input_quantizer._amax] (Inputs: () | Outputs: (visual_patch_embed_proj_input_quantizer__amax: ()@torch.float16))
2026-04-21 18:01:13,795 - alpamayo_r1.trt.vision - ERROR - TRT compilation failed: dynamic_block_quantize converter received an input of (11520, 3, 2, 16, 16) shape. Supported shapes: 2D or 3D

While executing %dynamic_block_quantize_op : [num_users=1] = call_function[target=torch.ops.tensorrt.dynamic_block_quantize_op.default](args = (%_reshape_copy, 16, %visual_patch_embed_proj_input_quantizer__amax, 4, 2, 8, 4), kwargs = {})
Original traceback:
File "/home/scratch.zewenl_sw/docker_workspace/cehongwang/alpamayo/src/alpamayo_r1/trt/vision.py", line 192, in forward
    hidden_states = self.visual.patch_embed(hidden_states)
  File "/home/scratch.zewenl_sw/docker_workspace/alpamayo/ar1_venv-b300/lib/python3.12/site-packages/transformers/models/qwen3_vl/modeling_qwen3_vl.py", line 75, in forward
    hidden_states = self.proj(hidden_states.to(dtype=target_dtype)).view(-1, self.embed_dim)
  File "/home/scratch.zewenl_sw/docker_workspace/tmp/TensorRT-Model-Optimizer/modelopt/torch/quantization/nn/modules/quant_module.py", line 232, in forward
    return super().forward(input, *args, **kwargs)
  File "/home/scratch.zewenl_sw/docker_workspace/tmp/TensorRT-Model-Optimizer/modelopt/torch/quantization/nn/modules/quant_module.py", line 164, in forward
    input = self.input_quantizer(input)
  File "/home/scratch.zewenl_sw/docker_workspace/tmp/TensorRT-Model-Optimizer/modelopt/torch/quantization/nn/modules/tensor_quantizer.py", line 1086, in forward
    outputs = self._fake_quantize(inputs)
  File "/home/scratch.zewenl_sw/docker_workspace/tmp/TensorRT-Model-Optimizer/modelopt/torch/quantization/tensor_quant.py", line 552, in forward
    return _dynamic_block_quantize_forward(
Use tlparse to see full graph. (https://github.com/pytorch/tlparse?tab=readme-ov-file#tlparse-parse-structured-pt2-logs)
2026-04-21 18:01:13,796 - alpamayo_r1.trt.compile_trt - ERROR - Vision TRT compilation failed
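For context, here is a minimal plain-PyTorch sketch of the shape mismatch (this is illustrative only, not Torch-TensorRT or ModelOpt code): Qwen3-VL's `patch_embed` reshapes the 2D hidden states into a 5D tensor before the quantized projection, and the `dynamic_block_quantize` converter rejects anything above 3D. A hypothetical workaround would be to flatten to 2D before quantization and restore the 5D view afterwards, though whether that preserves the intended per-block amax layout depends on the quantizer's block axis.

```python
import torch

# Shapes taken from the log above.
hidden_states = torch.randn(11520, 1536, dtype=torch.float16)

# The 5D view produced by visual.patch_embed (the shape the converter rejects).
patches = hidden_states.reshape(-1, 3, 2, 16, 16)
assert patches.dim() == 5  # 5D -> "Supported shapes: 2D or 3D"

# Hypothetical workaround sketch: flatten to 2D for quantization,
# then restore the 5D view afterwards.
flat = patches.reshape(patches.shape[0], -1)  # (11520, 1536), 2D -> accepted
restored = flat.reshape(-1, 3, 2, 16, 16)
assert torch.equal(restored, patches)
```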

To Reproduce

Expected behavior

Environment

Build information about Torch-TensorRT can be found by turning on debug messages

  • Torch-TensorRT Version (e.g. 1.0.0):
  • PyTorch Version (e.g. 1.0):
  • CPU Architecture:
  • OS (e.g., Linux):
  • How you installed PyTorch (conda, pip, libtorch, source):
  • Build command you used (if compiling from source):
  • Are you using local sources or building from archives:
  • Python version:
  • CUDA version:
  • GPU models and configuration:
  • Any other relevant information:

Additional context
