Description
We observed a significant accuracy mismatch when converting a SegVit ONNX model to a TensorRT engine. Using polygraphy debug reduce, we narrowed the issue down to the normalization layers (an InstanceNormalization / GroupNorm pattern).
The mismatch starts from very early layers in the model and propagates through the entire network, eventually causing large output deviations.
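For reference, the computation the suspect layers perform follows the ONNX InstanceNormalization definition: each (instance, channel) slice is normalized over its spatial dimensions, then a per-channel affine is applied. A minimal NumPy sketch (the eps value is the ONNX default; the tensor shape is illustrative, not the model's actual shape):

```python
import numpy as np

def instance_norm(x, scale, bias, eps=1e-5):
    """ONNX-style InstanceNormalization: normalize each (N, C) slice
    over the spatial axes, then apply a per-channel scale and bias."""
    # x has layout (N, C, H, W); statistics are taken over H and W only
    mean = x.mean(axis=(2, 3), keepdims=True)
    var = x.var(axis=(2, 3), keepdims=True)
    x_hat = (x - mean) / np.sqrt(var + eps)
    return scale.reshape(1, -1, 1, 1) * x_hat + bias.reshape(1, -1, 1, 1)

# Illustrative input: identity affine so the output is purely the normalization
x = np.random.randn(1, 8, 16, 16).astype(np.float32)
y = instance_norm(x, np.ones(8, dtype=np.float32), np.zeros(8, dtype=np.float32))
```

Because every downstream activation is scaled by this output, even a small per-channel error here (e.g. from a reduced-precision variance computation) is amplified through the rest of the network, which matches the propagation behavior we observed.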
Environment
TensorRT version: 10.13.0.35
GPU: RTX 3080
CUDA version: 12.8
OS: Ubuntu 22.04
Steps To Reproduce
Run Polygraphy debug reduce:
polygraphy debug reduce vit_seg_simp.onnx --mode bisect --output reduced_model.onnx --check polygraphy run polygraphy_debug.onnx --onnxrt --trt
Observe accuracy mismatch between ONNX Runtime and TensorRT.
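The mismatch is flagged by an element-wise absolute/relative tolerance check of the kind polygraphy applies when comparing ONNX Runtime and TensorRT outputs. A minimal sketch of that check (the tolerance values and sample arrays here are illustrative, not taken from the actual run):

```python
import numpy as np

def outputs_match(ref, out, atol=1e-5, rtol=1e-5):
    """Element-wise tolerance check: |out - ref| <= atol + rtol * |ref|
    must hold for every element. Returns (passed, max_abs_diff)."""
    diff = np.abs(out - ref)
    tol = atol + rtol * np.abs(ref)
    return bool(np.all(diff <= tol)), float(diff.max())

ref = np.array([1.0, 2.0, 3.0], dtype=np.float32)  # e.g. ONNX Runtime output
out = ref + 1e-6   # tiny deviation: within tolerance
bad = ref + 1e-2   # deviation of the magnitude we see: fails the check
```

Running this comparison per-layer on the reduced model is what localizes the failure to the normalization pattern.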
Thanks!
log.txt
onnx file