Fix issue in mul_mat_id for OpenVINO backend #163

Merged: zhaixuejun1993 merged 1 commit into ravi9:dev_backend_openvino from zhaixuejun1993:xuejun/arch-llama on May 15, 2026

@zhaixuejun1993
Collaborator

This pull request refines the handling and shape management of expert weights and activations in the OpenVINO backend for the MUL_MAT_ID operation. The core improvements ensure that multi-expert weight tensors retain their full dimensionality, and that input/output tensor shapes are rebuilt more robustly, improving compatibility with dynamic or reshaped input graphs.

Key changes include:

Shape Handling and Tensor Materialization:

  • In ggml-decoder.cpp, when materializing non-quantized expert weights, the code now preserves the full reversed 4D shape, ensuring that the expert dimension is not collapsed. This prevents issues where later operations (like Gather/MatMul) would only see a single expert slice.
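
To make the shape convention concrete, here is a minimal standalone C++ sketch. It does not use OpenVINO and is not the code in ggml-decoder.cpp. It assumes the ggml convention that `ne[0]` is the innermost dimension and that expert weights are laid out as `ne = {cols, rows, n_expert, 1}`. The helper names, and the 2D "collapsed" variant used to illustrate the failure mode, are hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper: ggml lists dimensions innermost-first in ne[0..3],
// while an OpenVINO-style shape lists them outermost-first, so the full
// 4D shape is simply ne reversed.
std::vector<int64_t> reversed_shape_4d(const std::array<int64_t, 4>& ne) {
    return {ne[3], ne[2], ne[1], ne[0]};
}

// Assumed illustration of the failure mode: keeping only the two innermost
// dimensions drops the expert axis, leaving room for one expert slice.
std::vector<int64_t> collapsed_shape_2d(const std::array<int64_t, 4>& ne) {
    return {ne[1], ne[0]};
}

// Number of elements a tensor of the given shape holds.
int64_t element_count(const std::vector<int64_t>& shape) {
    int64_t n = 1;
    for (int64_t d : shape) {
        assert(d > 0);  // this sketch only handles static, positive dims
        n *= d;
    }
    return n;
}
```

For `ne = {64, 128, 8, 1}` (8 experts), `reversed_shape_4d` gives `{1, 8, 128, 64}` with 65536 elements, while `collapsed_shape_2d` gives `{128, 64}` with 8192 elements, which is the size of a single expert slice.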

Dynamic Shape Reconstruction and Reshaping:

  • In mul_mat_id.cpp, the logic for squeezing singleton axes from weights, activations, and ids is replaced with explicit dynamic shape reconstruction using ShapeOf and Reshape. This makes the code robust to input tensors that may have undergone reshaping or view operations, ensuring correct logical ranks regardless of input shape permutations.
  • The output shape of the MatMul result is now explicitly constructed using dynamic shape information and the expected output rank, replacing previous fixed unsqueeze/squeeze logic. This ensures that the output tensor always matches the required 4D shape, with checks for static rank and row dimension.
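
The difference between squeezing singleton axes and rebuilding a shape to a known logical rank can be sketched in plain C++. This is only an illustration under assumptions, not the logic in mul_mat_id.cpp: `rebuild_shape` is a hypothetical stand-in for reading the runtime dimensions (as ShapeOf would) and reshaping to a target rank by folding the extra leading dimensions together, and `to_rank4` is a hypothetical stand-in for padding a result back to 4D.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical: rebuild a shape of exactly `rank` dims from runtime dims by
// folding all extra leading dims into the first output dim.
std::vector<int64_t> rebuild_shape(const std::vector<int64_t>& dims, std::size_t rank) {
    assert(rank >= 1 && dims.size() >= rank);
    const std::size_t lead = dims.size() - rank + 1;  // dims folded together
    int64_t folded = 1;
    for (std::size_t i = 0; i < lead; ++i) folded *= dims[i];
    std::vector<int64_t> out{folded};
    out.insert(out.end(), dims.begin() + static_cast<std::ptrdiff_t>(lead), dims.end());
    return out;
}

// For comparison: drop every axis of size 1. The resulting rank depends on
// the data (for example, on whether there is a single token).
std::vector<int64_t> squeeze_all(const std::vector<int64_t>& dims) {
    std::vector<int64_t> out;
    for (int64_t d : dims) {
        if (d != 1) out.push_back(d);
    }
    return out;
}

// Hypothetical: pad a shape with leading 1s so it is always 4D.
std::vector<int64_t> to_rank4(const std::vector<int64_t>& dims) {
    assert(dims.size() <= 4);
    std::vector<int64_t> out(4 - dims.size(), 1);
    out.insert(out.end(), dims.begin(), dims.end());
    return out;
}
```

With these helpers, `rebuild_shape({1, 8, 128, 64}, 3)` yields `{8, 128, 64}`. For a single-token activation `{1, 1, 1, 64}`, `squeeze_all` yields the rank-1 shape `{64}`, whereas `rebuild_shape(..., 3)` keeps the expected rank and yields `{1, 1, 64}`.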

General Improvements:

  • Added the missing include for concat.hpp to support the new shape concatenation logic.

@zhaixuejun1993 merged commit 5f58d5d into ravi9:dev_backend_openvino on May 15, 2026
3 of 12 checks passed