[AMDGPU] Avoid invalid bitcasts for odd-width vector types #196609

Open
PlutoDog95 wants to merge 1 commit into llvm:main from PlutoDog95:amdgpu-fix-odd-vector-bitcast

Conversation

@PlutoDog95

Fix an assertion failure in AMDGPULateCodeGenPrepare caused by
invalid bitcasts involving vectors whose integer element width is
not a multiple of 8 bits, such as <3 x i31>.

The live register optimization attempts to repack vectors through
bitcasts, but these transformations are not valid for such
element types.

Skip the optimization for vector types whose element widths are
not a multiple of 8 bits, and bail out gracefully when conversion
is not possible.

Add a regression test for the crash.
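
As a rough illustration of why the repacking fails for <3 x i31>: a bitcast needs the source and destination types to have the same total bit width, and 3 x 31 = 93 bits cannot equal the width of any vector of 32-bit lanes. The standalone C++ sketch below only models that arithmetic. `VecShape`, `totalBits`, and `canBitcast` are hypothetical names for illustration and are not part of LLVM.

```cpp
#include <cstdint>

// Hypothetical stand-in for a fixed vector type:
// element count and element width in bits.
struct VecShape {
  std::uint64_t NumElts;
  std::uint64_t EltBits;
};

// Total width of the vector in bits.
constexpr std::uint64_t totalBits(VecShape V) { return V.NumElts * V.EltBits; }

// A bitcast is only well-formed when both sides have the same total bit width.
constexpr bool canBitcast(VecShape From, VecShape To) {
  return totalBits(From) == totalBits(To);
}

static_assert(totalBits({3, 31}) == 93, "<3 x i31> is 93 bits wide");
static_assert(!canBitcast({3, 31}, {3, 32}), "93 bits vs 96 bits");
static_assert(canBitcast({4, 8}, {1, 32}), "<4 x i8> and <1 x i32> are both 32 bits");
```

The `static_assert` lines act as compile-time checks, so the sketch verifies itself when compiled.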

Reproducer:

define void @repro(ptr %p, <3 x i1> %mask) {
  %x = call <3 x i31> @llvm.masked.load.v3i31.p0(
    ptr %p, <3 x i1> %mask, <3 x i31> zeroinitializer)

  call void @llvm.masked.store.v3i31.p0(
    <3 x i31> %x, ptr null, <3 x i1> %mask)

  ret void
}

Fixes #196582

@github-actions

github-actions Bot commented May 8, 2026

Thank you for submitting a Pull Request (PR) to the LLVM Project!

This PR will be automatically labeled and the relevant teams will be notified.

If you wish to, you can add reviewers by using the "Reviewers" section on this page.

If this is not working for you, it is probably because you do not have write permissions for the repository. In which case you can instead tag reviewers by name in a comment by using @ followed by their GitHub username.

If you have received no comments on your PR for a week, you can request a review by "ping"ing the PR by adding a comment “Ping”. The common courtesy "ping" rate is once a week. Please remember that you are asking for valuable time from other developers.

If you have further questions, they may be answered by the LLVM GitHub User Guide.

You can also ask questions in a comment on this PR, on the LLVM Discord or on the forums.

llvmorg-github-actions Bot added the backend:AMDGPU, clang:frontend, and clang:static analyzer labels May 8, 2026
@llvmorg-github-actions

llvmorg-github-actions Bot commented May 8, 2026

@llvm/pr-subscribers-clang-static-analyzer-1

@llvm/pr-subscribers-backend-amdgpu

Author: Vineet Agarwal (PlutoDog95)

Changes

Fix an assertion failure in AMDGPULateCodeGenPrepare caused by
invalid bitcasts involving vectors whose integer element width is
not a multiple of 8 bits, such as <3 x i31>.

The live register optimization attempts to repack vectors through
bitcasts, but these transformations are not valid for such
element types.

Skip the optimization for vector types whose element widths are
not a multiple of 8 bits, and bail out gracefully when conversion
is not possible.

Add a regression test for the crash.

Reproducer:

define void @repro(ptr %p, <3 x i1> %mask) {
  %x = call <3 x i31> @llvm.masked.load.v3i31.p0(
    ptr %p, <3 x i1> %mask, <3 x i31> zeroinitializer)

  call void @llvm.masked.store.v3i31.p0(
    <3 x i31> %x, ptr null, <3 x i1> %mask)

  ret void
}

Fixes #196582


Full diff: https://github.com/llvm/llvm-project/pull/196609.diff

7 Files Affected:

  • (modified) clang/lib/Sema/Sema.cpp (+5)
  • (modified) clang/lib/StaticAnalyzer/Checkers/EnumCastOutOfRangeChecker.cpp (+7-3)
  • (added) clang/test/Analysis/Inputs/enum-system-header.h (+8)
  • (added) clang/test/Analysis/enum-cast-out-of-range-system-header.cpp (+12)
  • (added) clang/test/SemaCXX/warn-unused-constexpr-function.cpp (+24)
  • (modified) llvm/lib/Target/AMDGPU/AMDGPULateCodeGenPrepare.cpp (+4-1)
  • (added) llvm/test/CodeGen/AMDGPU/odd-vector-bitcast-crash.ll (+19)
diff --git a/clang/lib/Sema/Sema.cpp b/clang/lib/Sema/Sema.cpp
index 8a68f2f19bf3d..9c8a84668454a 100644
--- a/clang/lib/Sema/Sema.cpp
+++ b/clang/lib/Sema/Sema.cpp
@@ -891,6 +891,11 @@ static bool ShouldRemoveFromUnused(Sema *SemaRef, const DeclaratorDecl *D) {
     return true;
 
   if (const FunctionDecl *FD = dyn_cast<FunctionDecl>(D)) {
+    // If a constexpr function is referenced for constant evaluation,
+    // don't warn even if it is not odr-used.
+    if (FD->isReferenced() && FD->isConstexpr())
+      return true;
+
     // If this is a function template and none of its specializations is used,
     // we should warn.
     if (FunctionTemplateDecl *Template = FD->getDescribedFunctionTemplate())
diff --git a/clang/lib/StaticAnalyzer/Checkers/EnumCastOutOfRangeChecker.cpp b/clang/lib/StaticAnalyzer/Checkers/EnumCastOutOfRangeChecker.cpp
index 76a1470aaac44..32a73a1d5c44f 100644
--- a/clang/lib/StaticAnalyzer/Checkers/EnumCastOutOfRangeChecker.cpp
+++ b/clang/lib/StaticAnalyzer/Checkers/EnumCastOutOfRangeChecker.cpp
@@ -85,8 +85,13 @@ void EnumCastOutOfRangeChecker::reportWarning(CheckerContext &C,
                                               const CastExpr *CE,
                                               const EnumDecl *E) const {
   assert(E && "valid EnumDecl* is expected");
-  if (const ExplodedNode *N = C.generateNonFatalErrorNode()) {
-    std::string ValueStr = "", NameStr = "the enum";
+    auto &SM = C.getSourceManager();
+
+    if (SM.isInSystemHeader(CE->getExprLoc()))
+      return;
+
+    if (const ExplodedNode *N = C.generateNonFatalErrorNode()) {
+      std::string ValueStr = "", NameStr = "the enum";
 
     // Try to add details to the message:
     const auto ConcreteValue =
@@ -101,7 +106,6 @@ void EnumCastOutOfRangeChecker::reportWarning(CheckerContext &C,
     std::string Msg = formatv("The value{0} provided to the cast expression is "
                               "not in the valid range of values for {1}",
                               ValueStr, NameStr);
-
     auto BR = std::make_unique<PathSensitiveBugReport>(EnumValueCastOutOfRange,
                                                        Msg, N);
     bugreporter::trackExpressionValue(N, CE->getSubExpr(), *BR);
diff --git a/clang/test/Analysis/Inputs/enum-system-header.h b/clang/test/Analysis/Inputs/enum-system-header.h
new file mode 100644
index 0000000000000..53c6bdac24093
--- /dev/null
+++ b/clang/test/Analysis/Inputs/enum-system-header.h
@@ -0,0 +1,8 @@
+enum MyEnum {
+  A = 1,
+  B = 2
+};
+
+static inline MyEnum bad_cast(int x) {
+  return (MyEnum)x;
+}
diff --git a/clang/test/Analysis/enum-cast-out-of-range-system-header.cpp b/clang/test/Analysis/enum-cast-out-of-range-system-header.cpp
new file mode 100644
index 0000000000000..073cde677bcd1
--- /dev/null
+++ b/clang/test/Analysis/enum-cast-out-of-range-system-header.cpp
@@ -0,0 +1,12 @@
+// RUN: %clang_analyze_cc1 \
+// RUN:   -analyzer-checker=core,optin.core.EnumCastOutOfRange \
+// RUN:   -isystem %S/Inputs \
+// RUN:   -verify %s
+
+#include "enum-system-header.h"
+
+void test() {
+  bad_cast(100);
+}
+
+// expected-no-diagnostics
diff --git a/clang/test/SemaCXX/warn-unused-constexpr-function.cpp b/clang/test/SemaCXX/warn-unused-constexpr-function.cpp
new file mode 100644
index 0000000000000..329cfe954c1a6
--- /dev/null
+++ b/clang/test/SemaCXX/warn-unused-constexpr-function.cpp
@@ -0,0 +1,24 @@
+// RUN: %clang_cc1 -std=c++14 -Wunused-function -fsyntax-only %s
+
+static constexpr bool returnInt(int) { return true; }
+
+template <bool B>
+struct select;
+
+template <>
+struct select<true> {
+  using type = int;
+};
+
+template <>
+struct select<false> {
+  using type = float;
+};
+
+template <typename T>
+typename select<returnInt(T{})>::type make() {
+  return T{};
+}
+
+int makeInt() { return make<int>(); }
+float makeFloat() { return make<float>(); }
diff --git a/llvm/lib/Target/AMDGPU/AMDGPULateCodeGenPrepare.cpp b/llvm/lib/Target/AMDGPU/AMDGPULateCodeGenPrepare.cpp
index 63e265612cbf7..f9cd8c9a68139 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPULateCodeGenPrepare.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPULateCodeGenPrepare.cpp
@@ -250,6 +250,8 @@ Type *LiveRegOptimizer::calculateConvertType(Type *OriginalType) {
 Value *LiveRegOptimizer::convertToOptType(Instruction *V,
                                           BasicBlock::iterator &InsertPt) {
   FixedVectorType *VTy = cast<FixedVectorType>(V->getType());
+  if (VTy->getScalarSizeInBits() % 8 != 0)
+    return nullptr;
   Type *NewTy = calculateConvertType(V->getType());
 
   TypeSize OriginalSize = DL.getTypeSizeInBits(VTy);
@@ -382,7 +384,8 @@ bool LiveRegOptimizer::optimizeLiveType(
     if (!ValMap.contains(D)) {
       BasicBlock::iterator InsertPt = std::next(D->getIterator());
       Value *ConvertVal = convertToOptType(D, InsertPt);
-      assert(ConvertVal);
+      if (!ConvertVal)
+        return false;
       ValMap[D] = ConvertVal;
     }
   }
diff --git a/llvm/test/CodeGen/AMDGPU/odd-vector-bitcast-crash.ll b/llvm/test/CodeGen/AMDGPU/odd-vector-bitcast-crash.ll
new file mode 100644
index 0000000000000..030c2fd9d506a
--- /dev/null
+++ b/llvm/test/CodeGen/AMDGPU/odd-vector-bitcast-crash.ll
@@ -0,0 +1,19 @@
+; RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx1100 -O2 %s -o /dev/null
+
+target triple = "amdgcn-amd-amdhsa"
+
+define void @repro(ptr %p, <3 x i1> %mask) {
+  %x = call <3 x i31> @llvm.masked.load.v3i31.p0(
+    ptr %p, <3 x i1> %mask, <3 x i31> zeroinitializer)
+
+  call void @llvm.masked.store.v3i31.p0(
+    <3 x i31> %x, ptr null, <3 x i1> %mask)
+
+  ret void
+}
+
+declare <3 x i31> @llvm.masked.load.v3i31.p0(
+  ptr, <3 x i1>, <3 x i31>)
+
+declare void @llvm.masked.store.v3i31.p0(
+  <3 x i31>, ptr, <3 x i1>)

AMDGPU late codegen prepare may attempt to create invalid
bitcasts for vectors with non-byte-aligned integer element
types such as `<3 x i31>`.

Skip the live register optimization for vector types whose
element bit widths are not byte aligned.

Add a regression test for the assertion crash.
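
The guard-and-bail-out shape of this change can be sketched outside LLVM as follows. This is a simplified model under stated assumptions, not the pass itself: `convertWidth` and `optimizeAll` are hypothetical stand-ins for `convertToOptType` and `optimizeLiveType`, an empty `std::optional` plays the role of the `nullptr` return, and the rounding to 32-bit lanes is illustrative only.

```cpp
#include <optional>
#include <vector>

// Hypothetical description of a vector value: element count and element width.
struct Shape {
  unsigned NumElts;
  unsigned EltBits;
};

// Hypothetical stand-in for convertToOptType: refuses element widths that are
// not a multiple of 8 bits instead of attempting a conversion.
std::optional<unsigned> convertWidth(unsigned NumElts, unsigned EltBits) {
  if (EltBits % 8 != 0)
    return std::nullopt;          // mirrors the early `return nullptr`
  unsigned Total = NumElts * EltBits;
  return (Total + 31) / 32 * 32;  // round up to whole 32-bit lanes (illustrative)
}

// Hypothetical stand-in for optimizeLiveType: gives up on the whole
// transformation as soon as one value cannot be converted.
bool optimizeAll(const std::vector<Shape> &Defs) {
  for (const Shape &S : Defs) {
    if (!convertWidth(S.NumElts, S.EltBits))
      return false;               // mirrors replacing `assert(ConvertVal)`
  }
  return true;
}
```

For example, `optimizeAll({{4, 8}, {3, 31}})` returns `false` in this model because the second shape has 31-bit elements, while a list containing only 8-bit and 16-bit element shapes is accepted.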
PlutoDog95 force-pushed the amdgpu-fix-odd-vector-bitcast branch from f39733f to 338e709 on May 8, 2026 19:04

Labels

backend:AMDGPU, clang:frontend, clang:static analyzer

Development

Successfully merging this pull request may close these issues.

[AMDGPU] llc assertion failure Invalid cast! on masked <3 x i31> load/store

1 participant