[VPlan] Unify inner and outer loop paths (NFCI). #192868

Merged
fhahn merged 12 commits into llvm:main from fhahn:vplan-unify-native-path
May 8, 2026

fhahn (Contributor) commented Apr 19, 2026

Combine the logic of tryToBuildVPlanWithVPRecipes and tryToBuildVPlan, as well as planInVPlanNativePath and plan.

This unifies the code paths that construct plans for inner and outer loop vectorization and removes some duplication. It also ensures we run almost the same VPlan transformations in both modes. Currently, a few code paths still need to be guarded with a check for whether we are dealing with an inner or an outer loop.
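
One of the guards in the diff sets up inner-loop-only state in a `std::optional` and passes it on as a nullable pointer (`LVer ? &*LVer : nullptr`). A minimal standalone sketch of that pattern follows; `VersioningInfo`, `describePlan`, and `buildPlanDescription` are hypothetical names used only for illustration and are not part of LLVM.

```cpp
#include <cassert>
#include <optional>
#include <string>

// Hypothetical stand-in for state that only exists for inner loops
// (the patch uses LoopVersioning in this role).
struct VersioningInfo {
  std::string Description;
};

// Hypothetical consumer that accepts a nullable pointer, so one signature
// serves both the inner-loop and the outer-loop case.
inline std::string describePlan(bool IsInnerLoop, const VersioningInfo *Info) {
  assert((IsInnerLoop || !Info) && "outer loops carry no versioning info");
  std::string Result = IsInnerLoop ? "inner" : "outer";
  if (Info)
    Result += " + " + Info->Description;
  return Result;
}

// Single entry point: the inner-only state is created behind a guard.
inline std::string buildPlanDescription(bool IsInnerLoop) {
  std::optional<VersioningInfo> Info;
  if (IsInnerLoop)
    Info.emplace(VersioningInfo{"runtime checks"});
  // Convert the optional into a nullable pointer for the shared callee.
  return describePlan(IsInnerLoop, Info ? &*Info : nullptr);
}
```

The optional keeps the object on the stack and avoids constructing it at all on the outer-loop path, which matters when construction itself depends on analyses that were never run for outer loops.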

llvmbot (Member) commented Apr 19, 2026

@llvm/pr-subscribers-vectorizers

@llvm/pr-subscribers-llvm-transforms

Author: Florian Hahn (fhahn)

Patch is 29.93 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/192868.diff

4 Files Affected:

  • (modified) llvm/lib/Transforms/Vectorize/LoopVectorizationPlanner.h (+10-26)
  • (modified) llvm/lib/Transforms/Vectorize/LoopVectorize.cpp (+164-248)
  • (modified) llvm/lib/Transforms/Vectorize/VPlan.cpp (-21)
  • (modified) llvm/test/Transforms/LoopVectorize/explicit_outer_detection.ll (+1-1)
diff --git a/llvm/lib/Transforms/Vectorize/LoopVectorizationPlanner.h b/llvm/lib/Transforms/Vectorize/LoopVectorizationPlanner.h
index 56a2fc8ecd07a..58b642b54a2ec 100644
--- a/llvm/lib/Transforms/Vectorize/LoopVectorizationPlanner.h
+++ b/llvm/lib/Transforms/Vectorize/LoopVectorizationPlanner.h
@@ -562,10 +562,6 @@ class LoopVectorizationPlanner {
   /// interleaving should be avoided up-front, no plans are generated.
   void plan(ElementCount UserVF, unsigned UserIC);
 
-  /// Use the VPlan-native path to plan how to best vectorize, return the best
-  /// VF and its cost.
-  VectorizationFactor planInVPlanNativePath(ElementCount UserVF);
-
   /// Return the VPlan for \p VF. At the moment, there is always a single VPlan
   /// for each VF.
   VPlan &getPlanFor(ElementCount VF) const;
@@ -654,34 +650,22 @@ class LoopVectorizationPlanner {
       unsigned OrigLoopInvocationWeight, unsigned EstimatedVFxUF,
       bool DisableRuntimeUnroll);
 
-protected:
-  /// Build VPlans for power-of-2 VF's between \p MinVF and \p MaxVF inclusive,
-  /// according to the information gathered by Legal when it checked if it is
-  /// legal to vectorize the loop.
-  void buildVPlans(ElementCount MinVF, ElementCount MaxVF);
-
 private:
-  /// Build a VPlan according to the information gathered by Legal. \return a
-  /// VPlan for vectorization factors \p Range.Start and up to \p Range.End
-  /// exclusive, possibly decreasing \p Range.End. If no VPlan can be built for
-  /// the input range, set the largest included VF to the maximum VF for which
-  /// no plan could be built.
-  VPlanPtr tryToBuildVPlan(VFRange &Range);
-
-  /// Build a VPlan using VPRecipes according to the information gather by
-  /// Legal. This method is only used for the legacy inner loop vectorizer.
-  /// \p Range's largest included VF is restricted to the maximum VF the
-  /// returned VPlan is valid for. If no VPlan can be built for the input range,
-  /// set the largest included VF to the maximum VF for which no plan could be
-  /// built. Each VPlan is built starting from a copy of \p InitialPlan, which
-  /// is a plain CFG VPlan wrapping the original scalar loop.
+  /// Build a VPlan using VPRecipes according to the information gathered by
+  /// Legal and VPlan-based analysis. For outer loops, performs basic recipe
+  /// conversion only. For inner loops, \p Range's largest included VF is
+  /// restricted to the maximum VF the returned VPlan is valid for. If no VPlan
+  /// can be built for the input range, set the largest included VF to the
+  /// maximum VF for which no plan could be built. Each VPlan is built starting
+  /// from a copy of \p InitialPlan, which is a plain CFG VPlan wrapping the
+  /// original scalar loop.
   VPlanPtr tryToBuildVPlanWithVPRecipes(VPlanPtr InitialPlan, VFRange &Range,
                                         LoopVersioning *LVer);
 
   /// Build VPlans for power-of-2 VF's between \p MinVF and \p MaxVF inclusive,
   /// according to the information gathered by Legal when it checked if it is
-  /// legal to vectorize the loop. This method creates VPlans using VPRecipes.
-  void buildVPlansWithVPRecipes(ElementCount MinVF, ElementCount MaxVF);
+  /// legal to vectorize the loop.
+  void buildVPlans(ElementCount MinVF, ElementCount MaxVF);
 
   /// Add ComputeReductionResult recipes to the middle block to compute the
   /// final reduction results. Add Select recipes to the latch block when
diff --git a/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp b/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
index e17a5b5434664..e13acdfa6bc9d 100644
--- a/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
+++ b/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
@@ -3446,8 +3446,61 @@ FixedScalableVFPair LoopVectorizationCostModel::computeFeasibleMaxVF(
   return Result;
 }
 
+// This function will select a scalable VF if the target supports scalable
+// vectors and a fixed one otherwise.
+// TODO: we could return a pair of values that specify the max VF and
+// min VF, to be used in `buildVPlans(MinVF, MaxVF)` instead of
+// `buildVPlans(VF, VF)`. We cannot do it because VPLAN at the moment
+// doesn't have a cost model that can choose which plan to execute if
+// more than one is generated.
+static ElementCount determineVPlanVF(const TargetTransformInfo &TTI,
+                                     LoopVectorizationCostModel &CM) {
+  auto [_, WidestType] = CM.getSmallestAndWidestTypes();
+
+  auto RegKind = TTI.enableScalableVectorization()
+                     ? TargetTransformInfo::RGK_ScalableVector
+                     : TargetTransformInfo::RGK_FixedWidthVector;
+
+  TypeSize RegSize = TTI.getRegisterBitWidth(RegKind);
+  unsigned N = RegSize.getKnownMinValue() / WidestType;
+  return ElementCount::get(N, RegSize.isScalable());
+}
+
 FixedScalableVFPair
 LoopVectorizationCostModel::computeMaxVF(ElementCount UserVF, unsigned UserIC) {
+  // For outer loops, use simple type-based heuristic VF. No cost model or
+  // memory dependence analysis is available.
+  if (!TheLoop->isInnermost()) {
+    ElementCount VF = UserVF;
+    if (VF.isZero()) {
+      VF = determineVPlanVF(TTI, *this);
+      LLVM_DEBUG(dbgs() << "LV: VPlan computed VF " << VF << ".\n");
+
+      // Make sure we have a VF > 1 for stress testing.
+      if (VPlanBuildStressTest && VF.isScalar()) {
+        LLVM_DEBUG(dbgs() << "LV: VPlan stress testing: "
+                          << "overriding computed VF.\n");
+        VF = ElementCount::getFixed(4);
+      }
+    } else if (VF.isScalable() && !TTI.supportsScalableVectors() &&
+               !ForceTargetSupportsScalableVectors) {
+      reportVectorizationFailure(
+          "Scalable vectorization requested but not supported by the target",
+          "the scalable user-specified vectorization width for outer-loop "
+          "vectorization cannot be used because the target does not support "
+          "scalable vectors.",
+          "ScalableVFUnfeasible", ORE, TheLoop);
+      return FixedScalableVFPair::getNone();
+    }
+    assert(isPowerOf2_32(VF.getKnownMinValue()) &&
+           "VF needs to be a power of two");
+    if (VF.isScalar())
+      return FixedScalableVFPair::getNone();
+    LLVM_DEBUG(dbgs() << "LV: Using " << (!UserVF.isZero() ? "user " : "")
+                      << "VF " << VF << " to build VPlans.\n");
+    return FixedScalableVFPair(VF);
+  }
+
   if (Legal->getRuntimePointerChecking()->Need && TTI.hasBranchDivergence()) {
     // TODO: It may be useful to do since it's still likely to be dynamically
     // uniform if the target can skip.
@@ -6532,85 +6585,7 @@ void LoopVectorizationCostModel::collectInLoopReductions() {
   }
 }
 
-// This function will select a scalable VF if the target supports scalable
-// vectors and a fixed one otherwise.
-// TODO: we could return a pair of values that specify the max VF and
-// min VF, to be used in `buildVPlans(MinVF, MaxVF)` instead of
-// `buildVPlans(VF, VF)`. We cannot do it because VPLAN at the moment
-// doesn't have a cost model that can choose which plan to execute if
-// more than one is generated.
-static ElementCount determineVPlanVF(const TargetTransformInfo &TTI,
-                                     LoopVectorizationCostModel &CM) {
-  unsigned WidestType;
-  std::tie(std::ignore, WidestType) = CM.getSmallestAndWidestTypes();
-
-  TargetTransformInfo::RegisterKind RegKind =
-      TTI.enableScalableVectorization()
-          ? TargetTransformInfo::RGK_ScalableVector
-          : TargetTransformInfo::RGK_FixedWidthVector;
-
-  TypeSize RegSize = TTI.getRegisterBitWidth(RegKind);
-  unsigned N = RegSize.getKnownMinValue() / WidestType;
-  return ElementCount::get(N, RegSize.isScalable());
-}
-
-VectorizationFactor
-LoopVectorizationPlanner::planInVPlanNativePath(ElementCount UserVF) {
-  ElementCount VF = UserVF;
-  // Outer loop handling: They may require CFG and instruction level
-  // transformations before even evaluating whether vectorization is profitable.
-  // Since we cannot modify the incoming IR, we need to build VPlan upfront in
-  // the vectorization pipeline.
-  if (!OrigLoop->isInnermost()) {
-    // If the user doesn't provide a vectorization factor, determine a
-    // reasonable one.
-    if (UserVF.isZero()) {
-      VF = determineVPlanVF(TTI, CM);
-      LLVM_DEBUG(dbgs() << "LV: VPlan computed VF " << VF << ".\n");
-
-      // Make sure we have a VF > 1 for stress testing.
-      if (VPlanBuildStressTest && (VF.isScalar() || VF.isZero())) {
-        LLVM_DEBUG(dbgs() << "LV: VPlan stress testing: "
-                          << "overriding computed VF.\n");
-        VF = ElementCount::getFixed(4);
-      }
-    } else if (UserVF.isScalable() && !TTI.supportsScalableVectors() &&
-               !ForceTargetSupportsScalableVectors) {
-      LLVM_DEBUG(dbgs() << "LV: Not vectorizing. Scalable VF requested, but "
-                        << "not supported by the target.\n");
-      reportVectorizationFailure(
-          "Scalable vectorization requested but not supported by the target",
-          "the scalable user-specified vectorization width for outer-loop "
-          "vectorization cannot be used because the target does not support "
-          "scalable vectors.",
-          "ScalableVFUnfeasible", ORE, OrigLoop);
-      return VectorizationFactor::Disabled();
-    }
-    assert(EnableVPlanNativePath && "VPlan-native path is not enabled.");
-    assert(isPowerOf2_32(VF.getKnownMinValue()) &&
-           "VF needs to be a power of two");
-    LLVM_DEBUG(dbgs() << "LV: Using " << (!UserVF.isZero() ? "user " : "")
-                      << "VF " << VF << " to build VPlans.\n");
-    buildVPlans(VF, VF);
-
-    if (VPlans.empty())
-      return VectorizationFactor::Disabled();
-
-    // For VPlan build stress testing, we bail out after VPlan construction.
-    if (VPlanBuildStressTest)
-      return VectorizationFactor::Disabled();
-
-    return {VF, 0 /*Cost*/, 0 /* ScalarCost */};
-  }
-
-  LLVM_DEBUG(
-      dbgs() << "LV: Not vectorizing. Inner loops aren't supported in the "
-                "VPlan-native path.\n");
-  return VectorizationFactor::Disabled();
-}
-
 void LoopVectorizationPlanner::plan(ElementCount UserVF, unsigned UserIC) {
-  assert(OrigLoop->isInnermost() && "Inner loop expected.");
   CM.collectValuesToIgnore();
   CM.collectElementTypesForWidening();
 
@@ -6618,6 +6593,16 @@ void LoopVectorizationPlanner::plan(ElementCount UserVF, unsigned UserIC) {
   if (!MaxFactors) // Cases that should not to be vectorized nor interleaved.
     return;
 
+  if (!OrigLoop->isInnermost()) {
+    // For outer loops, computeMaxVF returns a single non-scalar VF; build a
+    // plan for only that VF.
+    ElementCount VF =
+        MaxFactors.FixedVF ? MaxFactors.FixedVF : MaxFactors.ScalableVF;
+    buildVPlans(VF, VF);
+    LLVM_DEBUG(printPlans(dbgs()));
+    return;
+  }
+
   // Invalidate interleave groups if all blocks of loop will be predicated.
   if (CM.blockNeedsPredicationForAnyReason(OrigLoop->getHeader()) &&
       !useMaskedInterleavedAccesses(TTI)) {
@@ -6656,9 +6641,9 @@ void LoopVectorizationPlanner::plan(ElementCount UserVF, unsigned UserIC) {
             ElementCount::isKnownLT(EpilogueUserVF, UserVF) &&
             CM.selectUserVectorizationFactor(EpilogueUserVF)) {
           // Build a separate plan for the forced epilogue VF.
-          buildVPlansWithVPRecipes(EpilogueUserVF, EpilogueUserVF);
+          buildVPlans(EpilogueUserVF, EpilogueUserVF);
         }
-        buildVPlansWithVPRecipes(UserVF, UserVF);
+        buildVPlans(UserVF, UserVF);
         LLVM_DEBUG(printPlans(dbgs()));
         return;
       }
@@ -6677,13 +6662,11 @@ void LoopVectorizationPlanner::plan(ElementCount UserVF, unsigned UserIC) {
     VFCandidates.push_back(VF);
 
   CM.collectInLoopReductions();
-  for (const auto &VF : VFCandidates) {
-    // Collect Uniform and Scalar instructions after vectorization with VF.
+  for (auto VF : VFCandidates)
     CM.collectNonVectorizedAndSetWideningDecisions(VF);
-  }
 
-  buildVPlansWithVPRecipes(ElementCount::getFixed(1), MaxFactors.FixedVF);
-  buildVPlansWithVPRecipes(ElementCount::getScalable(1), MaxFactors.ScalableVF);
+  buildVPlans(ElementCount::getFixed(1), MaxFactors.FixedVF);
+  buildVPlans(ElementCount::getScalable(1), MaxFactors.ScalableVF);
 
   LLVM_DEBUG(printPlans(dbgs()));
 }
@@ -6917,6 +6900,12 @@ LoopVectorizationPlanner::computeBestVF() {
     }
   }
 
+  // For outer loops, the plan has a single vector VF determined by the
+  // heuristic. Return it directly since there is no scalar VF plan for cost
+  // comparison.
+  if (!OrigLoop->isInnermost())
+    return {VectorizationFactor(FirstPlan.getSingleVF(), 0, 0), &FirstPlan};
+
   LLVM_DEBUG(dbgs() << "LV: Computing best VF using cost kind: "
                     << (CM.CostKind == TTI::TCK_RecipThroughput
                             ? "Reciprocal Throughput\n"
@@ -7657,34 +7646,42 @@ VPRecipeBuilder::tryToCreateWidenNonPhiRecipe(VPSingleDefRecipe *R,
 // optimizations.
 static void printOptimizedVPlan(VPlan &) {}
 
-void LoopVectorizationPlanner::buildVPlansWithVPRecipes(ElementCount MinVF,
-                                                        ElementCount MaxVF) {
+void LoopVectorizationPlanner::buildVPlans(ElementCount MinVF,
+                                           ElementCount MaxVF) {
   if (ElementCount::isKnownGT(MinVF, MaxVF))
     return;
 
-  assert(OrigLoop->isInnermost() && "Inner loop expected.");
-
-  const LoopAccessInfo *LAI = Legal->getLAI();
-  LoopVersioning LVer(*LAI, LAI->getRuntimePointerChecking()->getChecks(),
-                      OrigLoop, LI, DT, PSE.getSE());
-  if (!LAI->getRuntimePointerChecking()->getChecks().empty() &&
-      !LAI->getRuntimePointerChecking()->getDiffChecks()) {
-    // Only use noalias metadata when using memory checks guaranteeing no
-    // overlap across all iterations.
-    LVer.prepareNoAliasMetadata();
+  bool IsInnerLoop = OrigLoop->isInnermost();
+
+  // Set up loop versioning for inner loops with memory runtime checks.
+  // Outer loops don't have LoopAccessInfo since canVectorizeMemory() is not
+  // called for them.
+  std::optional<LoopVersioning> LVer;
+  if (IsInnerLoop) {
+    const LoopAccessInfo *LAI = Legal->getLAI();
+    LVer.emplace(*LAI, LAI->getRuntimePointerChecking()->getChecks(), OrigLoop,
+                 LI, DT, PSE.getSE());
+    if (!LAI->getRuntimePointerChecking()->getChecks().empty() &&
+        !LAI->getRuntimePointerChecking()->getDiffChecks()) {
+      // Only use noalias metadata when using memory checks guaranteeing no
+      // overlap across all iterations.
+      LVer->prepareNoAliasMetadata();
+    }
   }
 
   // Create initial base VPlan0, to serve as common starting point for all
   // candidates built later for specific VF ranges.
   auto VPlan0 = VPlanTransforms::buildVPlan0(
       OrigLoop, *LI, Legal->getWidestInductionType(),
-      getDebugLocFromInstOrOperands(Legal->getPrimaryInduction()), PSE, &LVer);
+      getDebugLocFromInstOrOperands(Legal->getPrimaryInduction()), PSE,
+      LVer ? &*LVer : nullptr);
 
-  // Create recipes for header phis.
-  RUN_VPLAN_PASS(VPlanTransforms::createHeaderPhiRecipes, *VPlan0, PSE,
-                 *OrigLoop, Legal->getInductionVars(),
-                 Legal->getReductionVars(), Legal->getFixedOrderRecurrences(),
-                 CM.getInLoopReductions(), Hints.allowReordering());
+  // Create recipes for header phis. For outer loops, reductions, recurrences
+  // and in-loop reductions are empty since legality doesn't detect them.
+  RUN_VPLAN_PASS(VPlanTransforms::createHeaderPhiRecipes,
+      *VPlan0, PSE, *OrigLoop, Legal->getInductionVars(),
+      Legal->getReductionVars(), Legal->getFixedOrderRecurrences(),
+      CM.getInLoopReductions(), Hints.allowReordering());
 
   RUN_VPLAN_PASS(VPlanTransforms::simplifyRecipes, *VPlan0);
   // If we're vectorizing a loop with an uncountable exit, make sure that the
@@ -7707,40 +7704,59 @@ void LoopVectorizationPlanner::buildVPlansWithVPRecipes(ElementCount MinVF,
   RUN_VPLAN_PASS(VPlanTransforms::createLoopRegions, *VPlan0);
   if (CM.foldTailByMasking())
     RUN_VPLAN_PASS(VPlanTransforms::foldTailByMasking, *VPlan0);
-  RUN_VPLAN_PASS(VPlanTransforms::introduceMasksAndLinearize, *VPlan0);
+  // introduceMasksAndLinearize does not support nested loop regions yet.
+  if (IsInnerLoop)
+    RUN_VPLAN_PASS(VPlanTransforms::introduceMasksAndLinearize,
+                             *VPlan0);
 
   auto MaxVFTimes2 = MaxVF * 2;
   for (ElementCount VF = MinVF; ElementCount::isKnownLT(VF, MaxVFTimes2);) {
     VFRange SubRange = {VF, MaxVFTimes2};
-    if (auto Plan = tryToBuildVPlanWithVPRecipes(
-            std::unique_ptr<VPlan>(VPlan0->duplicate()), SubRange, &LVer)) {
-      // Now optimize the initial VPlan.
-      VPlanTransforms::hoistPredicatedLoads(*Plan, PSE, OrigLoop);
-      VPlanTransforms::sinkPredicatedStores(*Plan, PSE, OrigLoop);
-      RUN_VPLAN_PASS(VPlanTransforms::truncateToMinimalBitwidths, *Plan,
-                     CM.getMinimalBitwidths());
-      RUN_VPLAN_PASS(VPlanTransforms::optimize, *Plan);
-      // TODO: try to put addExplicitVectorLength close to addActiveLaneMask
-      if (CM.foldTailWithEVL()) {
-        RUN_VPLAN_PASS(VPlanTransforms::addExplicitVectorLength, *Plan,
-                       CM.getMaxSafeElements());
-        RUN_VPLAN_PASS(VPlanTransforms::optimizeEVLMasks, *Plan);
-      }
+    auto Plan = tryToBuildVPlanWithVPRecipes(
+        std::unique_ptr<VPlan>(VPlan0->duplicate()), SubRange,
+        LVer ? &*LVer : nullptr);
+    VF = SubRange.End;
 
-      if (auto P = VPlanTransforms::narrowInterleaveGroups(*Plan, TTI))
-        VPlans.push_back(std::move(P));
+    if (!Plan)
+      continue;
 
-      RUN_VPLAN_PASS_NO_VERIFY(printOptimizedVPlan, *Plan);
-      assert(verifyVPlanIsValid(*Plan) && "VPlan is invalid");
-      VPlans.push_back(std::move(Plan));
+    VPlanTransforms::hoistPredicatedLoads(*Plan, PSE, OrigLoop);
+    VPlanTransforms::sinkPredicatedStores(*Plan, PSE, OrigLoop);
+    RUN_VPLAN_PASS(VPlanTransforms::truncateToMinimalBitwidths, *Plan,
+                   CM.getMinimalBitwidths());
+    RUN_VPLAN_PASS(VPlanTransforms::optimize, *Plan);
+    // TODO: try to put addExplicitVectorLength close to addActiveLaneMask
+    if (CM.foldTailWithEVL()) {
+      RUN_VPLAN_PASS(VPlanTransforms::addExplicitVectorLength, *Plan,
+                     CM.getMaxSafeElements());
+      RUN_VPLAN_PASS(VPlanTransforms::optimizeEVLMasks, *Plan);
     }
-    VF = SubRange.End;
+
+    if (auto P = VPlanTransforms::narrowInterleaveGroups(*Plan, TTI))
+      VPlans.push_back(std::move(P));
+
+    RUN_VPLAN_PASS_NO_VERIFY(printOptimizedVPlan, *Plan);
+    assert(verifyVPlanIsValid(*Plan) && "VPlan is invalid");
+    VPlans.push_back(std::move(Plan));
   }
 }
 
 VPlanPtr LoopVectorizationPlanner::tryToBuildVPlanWithVPRecipes(
     VPlanPtr Plan, VFRange &Range, LoopVersioning *LVer) {
 
+  // For outer loops, the plan only needs basic recipe conversion and induction
+  // live-out optimization; the full inner-loop recipe building below does not
+  // apply (no widening decisions, interleave groups, reductions, etc.).
+  if (!OrigLoop->isInnermost()) {
+    for (ElementCount VF : Range)
+      Plan->addVF(VF);
+    if (!VPlanTransforms::tryToConvertVPInstructionsToVPRecipes(*Plan, *TLI))
+      return nullptr;
+    VPlanTransforms::optimizeInductionLiveOutUsers(*Plan, PSE,
+                                                   /*FoldTail=*/false);
+    return Plan;
+  }
+
   using namespace llvm::VPlanPatternMatch;
   SmallPtrSet<const InterleaveGroup<Instruction> *, 1> InterleaveGroups;
 
@@ -7973,46 +7989,6 @@ VPlanPtr LoopVectorizationPlanner::tryToBuildVPlanWithVPRecipes(
   return Plan;
 }
 
-VPlanPtr LoopVectorizationPlanner::tryToBuildVPlan(VFRange &Range) {
-  // Outer loop handling: They may require CFG and instruction level
-  // transformations before even evaluating whether vectorization is profitable.
-  // Since we cannot modify the incoming IR, we need to build VPlan upfront in
-  // the vectorization pipeline.
-  assert(!OrigLoop->isInnermost());
-  assert(EnableVPlanNativePath && "VPlan-native path is not enabled.");
-
-  auto Plan = VPlanTransforms::buildVPlan0(
-      OrigLoop, *LI, Legal->getWidestInductionType(),
-      getDebugLocFromInstOrOperands(Legal->getPrimaryInduction()), PSE);
-
-  VPlanTransform...
[truncated]

github-actions Bot commented Apr 19, 2026

✅ With the latest revision this PR passed the C/C++ code formatter.

@fhahn fhahn force-pushed the vplan-unify-native-path branch from 81b93f5 to bba732e Compare April 19, 2026 19:46
github-actions Bot commented Apr 19, 2026

🐧 Linux x64 Test Results

  • 195084 tests passed
  • 5189 tests skipped

✅ The build succeeded and all tests passed.

fhahn added a commit that referenced this pull request Apr 24, 2026
)

Reduce nesting by using early continue, split off from
#192868
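
The early-continue shape referred to here can be sketched in isolation. The snippet below is a simplified model of the loop in `buildVPlans` from the diff, not the LLVM code itself: `ToyRange`, `ToyPlan`, `tryToBuildToyPlan`, and `buildPlanVFs` are hypothetical, and the builder's behavior (failing for VF 1, clamping each range to one VF) is an assumption chosen to exercise both branches.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical half-open range of vectorization factors [Start, End).
struct ToyRange {
  unsigned Start;
  unsigned End;
};

// Hypothetical plan that only records the first VF it covers.
struct ToyPlan {
  unsigned FirstVF;
};

// Hypothetical builder: clamps the range to a single VF and fails for the
// scalar VF, so the caller sees both a null and a non-null result.
inline std::unique_ptr<ToyPlan> tryToBuildToyPlan(ToyRange &R) {
  R.End = R.Start * 2;
  if (R.Start == 1)
    return nullptr;
  return std::make_unique<ToyPlan>(ToyPlan{R.Start});
}

// Walk the power-of-two VFs in [MinVF, MaxVF] and collect the VFs for which
// a plan could be built.
inline std::vector<unsigned> buildPlanVFs(unsigned MinVF, unsigned MaxVF) {
  std::vector<unsigned> Built;
  if (MinVF > MaxVF)
    return Built;
  assert(MinVF != 0 && "VF must be non-zero");
  unsigned MaxVFTimes2 = MaxVF * 2;
  for (unsigned VF = MinVF; VF < MaxVFTimes2;) {
    ToyRange SubRange{VF, MaxVFTimes2};
    std::unique_ptr<ToyPlan> Plan = tryToBuildToyPlan(SubRange);
    // Advance first, so the early continue below cannot loop forever.
    VF = SubRange.End;
    if (!Plan)
      continue;
    Built.push_back(Plan->FirstVF);
  }
  return Built;
}
```

Moving `VF = SubRange.End` above the null check is what makes the early continue safe; with the update left at the bottom of the loop body, a failed build would never advance.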
llvm-sync Bot pushed a commit to arm/arm-toolchain that referenced this pull request Apr 24, 2026
…NFC). (#193979)

Reduce nesting by using early continue, split off from
llvm/llvm-project#192868
llvm-upstreamsync Bot pushed a commit to qualcomm/cpullvm-toolchain that referenced this pull request Apr 24, 2026
…NFC). (#193979)

Reduce nesting by using early continue, split off from
llvm/llvm-project#192868
artagnon (Contributor) left a comment

Thanks, the patch is much clearer now, and it LGTM! Excellent improvement!

cpullvm-upstream-sync Bot pushed a commit to navaneethshan/cpullvm-toolchain-1 that referenced this pull request Apr 24, 2026
…NFC). (#193979)

Reduce nesting by using early continue, split off from
llvm/llvm-project#192868
ayalz (Collaborator) left a comment

Post-approval comments.

// For outer loops, the plan only needs basic recipe conversion and induction
// live-out optimization; the full inner-loop recipe building below does not
// apply (no widening decisions, interleave groups, reductions, etc.).
if (!OrigLoop->isInnermost()) {
Collaborator

Once Plan is built, could it inform whether it models an outer or innermost loop, rather than continuing to rely on OrigLoop?

Contributor Author

Done by adding hasOuterLoop(), which checks whether the top-level loop region contains another loop region.
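
The idea of asking the plan itself whether it nests a loop can be illustrated on a toy region tree. This is only a sketch under assumptions: the real `hasOuterLoop()` is not shown in this conversation, and `Region`, `containsLoopRegion`, `hasNestedLoop`, and `makeLoopNest` below are hypothetical names, not VPlan classes.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical region node; IsLoop marks regions that model a loop.
struct Region {
  bool IsLoop = false;
  std::vector<std::unique_ptr<Region>> Children;
};

// Returns true if any region nested (at any depth) inside R is a loop.
inline bool containsLoopRegion(const Region &R) {
  for (const auto &Child : R.Children) {
    assert(Child && "regions must not hold null children");
    if (Child->IsLoop || containsLoopRegion(*Child))
      return true;
  }
  return false;
}

// True if the top-level loop region contains another loop region, i.e. the
// tree models an outer loop rather than an innermost one.
inline bool hasNestedLoop(const Region &TopLevelLoop) {
  return TopLevelLoop.IsLoop && containsLoopRegion(TopLevelLoop);
}

// Helper for examples: builds a perfect loop nest of the given depth
// (Depth == 1 is a single innermost loop).
inline std::unique_ptr<Region> makeLoopNest(unsigned Depth) {
  auto R = std::make_unique<Region>();
  R->IsLoop = true;
  if (Depth > 1)
    R->Children.push_back(makeLoopNest(Depth - 1));
  return R;
}
```

Querying the structure directly keeps later transforms independent of the original IR loop, which is the point of the reviewer's request.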

Comment on lines +2977 to +2981
// TODO: we could return a pair of values that specify the max VF and
// min VF, to be used in `buildVPlans(MinVF, MaxVF)` instead of
// `buildVPlans(VF, VF)`. We cannot do it because VPLAN at the moment
// doesn't have a cost model that can choose which plan to execute if
// more than one is generated.
Collaborator

This TODO retains the existing one, but going forward outer loops should compute their Max/VF using the methods that do so for innermost loops. Allowing only VF-agnostic decisions in outer loops should result in a single VPlan for the entire feasible range, from which the desired VF can be selected, even without (or before) cost model support.

Contributor Author

Agreed!

// `buildVPlans(VF, VF)`. We cannot do it because VPLAN at the moment
// doesn't have a cost model that can choose which plan to execute if
// more than one is generated.
static ElementCount determineVPlanVF(const TargetTransformInfo &TTI,
Collaborator

Suggested change
-static ElementCount determineVPlanVF(const TargetTransformInfo &TTI,
+static ElementCount computeVPlanOuterloopVF(const TargetTransformInfo &TTI,

This retains the existing version, but it is worth renaming it to be more accurate and consistent with the other compute*VF() functions.

Could this type-based-only factor serve the innermost Max VF computations too?

Contributor Author

Updated the name. Will check if we can re-use this for inner loops as well, but the logic there is quite a bit more involved.
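
For reference, the type-based heuristic under discussion divides the vector register width by the widest element type in the loop, and the caller rejects a scalar result. A minimal sketch with plain integers, assuming hypothetical names (`ToyVF`, `computeTypeBasedVF`, `isUsableVectorVF`) in place of `ElementCount` and the TTI queries:

```cpp
#include <cassert>

// Hypothetical stand-in for an element count: a known-minimum lane count
// plus a flag saying whether it is scaled by a runtime vscale.
struct ToyVF {
  unsigned MinLanes;
  bool Scalable;
};

inline bool isPowerOfTwo(unsigned N) { return N != 0 && (N & (N - 1)) == 0; }

// Lanes = register width in bits / widest element width in bits. For
// scalable registers the width is the known minimum, so the result is a
// minimum lane count.
inline ToyVF computeTypeBasedVF(unsigned RegisterBits, unsigned WidestTypeBits,
                                bool Scalable) {
  assert(WidestTypeBits != 0 && "element width must be non-zero");
  return ToyVF{RegisterBits / WidestTypeBits, Scalable};
}

// Mirrors the checks applied to the computed VF in the diff: it must be a
// power of two, and a scalar VF means no vector plan is built.
inline bool isUsableVectorVF(ToyVF VF) {
  return VF.MinLanes > 1 && isPowerOfTwo(VF.MinLanes);
}
```

For example, a 128-bit register with a widest type of 32 bits gives four lanes, while a 64-bit register with 64-bit elements gives a scalar VF that is rejected.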

Comment on lines +3000 to +3026
ElementCount VF = UserVF;
if (VF.isZero()) {
  VF = determineVPlanVF(TTI, Config);
  LLVM_DEBUG(dbgs() << "LV: VPlan computed VF " << VF << ".\n");

  // Make sure we have a VF > 1 for stress testing.
  if (VPlanBuildStressTest && VF.isScalar()) {
    LLVM_DEBUG(dbgs() << "LV: VPlan stress testing: "
                      << "overriding computed VF.\n");
    VF = ElementCount::getFixed(4);
  }
} else if (VF.isScalable() && !Config.supportsScalableVectors()) {
  reportVectorizationFailure(
      "Scalable vectorization requested but not supported by the target",
      "the scalable user-specified vectorization width for outer-loop "
      "vectorization cannot be used because the target does not support "
      "scalable vectors.",
      "ScalableVFUnfeasible", ORE, TheLoop);
  return FixedScalableVFPair::getNone();
}
assert(isPowerOf2_32(VF.getKnownMinValue()) &&
       "VF needs to be a power of two");
if (VF.isScalar())
  return FixedScalableVFPair::getNone();
LLVM_DEBUG(dbgs() << "LV: Using " << (!UserVF.isZero() ? "user " : "")
                  << "VF " << VF << " to build VPlans.\n");
return FixedScalableVFPair(VF);
Collaborator

This retains the current code, but worth considering folding it all under computeVPlanOuterloopVF().

Contributor Author

Done, also moved to VFSelectionContext, thanks

Comment on lines +5844 to +5845
// For outer loops, computeMaxVF returns a single non-scalar VF; build a
// plan for only that VF.
Collaborator

Could the current outer-loop behavior be made consistent with the UserVF/UserIC-dictated innermost behavior, with both resulting in a single plan? E.g., should computeMaxVF() also indicate whether it returned a single MaxVF (scalable or not) that is also the MinVF?

Contributor Author

I'll leave this separate in the initial patch, as there's more logic we need to run when selecting the user VF for inner loops

@@ -6153,6 +6139,15 @@ LoopVectorizationPlanner::computeBestVF() {
return {VectorizationFactor::Disabled(), nullptr};
// If there is a single VPlan with a single VF, return it directly.
Collaborator

It may be good to return to the above behavior: if no plans exist, return null; if only one exists, assert either UserVF or outer loop, and return it. (If two, check for UserVF and EpilogUserVF, etc.)

Contributor Author

Merged, thanks


// For outer loops, the plan has a single vector VF determined by the
// heuristic.
if (!OrigLoop->isInnermost()) {
Collaborator

Better ask FirstPlan if it models an outerloop, than OrigLoop.

Contributor Author

Merged, thanks

if (!OrigLoop->isInnermost()) {
  for (ElementCount VF : Range)
    Plan->addVF(VF);
  if (!VPlanTransforms::tryToConvertVPInstructionsToVPRecipes(*Plan, *TLI))
Collaborator

Plan will be discarded later, automatically?

Contributor Author

Yep


// Analyze interleaved memory accesses.
if (UseInterleaved)
if (UseInterleaved && IsInnerLoop)
Collaborator

Should IsInnerLoop be folded into UseInterleaved?
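As a rough illustration of what folding the check could look like, the flag can be computed once so that later uses need no extra guard. `ToyLoopInfo` and `computeUseInterleaved` are hypothetical names, and the parameter `EnableInterleavedMemAccesses` is a stand-in for however the flag is actually derived.

```cpp
// Hypothetical summary of the loop being vectorized.
struct ToyLoopInfo {
  bool IsInnermost;
};

// Fold the inner-loop requirement into the flag itself, so call sites
// can test the flag alone instead of `UseInterleaved && IsInnerLoop`.
bool computeUseInterleaved(bool EnableInterleavedMemAccesses,
                           const ToyLoopInfo &L) {
  return EnableInterleavedMemAccesses && L.IsInnermost;
}
```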

Contributor Author

Done, thanks.

if (ORE->allowExtraAnalysis(LV_NAME))
// For VPlan build stress testing of outer loops, bail after plan
// construction.
if (!IsInnerLoop && VPlanBuildStressTest)
Collaborator

Should !IsInnerLoop be folded into VPlanBuildStressTest, which would then be better renamed VPlanBuildOuterloopStressTest?

Contributor Author

Unfortunately, given that VPlanBuildStressTest is an option, I don't think there's a good way to set this in a single place.

@fhahn fhahn force-pushed the vplan-unify-native-path branch from efc0a5c to 47e9f9b Compare April 28, 2026 09:25
yingopq pushed a commit to yingopq/llvm-project that referenced this pull request Apr 29, 2026
KHicketts pushed a commit to KHicketts/llvm-project that referenced this pull request Apr 30, 2026
@fhahn fhahn merged commit b84f58e into llvm:main May 8, 2026
9 of 10 checks passed
@fhahn fhahn deleted the vplan-unify-native-path branch May 8, 2026 19:37
@llvm-ci

llvm-ci commented May 8, 2026

LLVM Buildbot has detected a new failure on builder llvm-clang-x86_64-expensive-checks-ubuntu running on as-builder-4 while building llvm at step 7 "test-check-all".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/187/builds/19953

Here is the relevant piece of the build log for the reference
Step 7 (test-check-all) failure: Test just built components: check-all completed (failure)
******************** TEST 'LLVM :: Transforms/LoopVectorize/AArch64/outer_loop_prefer_scalable.ll' FAILED ********************
Exit Code: 2

Command Output (stdout):
--
# RUN: at line 2
/home/buildbot/worker/as-builder-4/ramdisk/expensive-checks/build/bin/opt -S -mtriple aarch64 -mattr=+sve -passes=loop-vectorize -enable-vplan-native-path < /home/buildbot/worker/as-builder-4/ramdisk/expensive-checks/llvm-project/llvm/test/Transforms/LoopVectorize/AArch64/outer_loop_prefer_scalable.ll | /home/buildbot/worker/as-builder-4/ramdisk/expensive-checks/build/bin/FileCheck /home/buildbot/worker/as-builder-4/ramdisk/expensive-checks/llvm-project/llvm/test/Transforms/LoopVectorize/AArch64/outer_loop_prefer_scalable.ll
# executed command: /home/buildbot/worker/as-builder-4/ramdisk/expensive-checks/build/bin/opt -S -mtriple aarch64 -mattr=+sve -passes=loop-vectorize -enable-vplan-native-path
# .---command stderr------------
# | PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace and instructions to reproduce the bug.
# | Stack dump:
# | 0.	Program arguments: /home/buildbot/worker/as-builder-4/ramdisk/expensive-checks/build/bin/opt -S -mtriple aarch64 -mattr=+sve -passes=loop-vectorize -enable-vplan-native-path
# | 1.	Running pass "function(loop-vectorize<no-interleave-forced-only;no-vectorize-forced-only;>)" on module "<stdin>"
# | 2.	Running pass "loop-vectorize<no-interleave-forced-only;no-vectorize-forced-only;>" on function "foo"
# |   #0 0x0000641370f14e58 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (/home/buildbot/worker/as-builder-4/ramdisk/expensive-checks/build/bin/opt+0x5251e58)
# |   #1 0x0000641370f11c71 llvm::sys::RunSignalHandlers() (/home/buildbot/worker/as-builder-4/ramdisk/expensive-checks/build/bin/opt+0x524ec71)
# |   #2 0x0000641370f15ff1 SignalHandler(int, siginfo_t*, void*) Signals.cpp:0:0
# |   #3 0x00007ea3e8a45330 (/lib/x86_64-linux-gnu/libc.so.6+0x45330)
# |   #4 0x0000641372645b61 llvm::VPTypeAnalysis::inferScalarTypeForRecipe(llvm::VPInstruction const*) (/home/buildbot/worker/as-builder-4/ramdisk/expensive-checks/build/bin/opt+0x6982b61)
# |   #5 0x0000641372645960 llvm::VPTypeAnalysis::inferScalarType(llvm::VPValue const*) (/home/buildbot/worker/as-builder-4/ramdisk/expensive-checks/build/bin/opt+0x6982960)
# |   #6 0x0000641372645b9e llvm::VPTypeAnalysis::inferScalarTypeForRecipe(llvm::VPInstruction const*) (/home/buildbot/worker/as-builder-4/ramdisk/expensive-checks/build/bin/opt+0x6982b9e)
# |   #7 0x0000641372645960 llvm::VPTypeAnalysis::inferScalarType(llvm::VPValue const*) (/home/buildbot/worker/as-builder-4/ramdisk/expensive-checks/build/bin/opt+0x6982960)
# |   #8 0x0000641372645960 llvm::VPTypeAnalysis::inferScalarType(llvm::VPValue const*) (/home/buildbot/worker/as-builder-4/ramdisk/expensive-checks/build/bin/opt+0x6982960)
# |   #9 0x0000641372645b9e llvm::VPTypeAnalysis::inferScalarTypeForRecipe(llvm::VPInstruction const*) (/home/buildbot/worker/as-builder-4/ramdisk/expensive-checks/build/bin/opt+0x6982b9e)
# |  #10 0x0000641372645960 llvm::VPTypeAnalysis::inferScalarType(llvm::VPValue const*) (/home/buildbot/worker/as-builder-4/ramdisk/expensive-checks/build/bin/opt+0x6982960)
# |  #11 0x0000641372645960 llvm::VPTypeAnalysis::inferScalarType(llvm::VPValue const*) (/home/buildbot/worker/as-builder-4/ramdisk/expensive-checks/build/bin/opt+0x6982960)
# |  #12 0x0000641372645b9e llvm::VPTypeAnalysis::inferScalarTypeForRecipe(llvm::VPInstruction const*) (/home/buildbot/worker/as-builder-4/ramdisk/expensive-checks/build/bin/opt+0x6982b9e)
# |  #13 0x0000641372645960 llvm::VPTypeAnalysis::inferScalarType(llvm::VPValue const*) (/home/buildbot/worker/as-builder-4/ramdisk/expensive-checks/build/bin/opt+0x6982960)
# |  #14 0x0000641372645960 llvm::VPTypeAnalysis::inferScalarType(llvm::VPValue const*) (/home/buildbot/worker/as-builder-4/ramdisk/expensive-checks/build/bin/opt+0x6982960)
# |  #15 0x0000641372645b9e llvm::VPTypeAnalysis::inferScalarTypeForRecipe(llvm::VPInstruction const*) (/home/buildbot/worker/as-builder-4/ramdisk/expensive-checks/build/bin/opt+0x6982b9e)
# |  #16 0x0000641372645960 llvm::VPTypeAnalysis::inferScalarType(llvm::VPValue const*) (/home/buildbot/worker/as-builder-4/ramdisk/expensive-checks/build/bin/opt+0x6982960)
# |  #17 0x0000641372645960 llvm::VPTypeAnalysis::inferScalarType(llvm::VPValue const*) (/home/buildbot/worker/as-builder-4/ramdisk/expensive-checks/build/bin/opt+0x6982960)
# |  #18 0x0000641372645b9e llvm::VPTypeAnalysis::inferScalarTypeForRecipe(llvm::VPInstruction const*) (/home/buildbot/worker/as-builder-4/ramdisk/expensive-checks/build/bin/opt+0x6982b9e)
# |  #19 0x0000641372645960 llvm::VPTypeAnalysis::inferScalarType(llvm::VPValue const*) (/home/buildbot/worker/as-builder-4/ramdisk/expensive-checks/build/bin/opt+0x6982960)
# |  #20 0x0000641372645960 llvm::VPTypeAnalysis::inferScalarType(llvm::VPValue const*) (/home/buildbot/worker/as-builder-4/ramdisk/expensive-checks/build/bin/opt+0x6982960)
# |  #21 0x0000641372645b9e llvm::VPTypeAnalysis::inferScalarTypeForRecipe(llvm::VPInstruction const*) (/home/buildbot/worker/as-builder-4/ramdisk/expensive-checks/build/bin/opt+0x6982b9e)
# |  #22 0x0000641372645960 llvm::VPTypeAnalysis::inferScalarType(llvm::VPValue const*) (/home/buildbot/worker/as-builder-4/ramdisk/expensive-checks/build/bin/opt+0x6982960)
# |  #23 0x0000641372645960 llvm::VPTypeAnalysis::inferScalarType(llvm::VPValue const*) (/home/buildbot/worker/as-builder-4/ramdisk/expensive-checks/build/bin/opt+0x6982960)
# |  #24 0x0000641372645b9e llvm::VPTypeAnalysis::inferScalarTypeForRecipe(llvm::VPInstruction const*) (/home/buildbot/worker/as-builder-4/ramdisk/expensive-checks/build/bin/opt+0x6982b9e)
# |  #25 0x0000641372645960 llvm::VPTypeAnalysis::inferScalarType(llvm::VPValue const*) (/home/buildbot/worker/as-builder-4/ramdisk/expensive-checks/build/bin/opt+0x6982960)
# |  #26 0x0000641372645960 llvm::VPTypeAnalysis::inferScalarType(llvm::VPValue const*) (/home/buildbot/worker/as-builder-4/ramdisk/expensive-checks/build/bin/opt+0x6982960)
# |  #27 0x0000641372645b9e llvm::VPTypeAnalysis::inferScalarTypeForRecipe(llvm::VPInstruction const*) (/home/buildbot/worker/as-builder-4/ramdisk/expensive-checks/build/bin/opt+0x6982b9e)
# |  #28 0x0000641372645960 llvm::VPTypeAnalysis::inferScalarType(llvm::VPValue const*) (/home/buildbot/worker/as-builder-4/ramdisk/expensive-checks/build/bin/opt+0x6982960)
# |  #29 0x0000641372645960 llvm::VPTypeAnalysis::inferScalarType(llvm::VPValue const*) (/home/buildbot/worker/as-builder-4/ramdisk/expensive-checks/build/bin/opt+0x6982960)
# |  #30 0x0000641372645b9e llvm::VPTypeAnalysis::inferScalarTypeForRecipe(llvm::VPInstruction const*) (/home/buildbot/worker/as-builder-4/ramdisk/expensive-checks/build/bin/opt+0x6982b9e)
# |  #31 0x0000641372645960 llvm::VPTypeAnalysis::inferScalarType(llvm::VPValue const*) (/home/buildbot/worker/as-builder-4/ramdisk/expensive-checks/build/bin/opt+0x6982960)
# |  #32 0x0000641372645960 llvm::VPTypeAnalysis::inferScalarType(llvm::VPValue const*) (/home/buildbot/worker/as-builder-4/ramdisk/expensive-checks/build/bin/opt+0x6982960)
# |  #33 0x0000641372645b9e llvm::VPTypeAnalysis::inferScalarTypeForRecipe(llvm::VPInstruction const*) (/home/buildbot/worker/as-builder-4/ramdisk/expensive-checks/build/bin/opt+0x6982b9e)
# |  #34 0x0000641372645960 llvm::VPTypeAnalysis::inferScalarType(llvm::VPValue const*) (/home/buildbot/worker/as-builder-4/ramdisk/expensive-checks/build/bin/opt+0x6982960)
# |  #35 0x0000641372645960 llvm::VPTypeAnalysis::inferScalarType(llvm::VPValue const*) (/home/buildbot/worker/as-builder-4/ramdisk/expensive-checks/build/bin/opt+0x6982960)
...

llvm-sync Bot pushed a commit to arm/arm-toolchain that referenced this pull request May 8, 2026
llvm-upstreamsync Bot pushed a commit to qualcomm/cpullvm-toolchain that referenced this pull request May 8, 2026
fhahn added a commit that referenced this pull request May 8, 2026
…6634)

For phis check if any of the operands are VPIRValues or we already have
cached types. If so, return them.

This fixes a verification stack overflow in the VPlan outer loop path
after #192868.
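The idea in that follow-up commit message can be illustrated with a toy model. This is a sketch under stated assumptions rather than the actual VPTypeAnalysis code: `ToyValue` and `ToyTypeInference` are hypothetical, `LiveIn` values stand in for operands whose type is known directly (the role the commit message gives to VPIRValues), and types are plain strings. The phi's first operand is assumed to be able to depend on the phi itself, which is one way unbounded mutual recursion could arise.

```cpp
#include <string>
#include <unordered_map>
#include <vector>

// Toy value graph; hypothetical, not the VPlan classes.
struct ToyValue {
  enum class Kind { LiveIn, Phi, Op };
  Kind K;
  std::string LiveInType;                 // only meaningful for LiveIn
  std::vector<const ToyValue *> Operands; // empty for LiveIn
};

class ToyTypeInference {
  std::unordered_map<const ToyValue *, std::string> Cache;

  // For phis, prefer an operand whose type is available without
  // recursion: a live-in value or an operand with a cached type.
  std::string inferPhi(const ToyValue *P) {
    for (const ToyValue *Op : P->Operands) {
      if (Op->K == ToyValue::Kind::LiveIn)
        return Op->LiveInType;
      auto It = Cache.find(Op);
      if (It != Cache.end())
        return It->second;
    }
    // Fallback for this sketch only: recurse into the first operand.
    // This would not terminate for a cycle with no resolvable operand.
    return infer(P->Operands.front());
  }

public:
  std::string infer(const ToyValue *V) {
    auto It = Cache.find(V);
    if (It != Cache.end())
      return It->second;
    std::string Ty;
    switch (V->K) {
    case ToyValue::Kind::LiveIn:
      Ty = V->LiveInType;
      break;
    case ToyValue::Kind::Phi:
      Ty = inferPhi(V);
      break;
    case ToyValue::Kind::Op:
      // Toy rule: an operation has the type of its first operand.
      Ty = infer(V->Operands.front());
      break;
    }
    Cache[V] = Ty;
    return Ty;
  }
};
```

With a phi whose operands are `{Add, Start}` and an `Add` that uses the phi, always recursing into the first operand would bounce between the two forever; checking for a directly typed or cached operand first breaks the cycle.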
llvm-upstreamsync Bot pushed a commit to qualcomm/cpullvm-toolchain that referenced this pull request May 8, 2026
…plans. (#196634)

For phis check if any of the operands are VPIRValues or we already have
cached types. If so, return them.

This fixes a verification stack overflow in the VPlan outer loop path
after llvm/llvm-project#192868.
cpullvm-upstream-sync Bot pushed a commit to navaneethshan/cpullvm-toolchain-1 that referenced this pull request May 8, 2026
cpullvm-upstream-sync Bot pushed a commit to navaneethshan/cpullvm-toolchain-1 that referenced this pull request May 8, 2026
llvm-sync Bot pushed a commit to arm/arm-toolchain that referenced this pull request May 8, 2026