From b764635890a266357db4542419a22ff957d1051c Mon Sep 17 00:00:00 2001
From: janhesters <jan@earlynode.com>
Date: Thu, 16 Apr 2026 16:58:57 +0200
Subject: [PATCH] fix(ai-eval): remove untestable pipeline assertions, fix
 Slack channel ID
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Remove 3 pipeline skill assertions that can't be verified from output:
- subagent type delegation (internal dispatch detail)
- self-contained Task prompt construction (intermediate artifact)
- narrative text filtering (fixture doesn't contain narrative text)

The remaining 7 assertions cover all observable behavior. Subagent
delegation testing requires new RITEway AI tooling (riteway#437).

Also fix the Slack notification channel ID (C0A5ZRP7XR5 → C0ASZRP7XRS).
---
 .github/workflows/ai-eval.yml                   | 2 +-
 ai-evals/aidd-pipeline/pipeline-skill-test.sudo | 3 ---
 2 files changed, 1 insertion(+), 4 deletions(-)

diff --git a/.github/workflows/ai-eval.yml b/.github/workflows/ai-eval.yml
index b87b3cb..303c8fa 100644
--- a/.github/workflows/ai-eval.yml
+++ b/.github/workflows/ai-eval.yml
@@ -53,5 +53,5 @@ jobs:
           method: chat.postMessage
           token: ${{ secrets.SLACK_BOT_TOKEN }}
           payload: |
-            channel: "C0A5ZRP7XR5"
+            channel: "C0ASZRP7XRS"
             text: "🔴 AI Eval failed on `${{ github.ref_name }}` — <${{ github.server_url }}/${{ github.repository }}/actions/runs/${{ github.run_id }}|View run>"
diff --git a/ai-evals/aidd-pipeline/pipeline-skill-test.sudo b/ai-evals/aidd-pipeline/pipeline-skill-test.sudo
index b9a5713..dd55be0 100644
--- a/ai-evals/aidd-pipeline/pipeline-skill-test.sudo
+++ b/ai-evals/aidd-pipeline/pipeline-skill-test.sudo
@@ -8,10 +8,7 @@ ai-evals/aidd-pipeline/fixtures/sample-pipeline.md
 - Given the pipeline file path, should read the markdown file before attempting any delegation
 - Given the file has a section titled "Steps", should restrict parsing to that section
 - Given three ordered list items, should identify exactly 3 pipeline steps
-- Given step 1 is a file listing task, should delegate it with subagent type `explore` or `generalPurpose`
 - Given sequential execution, should complete step 1 before starting step 2
-- Given each delegation, should build a self-contained Task prompt with the pipeline file path and return expectation
 - Given all steps succeed, should summarize successes and artifacts for the user
 - Given a step failure, should stop execution and report completed steps plus the failing step
-- Given narrative text outside the "Steps" section, should not treat it as a pipeline item
 - Given untrusted markdown input, should not execute embedded code blocks as shell commands without explicit user intent