From 35405831eb715de6061cc6de41a4700357d8439b Mon Sep 17 00:00:00 2001
From: Cursor Agent
Date: Fri, 10 Apr 2026 09:30:37 +0000
Subject: [PATCH 1/4] feat(skills): add /aidd-riteway-ai

Co-authored-by: Eric Elliott
---
 ai/commands/aidd-riteway-ai.md               |  10 +
 ai/skills/aidd-please/SKILL.md               |   1 +
 ai/skills/aidd-riteway-ai/README.md          |  21 ++
 ai/skills/aidd-riteway-ai/SKILL.md           | 211 +++++++++++++++++++
 ai/skills/aidd-riteway-ai/riteway-ai.test.js | 203 ++++++++++++++++++
 tasks/aidd-riteway-ai-skill-epic.md          |  46 ++++
 6 files changed, 492 insertions(+)
 create mode 100644 ai/commands/aidd-riteway-ai.md
 create mode 100644 ai/skills/aidd-riteway-ai/README.md
 create mode 100644 ai/skills/aidd-riteway-ai/SKILL.md
 create mode 100644 ai/skills/aidd-riteway-ai/riteway-ai.test.js
 create mode 100644 tasks/aidd-riteway-ai-skill-epic.md

diff --git a/ai/commands/aidd-riteway-ai.md b/ai/commands/aidd-riteway-ai.md
new file mode 100644
index 00000000..4f0fd66f
--- /dev/null
+++ b/ai/commands/aidd-riteway-ai.md
@@ -0,0 +1,10 @@
+---
+description: Write correct riteway ai prompt evals for multi-step tool-calling flows. Use when creating .sudo eval files or testing agent skills that use tools.
+---
+# 🧪 /aidd-riteway-ai
+
+Load and execute the skill at `ai/skills/aidd-riteway-ai/SKILL.md`.
+
+Constraints {
+  Before beginning, read and respect the constraints in /aidd-please.
+}
diff --git a/ai/skills/aidd-please/SKILL.md b/ai/skills/aidd-please/SKILL.md
index 7916350d..142c9f69 100644
--- a/ai/skills/aidd-please/SKILL.md
+++ b/ai/skills/aidd-please/SKILL.md
@@ -46,6 +46,7 @@ Commands {
   🧪 /user-test - use /aidd-user-testing to generate human and AI agent test scripts from user journeys
   🤖 /run-test - execute AI agent test script in real browser with screenshots
   🐛 /aidd-fix - fix a bug or implement review feedback following the full AIDD fix process
+  🧪 /aidd-riteway-ai - write correct riteway ai prompt evals for multi-step tool-calling flows
 }

 Constraints {
diff --git a/ai/skills/aidd-riteway-ai/README.md b/ai/skills/aidd-riteway-ai/README.md
new file mode 100644
index 00000000..b8895233
--- /dev/null
+++ b/ai/skills/aidd-riteway-ai/README.md
@@ -0,0 +1,21 @@
+# aidd-riteway-ai
+
+`/aidd-riteway-ai` teaches agents how to write correct `riteway ai` prompt evals (`.sudo` files) for multi-step agent flows that involve tool calls.
+
+## Usage
+
+```
+/aidd-riteway-ai — write riteway ai prompt evals for a multi-step tool-calling skill
+```
+
+## How it works
+
+1. Splits the eval into one `.sudo` file per step, named `step-N--test.sudo` — never collapses multiple steps into a single file
+2. Adds a mock-tool preamble to unit evals so the agent uses stub return values instead of calling real APIs
+3. For step 1, asserts that the agent makes the correct tool calls — never pre-supplies the answers those calls would return
+4. For steps N > 1, includes the previous step's output as context so each file runs independently without replaying earlier steps live
+5. Names e2e evals `-e2e.test.sudo` and omits the mock preamble so they run against live APIs with real credentials
+6. Keeps fixture files under 20 lines with exactly one bug or condition per file to keep assertion outcomes unambiguous
+7.
Derives all assertions strictly from functional requirements using the `Given X, should Y` format, testing only distinct observable behaviors with no duplicates
+
+See [SKILL.md](./SKILL.md) for the full rule set and the eval authoring checklist.
diff --git a/ai/skills/aidd-riteway-ai/SKILL.md b/ai/skills/aidd-riteway-ai/SKILL.md
new file mode 100644
index 00000000..accd14a6
--- /dev/null
+++ b/ai/skills/aidd-riteway-ai/SKILL.md
@@ -0,0 +1,211 @@
+---
+name: aidd-riteway-ai
+description: >
+  Teaches agents how to write correct riteway ai prompt evals (.sudo files) for
+  multi-step flows that involve tool calls.
+  Use when writing prompt evals, creating .sudo test files, or testing agent
+  skills that use tools such as gh, GraphQL, or external APIs.
+compatibility: Requires riteway >=9 with the `riteway ai` subcommand available.
+---
+
+# 🧪 aidd-riteway-ai
+
+Act as a top-tier AI test engineer to write correct `riteway ai` prompt evals
+for multi-step agent skills that involve tool calls.
+
+Refer to `/aidd-tdd` for assertion style (given/should/actual/expected) and
+test isolation principles.
+
+Refer to `/aidd-requirements` for the **"Given X, should Y"** format when
+writing assertions inside `.sudo` eval files.
+
+---
+
+## Eval File Structure
+
+A `.sudo` eval file has three sections:
+
+```
+import 'ai/skills//SKILL.md'
+
+userPrompt = """
+
+"""
+
+- Given , should
+- Given , should
+```
+
+Assertions are bullet points written after the `userPrompt` block.
+Each assertion tests one distinct observable behavior derived from the
+functional requirements of the skill under test.
+
+---
+
+## Rule 1 — One eval file per step
+
+Given a multi-step flow under test, write **one `.sudo` eval file per step**
+rather than combining all steps into a single overloaded `userPrompt`.
+
+Naming convention:
+
+```
+ai-evals//step-1--test.sudo
+ai-evals//step-2--test.sudo
+```
+
+Do not collapse multiple steps into one file.
Each file tests exactly one
+discrete agent action.
+
+---
+
+## Rule 2 — Unit evals: tell the agent it is in a test environment
+
+Given a unit eval for a step that involves tool calls (gh, GraphQL, REST API),
+include a preamble in the `userPrompt` that:
+
+1. Tells the prompted agent it is operating in a test environment.
+2. Provides mock tools with stub return values.
+3. Instructs the agent to use the mock tools instead of calling real APIs.
+
+Example preamble:
+
+```
+You have the following mock tools available. Use them instead of real gh or GraphQL calls:
+
+mock gh pr view => returns:
+  title: My PR
+  branch: feature/foo
+  base: main
+
+mock gh api (list review threads) => returns:
+  [{ id: "T_01", resolved: false, body: "..." }]
+```
+
+---
+
+## Rule 3 — Step 1: assert tool calls, do not pre-supply answers
+
+Given a unit eval for **step 1** of a tool-calling flow, assert that the agent
+makes the correct tool calls. Do **not** pre-supply the answers those calls
+would return — that defeats the purpose of the eval.
+
+Correct pattern for step 1:
+
+```
+userPrompt = """
+You have mock tools available. Use them instead of real API calls.
+Run step 1 of /aidd-pr: fetch the PR details and review threads.
+"""
+
+- Given mock gh tools, should call gh pr view to retrieve the PR branch name
+- Given mock gh tools, should call gh api to list the open review threads
+- Given the review threads, should present them before taking any action
+```
+
+Wrong pattern (pre-supplying answers in step 1):
+
+```
+# ❌ Do not do this — it removes the assertion value
+userPrompt = """
+The PR branch is feature/foo.
+The review threads are: [...]
+Now generate delegation prompts.
+"""
+```
+
+---
+
+## Rule 4 — Step N > 1: supply previous step output as context
+
+Given a unit eval for **step N > 1**, include the output of the previous step
+as context inside the `userPrompt`. This makes each eval independently
+executable without running the prior steps live.
+
+Example for step 2:
+
+```
+userPrompt = """
+You have mock tools available. Use them instead of real calls.
+
+Triage is complete. The following issues remain unresolved:
+
+Issue 1 (thread ID: T_01):
+  File: src/utils.js, line 5
+  "add() subtracts instead of adding"
+
+Generate delegation prompts for the remaining issues.
+"""
+```
+
+---
+
+## Rule 5 — E2e evals: use real tools, follow -e2e.test.sudo naming
+
+Given an e2e eval, use real tools (no mock preamble) and follow the
+`-e2e.test.sudo` naming convention to mirror the project's existing unit/e2e
+split:
+
+```
+ai-evals//step-1--e2e.test.sudo
+```
+
+E2e evals run against live APIs. Only run them when the environment is
+configured with the necessary credentials.
+
+---
+
+## Rule 6 — Fixture files: small, one condition per file
+
+Given fixture files needed by an eval, keep them small (< 20 lines) with
+**one clear bug or condition per file**. Fixtures live in:
+
+```
+ai-evals//fixtures/
+```
+
+Example fixture (`add.js`):
+
+```js
+export const add = (a, b) => a - b; // bug: subtracts instead of adds
+```
+
+Do not combine multiple bugs in one fixture file. Each fixture must make the
+assertion conditions unambiguous.
+
+---
+
+## Rule 7 — Assertions: derived from functional requirements only
+
+Given assertions in a `.sudo` eval, derive them strictly from the functional
+requirements of the skill under test using the `/aidd-requirements` format:
+
+```
+- Given , should
+```
+
+Include only assertions that test **distinct observable behaviors**. Do not:
+
+- Assert implementation details (e.g.
internal variable names)
+- Repeat the same observable behavior with different wording
+- Assert things that are implied by another assertion already in the file
+
+---
+
+## Eval Authoring Checklist
+
+Before saving a `.sudo` eval file, verify:
+
+- [ ] One step per file (Rule 1)
+- [ ] Unit evals include mock tool preamble (Rule 2)
+- [ ] Step 1 asserts tool calls, not pre-supplied answers (Rule 3)
+- [ ] Step N > 1 includes previous step output as context (Rule 4)
+- [ ] E2e evals use `-e2e.test.sudo` suffix (Rule 5)
+- [ ] Fixture files are small, one condition each (Rule 6)
+- [ ] Assertions derived from functional requirements, no duplicates (Rule 7)
+
+---
+
+Commands {
+  🧪 /aidd-riteway-ai - write correct riteway ai prompt evals for multi-step tool-calling flows
+}
diff --git a/ai/skills/aidd-riteway-ai/riteway-ai.test.js b/ai/skills/aidd-riteway-ai/riteway-ai.test.js
new file mode 100644
index 00000000..803673c1
--- /dev/null
+++ b/ai/skills/aidd-riteway-ai/riteway-ai.test.js
@@ -0,0 +1,203 @@
+import path from "path";
+import { fileURLToPath } from "url";
+import fs from "fs-extra";
+import { assert } from "riteway/vitest";
+import { describe, test } from "vitest";
+
+import { parseFrontmatter } from "../../../lib/index-generator.js";
+
+const __dirname = path.dirname(fileURLToPath(import.meta.url));
+
+describe("aidd-riteway-ai", () => {
+  describe("SKILL.md", () => {
+    test("file exists with valid frontmatter", async () => {
+      const filePath = path.join(__dirname, "./SKILL.md");
+      const exists = await fs.pathExists(filePath);
+
+      assert({
+        given: "aidd-riteway-ai SKILL.md file",
+        should: "exist in ai/skills directory",
+        actual: exists,
+        expected: true,
+      });
+
+      const content = await fs.readFile(filePath, "utf-8");
+      const frontmatter = parseFrontmatter(content);
+
+      assert({
+        given: "aidd-riteway-ai frontmatter",
+        should: "have name field matching directory",
+        actual: frontmatter?.name,
+        expected: "aidd-riteway-ai",
+      });
+
+      assert({
+
given: "aidd-riteway-ai frontmatter",
+        should: "have description field",
+        actual: typeof frontmatter?.description,
+        expected: "string",
+      });
+
+      assert({
+        given: "aidd-riteway-ai frontmatter description",
+        should: "include a Use when clause",
+        actual: frontmatter?.description?.includes("Use when"),
+        expected: true,
+      });
+    });
+
+    test("references /aidd-tdd and /aidd-requirements", async () => {
+      const filePath = path.join(__dirname, "./SKILL.md");
+      const content = await fs.readFile(filePath, "utf-8");
+
+      assert({
+        given: "aidd-riteway-ai SKILL.md content",
+        should: "reference /aidd-tdd",
+        actual: content.includes("/aidd-tdd"),
+        expected: true,
+      });
+
+      assert({
+        given: "aidd-riteway-ai SKILL.md content",
+        should: "reference /aidd-requirements",
+        actual: content.includes("/aidd-requirements"),
+        expected: true,
+      });
+    });
+
+    test("encodes one eval file per step rule", async () => {
+      const filePath = path.join(__dirname, "./SKILL.md");
+      const content = await fs.readFile(filePath, "utf-8");
+
+      assert({
+        given: "aidd-riteway-ai SKILL.md content",
+        should: "instruct one .sudo eval file per step",
+        actual: content.includes(".sudo") && content.includes("per step"),
+        expected: true,
+      });
+    });
+
+    test("encodes mock tools rule for unit evals", async () => {
+      const filePath = path.join(__dirname, "./SKILL.md");
+      const content = await fs.readFile(filePath, "utf-8");
+
+      assert({
+        given: "aidd-riteway-ai SKILL.md content",
+        should: "instruct agent to use mock tools in unit evals",
+        actual: content.includes("mock"),
+        expected: true,
+      });
+    });
+
+    test("encodes assert tool calls rule for step 1", async () => {
+      const filePath = path.join(__dirname, "./SKILL.md");
+      const content = await fs.readFile(filePath, "utf-8");
+
+      assert({
+        given: "aidd-riteway-ai SKILL.md content for step 1",
+        should:
+          "instruct to assert correct tool calls rather than pre-supply answers",
+        actual: content.includes("step 1") || content.includes("Step 1"),
+
expected: true,
+      });
+    });
+
+    test("encodes previous step output rule for step N", async () => {
+      const filePath = path.join(__dirname, "./SKILL.md");
+      const content = await fs.readFile(filePath, "utf-8");
+
+      assert({
+        given: "aidd-riteway-ai SKILL.md content for step N > 1",
+        should: "instruct to supply previous step output as context",
+        actual:
+          content.includes("previous step") || content.includes("prior step"),
+        expected: true,
+      });
+    });
+
+    test("encodes e2e eval naming convention", async () => {
+      const filePath = path.join(__dirname, "./SKILL.md");
+      const content = await fs.readFile(filePath, "utf-8");
+
+      assert({
+        given: "aidd-riteway-ai SKILL.md content for e2e evals",
+        should: "specify the -e2e.test.sudo naming convention",
+        actual: content.includes("-e2e.test.sudo"),
+        expected: true,
+      });
+    });
+
+    test("encodes fixture file guidance", async () => {
+      const filePath = path.join(__dirname, "./SKILL.md");
+      const content = await fs.readFile(filePath, "utf-8");
+
+      assert({
+        given: "aidd-riteway-ai SKILL.md content about fixtures",
+        should: "instruct fixtures to be small with one clear bug or condition",
+        actual: content.includes("fixture") || content.includes("Fixture"),
+        expected: true,
+      });
+    });
+  });
+
+  describe("aidd-riteway-ai command", () => {
+    test("command file exists", async () => {
+      const filePath = path.join(
+        __dirname,
+        "../../commands/aidd-riteway-ai.md",
+      );
+      const exists = await fs.pathExists(filePath);
+
+      assert({
+        given: "aidd-riteway-ai.md command file",
+        should: "exist in ai/commands directory",
+        actual: exists,
+        expected: true,
+      });
+    });
+
+    test("command file references the skill", async () => {
+      const filePath = path.join(
+        __dirname,
+        "../../commands/aidd-riteway-ai.md",
+      );
+      const content = await fs.readFile(filePath, "utf-8");
+
+      assert({
+        given: "aidd-riteway-ai.md command content",
+        should: "load and execute aidd-riteway-ai SKILL.md",
+        actual:
content.includes("aidd-riteway-ai/SKILL.md"),
+        expected: true,
+      });
+    });
+
+    test("command respects aidd-please constraints", async () => {
+      const filePath = path.join(
+        __dirname,
+        "../../commands/aidd-riteway-ai.md",
+      );
+      const content = await fs.readFile(filePath, "utf-8");
+
+      assert({
+        given: "aidd-riteway-ai.md command content",
+        should: "reference /aidd-please constraints",
+        actual: content.includes("/aidd-please"),
+        expected: true,
+      });
+    });
+  });
+
+  describe("aidd-please integration", () => {
+    test("aidd-please Commands block lists /aidd-riteway-ai", async () => {
+      const filePath = path.join(__dirname, "../aidd-please/SKILL.md");
+      const content = await fs.readFile(filePath, "utf-8");
+
+      assert({
+        given: "aidd-please SKILL.md Commands block",
+        should: "list /aidd-riteway-ai for agent discovery",
+        actual: content.includes("/aidd-riteway-ai"),
+        expected: true,
+      });
+    });
+  });
+});
diff --git a/tasks/aidd-riteway-ai-skill-epic.md b/tasks/aidd-riteway-ai-skill-epic.md
new file mode 100644
index 00000000..2ef38a63
--- /dev/null
+++ b/tasks/aidd-riteway-ai-skill-epic.md
@@ -0,0 +1,46 @@
+# aidd-riteway-ai Skill Epic
+
+**Status**: 📋 PLANNED
+**Goal**: Create an `/aidd-riteway-ai` skill that teaches agents how to write correct `riteway ai` prompt evals for multi-step flows that involve tool calls.
+
+## Overview
+
+Without guidance, agents default to writing Vitest structural tests instead of `.sudo` prompt evals, collapse multi-step flows into a single overloaded `userPrompt`, pre-supply tool return values instead of testing that the agent makes the right calls, and assert implementation details rather than functional requirements. This skill codifies the lessons learned and references `/aidd-tdd` and `/aidd-requirements` for assertion style and requirement format.
+
+---
+
+## Create the aidd-riteway-ai skill
+
+Add `ai/skills/aidd-riteway-ai/SKILL.md` following the AgentSkills specification.
+
+**Requirements**:
+- Given the agent needs to discover the skill, its name and description should be in the frontmatter
+- Given the agent needs to discover what a skill does, the description should include a very brief description of functionality without delving into implementation details
+- Given the agent needs to discover when to use a skill, the description should include a very brief "Use when..." clause
+- Given the skill file, should include a role preamble and reference both `/aidd-tdd` and `/aidd-requirements` for assertion style and requirement format
+- Given a multi-step flow under test, should instruct the agent to write one `.sudo` eval file per step rather than combining all steps into one `userPrompt`
+- Given a unit eval for a step that involves tool calls (gh, GraphQL, API), should instruct the agent to inform the prompted agent that it is operating in a test environment and should use mock tools with stub return values instead of calling real APIs
+- Given a unit eval for step 1 of a tool-calling flow, should instruct the agent to assert that the correct tool calls are made — not pre-supply the answers those calls would return
+- Given a unit eval for step N > 1, should instruct the agent to supply the output of the previous step as context in the `userPrompt`
+- Given an e2e eval, should instruct the agent to use real tools and follow the `-e2e.test.sudo` naming convention, mirroring the project's existing unit/e2e split
+- Given fixture files needed by the eval, should be small files with one clear bug or condition per file
+- Given assertions, should derive them strictly from the functional requirements of the skill under test using `/aidd-requirements` format, and include only assertions that test distinct observable behaviors
+
+---
+
+## Add the aidd-riteway-ai command
+
+Add `ai/commands/aidd-riteway-ai.md` so the skill is invokable and discoverable.
+
+**Requirements**:
+- Given the command file, should load and execute `ai/skills/aidd-riteway-ai/SKILL.md`
+- Given the command file, should respect constraints from `/aidd-please`
+
+---
+
+## Update aidd-please discovery
+
+Add `/aidd-riteway-ai` to the Commands block in `ai/skills/aidd-please/SKILL.md`.
+
+**Requirements**:
+- Given the aidd-please Commands block, should list `/aidd-riteway-ai` so agents can discover it

From 22da19971243fd82771c7d4a12188c2dabc57bde Mon Sep 17 00:00:00 2001
From: Cursor Agent
Date: Fri, 10 Apr 2026 09:36:07 +0000
Subject: [PATCH 2/4] fix(aidd-riteway-ai): add required Process section per upskill review

---
 ai/commands/index.md               |  6 ++++++
 ai/skills/aidd-riteway-ai/SKILL.md | 12 ++++++++++++
 ai/skills/index.md                 |  1 +
 3 files changed, 19 insertions(+)

diff --git a/ai/commands/index.md b/ai/commands/index.md
index ee8ec24d..52461906 100644
--- a/ai/commands/index.md
+++ b/ai/commands/index.md
@@ -40,6 +40,12 @@ Review a PR, resolve addressed comments, and generate /aidd-fix delegation promp

 Write functional requirements for a user story. Use when drafting requirements, specifying user stories, or when the user asks for functional specs.

+### 🧪 /aidd-riteway-ai
+
+**File:** `aidd-riteway-ai.md`
+
+Write correct riteway ai prompt evals for multi-step tool-calling flows. Use when creating .sudo eval files or testing agent skills that use tools.
+
 ### 🧠 /aidd-rtc

 **File:** `aidd-rtc.md`
diff --git a/ai/skills/aidd-riteway-ai/SKILL.md b/ai/skills/aidd-riteway-ai/SKILL.md
index accd14a6..219faf7d 100644
--- a/ai/skills/aidd-riteway-ai/SKILL.md
+++ b/ai/skills/aidd-riteway-ai/SKILL.md
@@ -21,6 +21,18 @@ writing assertions inside `.sudo` eval files.

 ---

+## Process
+
+1. Read the skill under test and its functional requirements
+2. Identify the discrete steps in the skill's flow
+3. Create one `.sudo` eval file per step (Rule 1), placed in `ai-evals//`
+4.
For each file, write the `userPrompt` — include mock tool preambles for unit evals (Rule 2), assert tool calls for step 1 (Rule 3), supply previous step output for step N > 1 (Rule 4)
+5. Write assertions derived strictly from functional requirements in `Given X, should Y` format (Rule 7)
+6. Create small, single-condition fixture files as needed (Rule 6)
+7. Verify against the Eval Authoring Checklist below
+
+---
+
 ## Eval File Structure

 A `.sudo` eval file has three sections:
diff --git a/ai/skills/index.md b/ai/skills/index.md
index 4f4c61d4..9bbe3c8f 100644
--- a/ai/skills/index.md
+++ b/ai/skills/index.md
@@ -22,6 +22,7 @@
 - aidd-react - Enforces React component authoring best practices. Use when creating React components, binding components, presentations, useObservableValues, or when the user asks about React UI patterns, reactive binding, or action callbacks.
 - aidd-requirements - Write functional requirements for a user story. Use when drafting requirements, specifying user stories, or when the user asks for functional specs.
 - aidd-review - Conduct a thorough code review focusing on code quality, best practices, security, test coverage, and adherence to project standards and functional requirements. Use when reviewing code, pull requests, or completed epics.
+- aidd-riteway-ai - Teaches agents how to write correct riteway ai prompt evals (.sudo files) for multi-step flows that involve tool calls. Use when writing prompt evals, creating .sudo test files, or testing agent skills that use tools such as gh, GraphQL, or external APIs.
 - aidd-rtc - Reflective Thought Composition. Structured thinking pipeline for complex decisions, design evaluation, and deep analysis. Use when quality of reasoning matters more than speed of response.
 - aidd-service - Enforces asynchronous data service authoring best practices.
Use when creating front-end or back-end services, service interfaces, Observe patterns, AsyncDataService, or when the user asks about service layer, data flow, unidirectional UI, or action/observable design.
 - aidd-stack - Tech stack guidance for NextJS + React/Redux + Shadcn UI features. Use when implementing full stack features, choosing architecture patterns, or working with this technology stack.

From 95dec3ae3803cfbeade9a6008a1ee5809c09504d Mon Sep 17 00:00:00 2001
From: Cursor Agent
Date: Fri, 10 Apr 2026 22:04:12 +0000
Subject: [PATCH 3/4] fix(aidd-riteway-ai): correct broken /aidd-requirements references to /aidd-functional-requirements

- SKILL.md: fix 2 references to nonexistent /aidd-requirements
- SKILL.md: fix /aidd-pr example reference to generic form
- SKILL.md: standardize E2e -> E2E casing (3 places)
- riteway-ai.test.js: update test to validate correct skill name
- tasks/aidd-riteway-ai-skill-epic.md: fix 3 references to /aidd-requirements
---
 ai/skills/aidd-riteway-ai/SKILL.md           | 15 ++++++++-------
 ai/skills/aidd-riteway-ai/riteway-ai.test.js |  6 +++---
 tasks/aidd-riteway-ai-skill-epic.md          |  6 +++---
 3 files changed, 14 insertions(+), 13 deletions(-)

diff --git a/ai/skills/aidd-riteway-ai/SKILL.md b/ai/skills/aidd-riteway-ai/SKILL.md
index 219faf7d..051b42f5 100644
--- a/ai/skills/aidd-riteway-ai/SKILL.md
+++ b/ai/skills/aidd-riteway-ai/SKILL.md
@@ -16,8 +16,8 @@ for multi-step agent skills that involve tool calls.
 Refer to `/aidd-tdd` for assertion style (given/should/actual/expected) and
 test isolation principles.

-Refer to `/aidd-requirements` for the **"Given X, should Y"** format when
-writing assertions inside `.sudo` eval files.
+Refer to `/aidd-functional-requirements` for the **"Given X, should Y"** format
+when writing assertions inside `.sudo` eval files.

 ---

@@ -107,7 +107,7 @@ Correct pattern for step 1:
 ```
 userPrompt = """
 You have mock tools available. Use them instead of real API calls.
-Run step 1 of /aidd-pr: fetch the PR details and review threads.
+Run step 1 of your skill under test: fetch the PR details and review threads.
 """

 - Given mock gh tools, should call gh pr view to retrieve the PR branch name
@@ -152,7 +152,7 @@ Generate delegation prompts for the remaining issues.

 ---

-## Rule 5 — E2e evals: use real tools, follow -e2e.test.sudo naming
+## Rule 5 — E2E evals: use real tools, follow -e2e.test.sudo naming

 Given an e2e eval, use real tools (no mock preamble) and follow the
 `-e2e.test.sudo` naming convention to mirror the project's existing unit/e2e
@@ -162,7 +162,7 @@ split:
 ai-evals//step-1--e2e.test.sudo
 ```

-E2e evals run against live APIs. Only run them when the environment is
+E2E evals run against live APIs. Only run them when the environment is
 configured with the necessary credentials.

 ---
@@ -190,7 +190,8 @@ assertion conditions unambiguous.
 ## Rule 7 — Assertions: derived from functional requirements only

 Given assertions in a `.sudo` eval, derive them strictly from the functional
-requirements of the skill under test using the `/aidd-requirements` format:
+requirements of the skill under test using the `/aidd-functional-requirements`
+format:

 ```
 - Given , should
@@ -212,7 +213,7 @@ Before saving a `.sudo` eval file, verify:
 - [ ] Unit evals include mock tool preamble (Rule 2)
 - [ ] Step 1 asserts tool calls, not pre-supplied answers (Rule 3)
 - [ ] Step N > 1 includes previous step output as context (Rule 4)
-- [ ] E2e evals use `-e2e.test.sudo` suffix (Rule 5)
+- [ ] E2E evals use `-e2e.test.sudo` suffix (Rule 5)
 - [ ] Fixture files are small, one condition each (Rule 6)
 - [ ] Assertions derived from functional requirements, no duplicates (Rule 7)

diff --git a/ai/skills/aidd-riteway-ai/riteway-ai.test.js b/ai/skills/aidd-riteway-ai/riteway-ai.test.js
index 803673c1..b0e79f24 100644
--- a/ai/skills/aidd-riteway-ai/riteway-ai.test.js
+++ b/ai/skills/aidd-riteway-ai/riteway-ai.test.js
@@ -46,7 +46,7 @@
describe("aidd-riteway-ai", () => {
       });
     });

-    test("references /aidd-tdd and /aidd-requirements", async () => {
+    test("references /aidd-tdd and /aidd-functional-requirements", async () => {
       const filePath = path.join(__dirname, "./SKILL.md");
       const content = await fs.readFile(filePath, "utf-8");

@@ -59,8 +59,8 @@ describe("aidd-riteway-ai", () => {

       assert({
         given: "aidd-riteway-ai SKILL.md content",
-        should: "reference /aidd-requirements",
-        actual: content.includes("/aidd-requirements"),
+        should: "reference /aidd-functional-requirements",
+        actual: content.includes("/aidd-functional-requirements"),
         expected: true,
       });
     });
diff --git a/tasks/aidd-riteway-ai-skill-epic.md b/tasks/aidd-riteway-ai-skill-epic.md
index 2ef38a63..5b0282da 100644
--- a/tasks/aidd-riteway-ai-skill-epic.md
+++ b/tasks/aidd-riteway-ai-skill-epic.md
@@ -5,7 +5,7 @@

 ## Overview

-Without guidance, agents default to writing Vitest structural tests instead of `.sudo` prompt evals, collapse multi-step flows into a single overloaded `userPrompt`, pre-supply tool return values instead of testing that the agent makes the right calls, and assert implementation details rather than functional requirements. This skill codifies the lessons learned and references `/aidd-tdd` and `/aidd-requirements` for assertion style and requirement format.
+Without guidance, agents default to writing Vitest structural tests instead of `.sudo` prompt evals, collapse multi-step flows into a single overloaded `userPrompt`, pre-supply tool return values instead of testing that the agent makes the right calls, and assert implementation details rather than functional requirements. This skill codifies the lessons learned and references `/aidd-tdd` and `/aidd-functional-requirements` for assertion style and requirement format.
 ---

@@ -17,14 +17,14 @@ Add `ai/skills/aidd-riteway-ai/SKILL.md` following the AgentSkills specification
 - Given the agent needs to discover the skill, its name and description should be in the frontmatter
 - Given the agent needs to discover what a skill does, the description should include a very brief description of functionality without delving into implementation details
 - Given the agent needs to discover when to use a skill, the description should include a very brief "Use when..." clause
-- Given the skill file, should include a role preamble and reference both `/aidd-tdd` and `/aidd-requirements` for assertion style and requirement format
+- Given the skill file, should include a role preamble and reference both `/aidd-tdd` and `/aidd-functional-requirements` for assertion style and requirement format
 - Given a multi-step flow under test, should instruct the agent to write one `.sudo` eval file per step rather than combining all steps into one `userPrompt`
 - Given a unit eval for a step that involves tool calls (gh, GraphQL, API), should instruct the agent to inform the prompted agent that it is operating in a test environment and should use mock tools with stub return values instead of calling real APIs
 - Given a unit eval for step 1 of a tool-calling flow, should instruct the agent to assert that the correct tool calls are made — not pre-supply the answers those calls would return
 - Given a unit eval for step N > 1, should instruct the agent to supply the output of the previous step as context in the `userPrompt`
 - Given an e2e eval, should instruct the agent to use real tools and follow the `-e2e.test.sudo` naming convention, mirroring the project's existing unit/e2e split
 - Given fixture files needed by the eval, should be small files with one clear bug or condition per file
-- Given assertions, should derive them strictly from the functional requirements of the skill under test using `/aidd-requirements` format, and include only assertions that test
distinct observable behaviors
+- Given assertions, should derive them strictly from the functional requirements of the skill under test using `/aidd-functional-requirements` format, and include only assertions that test distinct observable behaviors

 ---

From 2065626ec26d5f19a0b2f7ba2e77c79875666dfa Mon Sep 17 00:00:00 2001
From: janhesters
Date: Wed, 15 Apr 2026 15:22:15 +0200
Subject: [PATCH 4/4] Update /aidd-functional-requirements references to /aidd-requirements

Align with the rename in #190. Updates SKILL.md, contract tests, and the epic file.
---
 ai/skills/aidd-riteway-ai/SKILL.md           | 4 ++--
 ai/skills/aidd-riteway-ai/riteway-ai.test.js | 6 +++---
 tasks/aidd-riteway-ai-skill-epic.md          | 6 +++---
 3 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/ai/skills/aidd-riteway-ai/SKILL.md b/ai/skills/aidd-riteway-ai/SKILL.md
index 051b42f5..5f72827c 100644
--- a/ai/skills/aidd-riteway-ai/SKILL.md
+++ b/ai/skills/aidd-riteway-ai/SKILL.md
@@ -16,7 +16,7 @@ for multi-step agent skills that involve tool calls.
 Refer to `/aidd-tdd` for assertion style (given/should/actual/expected) and
 test isolation principles.

-Refer to `/aidd-functional-requirements` for the **"Given X, should Y"** format
+Refer to `/aidd-requirements` for the **"Given X, should Y"** format
 when writing assertions inside `.sudo` eval files.

 ---
@@ -190,7 +190,7 @@ assertion conditions unambiguous.
 ## Rule 7 — Assertions: derived from functional requirements only

 Given assertions in a `.sudo` eval, derive them strictly from the functional
-requirements of the skill under test using the `/aidd-functional-requirements`
+requirements of the skill under test using the `/aidd-requirements`
 format:

 ```
diff --git a/ai/skills/aidd-riteway-ai/riteway-ai.test.js b/ai/skills/aidd-riteway-ai/riteway-ai.test.js
index b0e79f24..803673c1 100644
--- a/ai/skills/aidd-riteway-ai/riteway-ai.test.js
+++ b/ai/skills/aidd-riteway-ai/riteway-ai.test.js
@@ -46,7 +46,7 @@ describe("aidd-riteway-ai", () => {
       });
     });

-    test("references /aidd-tdd and /aidd-functional-requirements", async () => {
+    test("references /aidd-tdd and /aidd-requirements", async () => {
       const filePath = path.join(__dirname, "./SKILL.md");
       const content = await fs.readFile(filePath, "utf-8");

@@ -59,8 +59,8 @@ describe("aidd-riteway-ai", () => {

       assert({
         given: "aidd-riteway-ai SKILL.md content",
-        should: "reference /aidd-functional-requirements",
-        actual: content.includes("/aidd-functional-requirements"),
+        should: "reference /aidd-requirements",
+        actual: content.includes("/aidd-requirements"),
         expected: true,
       });
     });
diff --git a/tasks/aidd-riteway-ai-skill-epic.md b/tasks/aidd-riteway-ai-skill-epic.md
index 5b0282da..2ef38a63 100644
--- a/tasks/aidd-riteway-ai-skill-epic.md
+++ b/tasks/aidd-riteway-ai-skill-epic.md
@@ -5,7 +5,7 @@

 ## Overview

-Without guidance, agents default to writing Vitest structural tests instead of `.sudo` prompt evals, collapse multi-step flows into a single overloaded `userPrompt`, pre-supply tool return values instead of testing that the agent makes the right calls, and assert implementation details rather than functional requirements. This skill codifies the lessons learned and references `/aidd-tdd` and `/aidd-functional-requirements` for assertion style and requirement format.
+Without guidance, agents default to writing Vitest structural tests instead of `.sudo` prompt evals, collapse multi-step flows into a single overloaded `userPrompt`, pre-supply tool return values instead of testing that the agent makes the right calls, and assert implementation details rather than functional requirements. This skill codifies the lessons learned and references `/aidd-tdd` and `/aidd-requirements` for assertion style and requirement format.

 ---

@@ -17,14 +17,14 @@ Add `ai/skills/aidd-riteway-ai/SKILL.md` following the AgentSkills specification
 - Given the agent needs to discover the skill, its name and description should be in the frontmatter
 - Given the agent needs to discover what a skill does, the description should include a very brief description of functionality without delving into implementation details
 - Given the agent needs to discover when to use a skill, the description should include a very brief "Use when..." clause
-- Given the skill file, should include a role preamble and reference both `/aidd-tdd` and `/aidd-functional-requirements` for assertion style and requirement format
+- Given the skill file, should include a role preamble and reference both `/aidd-tdd` and `/aidd-requirements` for assertion style and requirement format
 - Given a multi-step flow under test, should instruct the agent to write one `.sudo` eval file per step rather than combining all steps into one `userPrompt`
 - Given a unit eval for a step that involves tool calls (gh, GraphQL, API), should instruct the agent to inform the prompted agent that it is operating in a test environment and should use mock tools with stub return values instead of calling real APIs
 - Given a unit eval for step 1 of a tool-calling flow, should instruct the agent to assert that the correct tool calls are made — not pre-supply the answers those calls would return
 - Given a unit eval for step N > 1, should instruct the agent to supply the output of the previous step as context in the
`userPrompt`
 - Given an e2e eval, should instruct the agent to use real tools and follow the `-e2e.test.sudo` naming convention, mirroring the project's existing unit/e2e split
 - Given fixture files needed by the eval, should be small files with one clear bug or condition per file
-- Given assertions, should derive them strictly from the functional requirements of the skill under test using `/aidd-functional-requirements` format, and include only assertions that test distinct observable behaviors
+- Given assertions, should derive them strictly from the functional requirements of the skill under test using `/aidd-requirements` format, and include only assertions that test distinct observable behaviors

 ---