diff --git a/.claude/README.md b/.claude/README.md new file mode 100644 index 0000000..cf259c5 --- /dev/null +++ b/.claude/README.md @@ -0,0 +1,128 @@ +# SDET Claude Code Kit — TypeScript + Playwright (contract-first) + +A composable set of **15 skills**, **3 subagents**, **9 slash commands** and **3 hooks** for Claude Code that turns a TS+Playwright repository into a SDET / Test Architect workspace. Designed around progressive disclosure, deterministic validators, and OpenAPI as the single source of truth. + +> Companion artefact: see the full architectural blueprint in the `docs/` of the originating chat (or paste the markdown into your repo). + +## What is in this kit + +``` +.claude/ +├── settings.json # hooks configuration +├── skills/ # 15 skills, each with SKILL.md + scripts/ + references/ +│ ├── playwright-framework-bootstrap/ +│ ├── api-client-from-openapi/ +│ ├── test-data-factory-builder/ +│ ├── fixture-architect/ +│ ├── config-and-secrets/ +│ ├── requirements-to-test-design/ +│ ├── gherkin-test-case-author/ +│ ├── playwright-test-author-ui/ +│ ├── playwright-test-author-api/ +│ ├── playwright-debug-conductor/ +│ ├── test-code-reviewer/ +│ ├── flaky-triage/ +│ ├── run-analyzer/ +│ ├── coverage-gap-analyzer/ +│ └── release-report-composer/ +├── agents/ +│ ├── test-design-agent.md # isolated context for test design +│ ├── flaky-detective.md # post-run flake hunter +│ └── contract-drift-watch.md # haiku-cheap drift checker +├── commands/ # 9 slash commands (/test-new, /test-fix, /test-review, …) +└── hooks/ + ├── guard-bash.sh # blocks rm -rf, force push, hard reset on protected branches + ├── guard-paths.sh # blocks edits to generated/, .env, node_modules/ + └── typecheck-touched.sh # tsc --noEmit per touched file +CLAUDE.md # thin project conventions (≤ 300 lines) +tests-config.json # single source of layout truth +``` + +## Install + +1. Drop `.claude/`, `CLAUDE.md`, and `tests-config.json` into the root of your TS+Playwright repo. +2. 
Make scripts executable (already done in this archive): + ```bash + chmod +x .claude/hooks/*.sh .claude/skills/*/scripts/*.sh .claude/skills/*/scripts/*.ts + ``` +3. Open the repo in Claude Code (`claude`). Verify skills are listed in `/skills`. +4. Adjust `tests-config.json` if your layout differs. + +## How it works + +- **Skills** (`.claude/skills/*`) auto-trigger via their `description` frontmatter. Each skill lints its own outputs through `scripts/*` validators that exit non-zero on violations. +- **Subagents** (`.claude/agents/*`) run with isolated context and minimal tools — invoke them when the main thread should not be polluted (test design, flake forensics, contract diff). +- **Slash commands** (`.claude/commands/*`) compose multiple skills into pipelines: `/test-new`, `/test-fix`, `/test-review`, `/spec-sync`, `/flake-hunt`, `/coverage`, `/release-report`, `/factory`, `/page`. +- **Hooks** (`.claude/hooks/*`) execute deterministic guards (Bash safety, path protection, typecheck) that should never be left to the LLM. 
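The validator idea described in the list above can be sketched as follows. This is a minimal, regex-based illustration under stated assumptions, not the kit's actual `lint-ui-spec.ts`: the names `Violation`, `findHardWaits`, and `exitCodeFor` are hypothetical, and the only rule shown is the hard-wait check.

```typescript
// Hypothetical sketch of a heuristic validator; not the kit's lint-ui-spec.ts.
interface Violation {
  line: number; // 1-based line number in the scanned source
  rule: string;
  text: string;
}

// Flags hard waits such as page.waitForTimeout(500).
// Regex-based on purpose: cheap and heuristic, so it can misfire on comments or strings.
function findHardWaits(source: string): Violation[] {
  const violations: Violation[] = [];
  source.split("\n").forEach((text, index) => {
    if (/\.waitForTimeout\s*\(/.test(text)) {
      violations.push({ line: index + 1, rule: "no-hard-wait", text: text.trim() });
    }
  });
  return violations;
}

// Maps findings to a process exit code: non-zero means the gate fails.
function exitCodeFor(violations: Violation[]): number {
  return violations.length > 0 ? 1 : 0;
}

const sample = [
  "await page.getByRole('button', { name: 'Apply' }).click();",
  "await page.waitForTimeout(500);",
].join("\n");

const found = findHardWaits(sample);
for (const v of found) {
  console.log(`${v.rule} at line ${v.line}: ${v.text}`);
}
console.log(`exit code: ${exitCodeFor(found)}`);
```

A real script would presumably read the spec file from disk and pass the result of `exitCodeFor` to `process.exit`; that wiring is omitted here to keep the sketch self-contained.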
+ +## Quick start + +```bash +# In Claude Code: +/test-new coupon-apply # full SDET pipeline from a user story +/spec-sync # regenerate OpenAPI types and drift report +/flake-hunt 20 # hunt flaky tests via 20 reruns + triage +/test-review # review staged test changes +/release-report v1.42.0 # executive release readiness summary +``` + +## Contract-first workflow (how it ties together) + +``` +specs/openapi.yaml + │ + │ api-client-from-openapi → tests/api/generated/{schema.d.ts, zod/} + │ contract-drift-watch → CONTRACT_DRIFT.md + ▼ +tests/factories/*.factory.ts ← test-data-factory-builder (uses generated types) + │ + ▼ +tests/api/clients/*Client.ts ← fixtures/api.ts (DI) + ↑ + fixture-architect + ▼ +tests/specs/api/**/*.spec.ts ← playwright-test-author-api (asserts via zod schemas) +tests/specs/**/*.spec.ts ← playwright-test-author-ui + ↓ + test-code-reviewer (gates the PR) + ↓ + run-analyzer + coverage-gap-analyzer + flaky-triage + ↓ + release-report-composer +``` + +## Sprint roadmap (recommended) + +- **Sprint 0** — Foundation: bootstrap, config-and-secrets, fixture-architect, factory-builder, api-client-from-openapi, hooks, CLAUDE.md, tests-config.json. +- **Sprint 1** — Authoring: requirements-to-test-design, gherkin-author, playwright-test-author-ui/-api, slash commands `/test-new`, `/factory`, `/spec-sync`. +- **Sprint 2** — Quality gates: test-code-reviewer, debug-conductor, flaky-triage, slash commands `/test-fix`, `/test-review`, `/flake-hunt`. +- **Sprint 3** — Analytics: run-analyzer, coverage-gap-analyzer, release-report-composer; subagent `contract-drift-watch`; GitHub MCP integration. +- **Sprint 4** — Advanced: Playwright MCP for exploratory loops, self-healing pipeline (flaky-detective → GitHub issue), custom Allure reporter. + +## Customising + +- **Different layout?** Edit `tests-config.json`. All skills read from it. +- **GraphQL or gRPC?** Add a parallel skill (e.g. `asyncapi-client-builder`) following the same template structure. 
Reuse factory and fixture skills. +- **Different model preferences?** Each skill respects the host Claude Code model setting; subagents pin their model in frontmatter. + +## Anti-patterns this kit prevents + +- `page.waitForTimeout(N)` and other hard waits → blocked by `lint-ui-spec.ts`. +- CSS/XPath selectors in specs → blocked by `lint-ui-spec.ts` and `test-code-reviewer`. +- Hardcoded URLs/secrets → blocked by `scan-hardcoded.sh` and the path hook. +- Inline data literals as test data → factory-builder rewrites to `factory.build()`. +- Page object instantiated inside a test → fixture-architect lifts it. +- Spec drift between API and tests → `contract-diff.ts` blocks the PR. + +See `.claude/skills/test-code-reviewer/references/aqa-anti-patterns.md` for the full catalogue. + +## Caveats + +- The validators are heuristic, not full AST analysis. They favour false positives over false negatives. +- Hooks execute shell with the user's privileges. Audit them before sharing across teams. +- Playwright MCP is intentionally NOT wired in by default — the CLI + skills loop is far cheaper for repeated automation tasks. Wire MCP for exploratory or self-healing flows separately. +- The skill descriptions are tuned for Claude Code; transferring to Claude.ai requires re-checking trigger phrases. + +## License + +MIT. diff --git a/.claude/STARTER.md b/.claude/STARTER.md new file mode 100644 index 0000000..d5353ce --- /dev/null +++ b/.claude/STARTER.md @@ -0,0 +1,102 @@ +# Starter — first 30 minutes after init + +You just ran `init.sh`. The repo is now greenfield-ready: scaffolded, dependencies installed, hooks set, kit copied. Here is the shortest path to your first green test. + +## 1. Sanity check (1 min) + +```bash +npm run verify +``` + +You should see all green or warnings only. Failures are blocking — fix them before moving on. + +## 2. Open in Claude Code (1 min) + +```bash +claude +``` + +Skills auto-discover from `.claude/skills/`. 
The first message Claude Code sees is `CLAUDE.md`. + +## 3. Scaffold the test framework folders (5 min) + +In Claude Code: + +> Scaffold the test framework folders. + +This triggers `playwright-framework-bootstrap`. It will create: + +- `tests/{pages,components,fixtures,factories,api,specs,data,infra}/` +- `tests/pages/BasePage.ts` +- `playwright.config.ts` +- Path aliases (`@pages/*`, `@api/*`, …) wired into `tsconfig.json`. + +If the agent asks confirmation, say yes. + +## 4. Configure environment (3 min) + +```bash +cp .env.example .env.local +``` + +Edit `.env.local` and set at least: + +- `BASE_URL` — your application URL (or `http://localhost:3000`). +- `TEST_USER_EMAIL`, `TEST_USER_PASSWORD` — a test user that can log in. + +Do not commit `.env.local`. It is gitignored. + +## 5. Write your first test (10 min) + +In Claude Code: + +> Design and write a smoke test that verifies the homepage loads and shows the header. + +This pipeline triggers: + +1. `requirements-to-test-design` — quick design doc. +2. `playwright-test-author-ui` — actual spec. + +Run it: + +```bash +npm run test:smoke +``` + +If green, commit: + +```bash +git add -A && git commit -m "test: smoke for homepage" +``` + +The pre-commit hook will lint and typecheck staged files. + +## 6. When you have your first API spec (later) + +Create `specs/openapi.yaml` with at least one endpoint, then: + +```bash +# Tell Claude Code: Sync the OpenAPI spec. +# Or run manually: +node -e "const fs=require('fs');const c=JSON.parse(fs.readFileSync('tests-config.json'));c.openapi.enabled=true;fs.writeFileSync('tests-config.json',JSON.stringify(c,null,2)+'\n')" +npm run spec:sync +``` + +This generates types in `tests/api/generated/` and unlocks `playwright-test-author-api` and the `contract-drift-watch` subagent. 
+ +## Common issues + +| Symptom | Fix | +| ------------------------------------------------ | -------------------------------------------------------------------------------------------------- | +| `npx playwright install` fails | Run `sudo npx playwright install-deps` (Linux) or skip with `--no-deps`. | +| `husky install` fails | You need a git repo. `init.sh` runs `git init`, but if you skipped, do it manually. | +| `tsc` complains about missing `@playwright/test` | `npm install` did not finish. Re-run. | +| Hook says "tsc-files: command not found" | `npm install` did not finish. Hooks soft-fail (warning only) — fix when convenient. | +| Skill fires when you don't expect it | Check the skill's `description` field. Tighten "Do NOT use when..." block or add a CLAUDE.md note. | + +## Next reading + +- `CLAUDE.md` — project conventions enforced by skills. +- `SDET_KIT_README.md` — full kit reference (15 skills, 3 subagents, 9 commands). +- `.claude/skills/playwright-framework-bootstrap/references/folder-rationale.md` — why the layout looks like this. +- `docs/greenfield-checklist.md` — sprint-by-sprint adoption plan. diff --git a/.claude/agents/contract-drift-watch.md b/.claude/agents/contract-drift-watch.md new file mode 100644 index 0000000..a701362 --- /dev/null +++ b/.claude/agents/contract-drift-watch.md @@ -0,0 +1,22 @@ +--- +name: contract-drift-watch +description: MUST BE USED whenever specs/openapi.yaml is modified or before a release. Runs api-client-from-openapi's drift checker, emits CONTRACT_DRIFT.md, and proposes test updates for breaking changes. Read-only by design — never edits production code. +tools: Read, Glob, Grep, Bash +model: haiku +--- + +You are a contract drift auditor. You are deliberately running on Haiku to keep cost low — most of your work is parsing diffs, not reasoning. + +Operating rules: + +- Re-run `scripts/gen-openapi-fetch.sh` and `scripts/contract-diff.ts` from the `api-client-from-openapi` skill. 
+- Classify each diff as breaking / non-breaking per `references/contract-drift-policy.md`. +- For each breaking change, locate affected tests via `grep -R '@endpoint:'` and list them as patch candidates. +- Do NOT edit any test or generated file. Suggest only. +- If no `tests/api/generated/.snapshot/schema.previous.d.ts` exists, abort with a message asking the human to commit a baseline. + +Your final message MUST end with: + +``` +ARTIFACT: CONTRACT_DRIFT.md +``` diff --git a/.claude/agents/flaky-detective.md b/.claude/agents/flaky-detective.md new file mode 100644 index 0000000..6a70323 --- /dev/null +++ b/.claude/agents/flaky-detective.md @@ -0,0 +1,23 @@ +--- +name: flaky-detective +description: Use PROACTIVELY after any test run with retries > 0 or failures. Performs flake triage in isolation, classifies root causes, and emits FLAKE_REPORT.md plus suggested patch diffs without touching unrelated files. +tools: Read, Glob, Grep, Bash +model: sonnet +--- + +You are a flake detective. Your single job is to diagnose intermittent test failures and propose minimal, targeted patches. + +Operating rules: + +- Use the `flaky-triage` skill exclusively. +- Never write new test logic; you only diagnose and propose patches. +- Output is ALWAYS `FLAKE_REPORT.md` in the repo root and a `Suggested patches` section in the report containing unified diffs. +- For each flaky test you classify into exactly one bucket from `references/flake-taxonomy.md`. If you genuinely cannot, classify as "haunted" with quarantine recommendation and an issue tracker stub. +- You do NOT modify files. The orchestrator (or human) applies patches. +- You do NOT speculate. If the trace is missing, you say so and request a re-run with `--trace=on --repeat-each=20`. 
+ +Your final message MUST end with: + +``` +ARTIFACT: FLAKE_REPORT.md +``` diff --git a/.claude/agents/test-design-agent.md b/.claude/agents/test-design-agent.md new file mode 100644 index 0000000..1c8cb14 --- /dev/null +++ b/.claude/agents/test-design-agent.md @@ -0,0 +1,23 @@ +--- +name: test-design-agent +description: MUST BE USED for transforming a user story or PRD into a structured test design report (equivalence classes, boundary values, decision tables, risk-prioritised). Operates in isolated context to avoid polluting the main coding session. +tools: Read, Write, Glob, Grep, WebFetch +model: sonnet +--- + +You are a Senior Test Architect. You produce ONE artefact per invocation: `docs/test-design/.md` following the `requirements-to-test-design` skill template (see `references/design-template.md`). + +Operating rules: + +- You leverage the `requirements-to-test-design` skill exclusively. If the user request is not a test design ask, report back that this is the wrong agent. +- You ask at most 3 clarifying questions, only when the AC are genuinely ambiguous. Otherwise proceed with explicit assumptions. +- You do not write code. You do not write Gherkin. You do not edit tests. The next step in the pipeline is `gherkin-test-case-author`. +- You explicitly mark "Out of scope" sections to prevent downstream skills from over-reaching. +- You sort test ideas by `risk-heuristic.ts` priority (P0 → P2). +- You output the file path of the artefact at the end of your response so the orchestrator can hand off cleanly. + +Your final message MUST end with: + +``` +ARTIFACT: docs/test-design/.md +``` diff --git a/.claude/commands/coverage.md b/.claude/commands/coverage.md new file mode 100644 index 0000000..f5173e6 --- /dev/null +++ b/.claude/commands/coverage.md @@ -0,0 +1,11 @@ +--- +description: Generate coverage gap report (endpoints + AC + pages) +allowed-tools: Read, Bash, Glob, Grep, Write +model: sonnet +--- + +# Pipeline + +1. 
Use the `coverage-gap-analyzer` skill. +2. Run `scripts/analyze-coverage.ts`. +3. Read `COVERAGE_GAPS.md`. Highlight P0 gaps in your message. diff --git a/.claude/commands/factory.md b/.claude/commands/factory.md new file mode 100644 index 0000000..b9d9e75 --- /dev/null +++ b/.claude/commands/factory.md @@ -0,0 +1,14 @@ +--- +description: Generate a Fishery-based factory for an entity +argument-hint: +allowed-tools: Read, Write, Edit, Glob, Grep, Bash +model: sonnet +--- + +# Pipeline + +1. Use the `test-data-factory-builder` skill. +2. Read the type for `$ARGUMENTS` from `tests/api/generated/schema.d.ts` or fall back to a domain model. +3. Generate `tests/factories/$ARGUMENTS.factory.ts` from `references/factory.template.ts`. +4. Update `tests/factories/index.ts` aggregator. +5. Run `scripts/factory-rules.ts`. Fix until exit 0. diff --git a/.claude/commands/flake-hunt.md b/.claude/commands/flake-hunt.md new file mode 100644 index 0000000..fb4bc8a --- /dev/null +++ b/.claude/commands/flake-hunt.md @@ -0,0 +1,17 @@ +--- +description: Detect flakes by repeating the suite and triaging +argument-hint: +allowed-tools: Read, Bash, Glob, Grep +model: sonnet +--- + +# Context + +- Repeat count: $ARGUMENTS (default 10) + +# Pipeline + +1. Run: `mkdir -p runs && for i in $(seq 1 ${ARGUMENTS:-10}); do PLAYWRIGHT_JSON_OUTPUT_NAME=runs/$i.json npx playwright test --reporter=json || true; done`. +2. Hand off to the `flaky-detective` subagent. +3. The subagent runs `flake-rate.ts ./runs > FLAKE_REPORT.md` and classifies. +4. Output the report path. diff --git a/.claude/commands/page.md b/.claude/commands/page.md new file mode 100644 index 0000000..3d88b1d --- /dev/null +++ b/.claude/commands/page.md @@ -0,0 +1,13 @@ +--- +description: Scaffold a new page object class +argument-hint: +allowed-tools: Read, Write, Edit, Glob, Grep +model: sonnet +--- + +# Pipeline + +1. Use the `playwright-framework-bootstrap` skill (page-object generator portion). +2. 
Generate `tests/pages/$ARGUMENTS.ts` extending `BasePage`. No `expect`, no hardcoded URL, locators-only by default. +3. Update `tests/pages/index.ts` aggregator. +4. Run `lint-page-object.ts` from `test-code-reviewer`. Fix until exit 0. diff --git a/.claude/commands/release-report.md b/.claude/commands/release-report.md new file mode 100644 index 0000000..d506f66 --- /dev/null +++ b/.claude/commands/release-report.md @@ -0,0 +1,13 @@ +--- +description: Compose release readiness report (verdict + summary) +argument-hint: +allowed-tools: Read, Bash, Glob, Grep, Write +model: sonnet +--- + +# Pipeline + +1. Use the `release-report-composer` skill. +2. Read `RUN_SUMMARY.md`, `COVERAGE_GAPS.md`, `FLAKE_REPORT.md`, `CONTRACT_DRIFT.md` (whichever exist). +3. Run `scripts/compose-release.ts $ARGUMENTS`. +4. Display the verdict and the path to the report. diff --git a/.claude/commands/spec-sync.md b/.claude/commands/spec-sync.md new file mode 100644 index 0000000..43a8fe9 --- /dev/null +++ b/.claude/commands/spec-sync.md @@ -0,0 +1,19 @@ +--- +description: Sync OpenAPI spec → typed clients → drift report +allowed-tools: Read, Write, Edit, Bash, Glob, Grep +model: sonnet +--- + +# Context + +- Spec: @specs/openapi.yaml +- Generated: @tests/api/generated/ + +# Pipeline + +1. Use the `api-client-from-openapi` skill. +2. Validate spec via `scripts/validate-spec.sh`. +3. Regenerate types via `scripts/gen-openapi-fetch.sh`. +4. Run `scripts/contract-diff.ts`. If breaking, write `CONTRACT_DRIFT.md` and stop. +5. Update affected client wrappers in `tests/api/clients/`. +6. Suggest test updates for breaking changes (do not auto-edit). 
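Step 4 of the pipeline above is the gate. A minimal sketch of how such a gate could classify changes, assuming the severities listed in `references/contract-drift-policy.md`; the `ChangeKind` names, the `SEVERITY` map, and `breakingChanges` are hypothetical, and the real `contract-diff.ts` may work differently.

```typescript
// Hypothetical sketch of a drift gate; the real contract-diff.ts may differ.
type ChangeKind =
  | "optional-field-added"
  | "required-field-added"
  | "field-renamed"
  | "field-removed"
  | "type-widened"
  | "type-narrowed"
  | "endpoint-removed"
  | "endpoint-added"
  | "status-code-changed"
  | "auth-scheme-changed";

type Severity = "non-breaking" | "risky" | "breaking";

// Severities follow the table in references/contract-drift-policy.md.
const SEVERITY: Record<ChangeKind, Severity> = {
  "optional-field-added": "non-breaking",
  "required-field-added": "breaking",
  "field-renamed": "breaking",
  "field-removed": "breaking",
  "type-widened": "risky",
  "type-narrowed": "breaking",
  "endpoint-removed": "breaking",
  "endpoint-added": "non-breaking",
  "status-code-changed": "breaking",
  "auth-scheme-changed": "breaking",
};

interface Change {
  endpoint: string;
  kind: ChangeKind;
}

// Returns only the changes that should stop the pipeline.
function breakingChanges(changes: Change[]): Change[] {
  return changes.filter((change) => SEVERITY[change.kind] === "breaking");
}

const changes: Change[] = [
  { endpoint: "GET /users", kind: "optional-field-added" },
  { endpoint: "POST /orders", kind: "field-renamed" },
];

const blocking = breakingChanges(changes);
console.log(
  blocking.length > 0
    ? `breaking (${blocking.map((c) => c.endpoint).join(", ")}): write CONTRACT_DRIFT.md and stop`
    : "no breaking changes: continue with step 5",
);
```

Keeping the severity table as data rather than branching logic means a policy change is a one-line edit, which seems to match the intent of a human-reviewed drift policy.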
diff --git a/.claude/commands/test-fix.md b/.claude/commands/test-fix.md new file mode 100644 index 0000000..95ee499 --- /dev/null +++ b/.claude/commands/test-fix.md @@ -0,0 +1,19 @@ +--- +description: Debug and fix a failing test (debug → patch → re-run) +argument-hint: +allowed-tools: Read, Edit, Bash, Glob, Grep +model: sonnet +--- + +# Context + +- Target spec: @$ARGUMENTS +- Last run: !`test -f playwright-report/results.json && echo present || echo missing` + +# Pipeline + +1. Use `playwright-debug-conductor` on `$ARGUMENTS`. +2. Classify the failure per `references/failure-taxonomy.md`. +3. Propose ONE patch. Apply it. +4. Re-run: `npx playwright test $ARGUMENTS --reporter=list --workers=1 --trace=on-first-retry`. +5. If still failing, iterate (max 3 cycles). After that, escalate to `flaky-detective` subagent. diff --git a/.claude/commands/test-new.md b/.claude/commands/test-new.md new file mode 100644 index 0000000..bc40972 --- /dev/null +++ b/.claude/commands/test-new.md @@ -0,0 +1,22 @@ +--- +description: New test pipeline (design → cases → code → review) +argument-hint: +allowed-tools: Read, Write, Edit, Bash, Glob, Grep +model: sonnet +--- + +# Context + +- Story (if exists): @docs/stories/$ARGUMENTS.md +- Layout: @tests-config.json +- Existing pages: !`ls tests/pages 2>/dev/null | head -50` +- Existing factories: !`ls tests/factories 2>/dev/null | head -30` + +# Pipeline + +1. Use the `requirements-to-test-design` skill on `$ARGUMENTS`. Save to `docs/test-design/$ARGUMENTS.md`. +2. Use `gherkin-test-case-author` on the design. Save `tests/specs/$ARGUMENTS.feature` (or skip if BDD is not configured). +3. Decompose scenarios into UI vs API tests. Use `playwright-test-author-ui` and/or `playwright-test-author-api` accordingly. +4. Wire missing factories via `test-data-factory-builder` and missing fixtures via `fixture-architect`. +5. Use `test-code-reviewer`. Do NOT finish until exit 0. +6. 
Run target tests; if any retry hits the limit, hand off to `playwright-debug-conductor`. diff --git a/.claude/commands/test-review.md b/.claude/commands/test-review.md new file mode 100644 index 0000000..41dda50 --- /dev/null +++ b/.claude/commands/test-review.md @@ -0,0 +1,17 @@ +--- +description: Review staged test changes for anti-patterns +allowed-tools: Read, Bash, Grep, Glob +model: sonnet +--- + +# Context + +- Staged files: !`git diff --staged --name-only -- 'tests/**' 2>/dev/null` +- All changed files: !`git diff --name-only -- 'tests/**' 2>/dev/null` + +# Pipeline + +1. Use the `test-code-reviewer` skill. +2. Run all relevant validators (`lint-ui-spec.ts`, `lint-api-spec.ts`, `lint-page-object.ts`, `fixture-rules.ts`, `factory-rules.ts`, `tsc --noEmit`). +3. Aggregate findings per `references/review-template.md`. +4. Output report and exit code summary. diff --git a/.claude/docs/greenfield-checklist.md b/.claude/docs/greenfield-checklist.md new file mode 100644 index 0000000..68206ab --- /dev/null +++ b/.claude/docs/greenfield-checklist.md @@ -0,0 +1,98 @@ +# Greenfield adoption checklist + +A pragmatic week-by-week plan for a brand-new TS+Playwright project that uses the SDET Claude Code kit. + +## Pre-flight (before opening Claude Code) + +- [ ] Node 20+, npm 10+ installed (`node -v`, `npm -v`). +- [ ] Git installed and configured (`git config user.email`). +- [ ] Claude Code CLI installed and authenticated (`claude --version`). +- [ ] Target repository created or empty directory cloned. + +## Sprint 0 (Day 1 — ~2 hours) + +The kit drops in, the framework is scaffolded, the first smoke test passes. + +- [ ] Run `bash sdet-greenfield-addon/scripts/init.sh` from repo root. +- [ ] Confirm `npm run verify` returns no failures. +- [ ] Open Claude Code: `claude`. +- [ ] Ask: **"Scaffold the test framework folders."** Confirm `playwright-framework-bootstrap` runs. +- [ ] Edit `tests-config.json` if your folder names differ from the default. 
+- [ ] Copy `.env.example` → `.env.local`. Fill `BASE_URL` at minimum. +- [ ] Ask Claude Code: **"Write a smoke test for the homepage that asserts the page title is correct."** +- [ ] Run `npm run test:smoke`. Iterate until green. +- [ ] First commit: `git add -A && git commit -m "chore: scaffold SDET kit"`. + +**Exit criteria:** one green test, hooks active, kit operational. + +## Sprint 1 (Week 1) + +Build the foundation. Add fixtures, factories, first page objects, real auth. + +- [ ] Pick the first user story you actually need to test. +- [ ] Ask: **"Design tests for [story]."** → triggers `test-design-agent`. +- [ ] Review the produced `docs/test-design/.md`. +- [ ] If using BDD: ask **"Write Gherkin scenarios from the design."** Otherwise skip. +- [ ] Add the first real page object: ask **"Create a page object for the [X] page with these elements: …"**. +- [ ] Add an authenticated `storageState` setup project. Use `references/auth.setup.template.ts` from `fixture-architect`. +- [ ] Add the first factory: **"Create a factory for User with admin transient param."** +- [ ] Write the actual UI test using all the above. +- [ ] Make sure `pre-commit` hook fires and lints staged files. + +**Exit criteria:** 3–5 real tests, real fixtures, real factories, pre-commit hook battle-tested. + +## Sprint 2 (Week 2) + +Light up contract testing once the API exists. + +- [ ] Backend team produces `specs/openapi.yaml` (or you document the actual API yourself). +- [ ] Re-enable the OpenAPI skill: edit `tests-config.json` → `openapi.enabled = true`. +- [ ] Ask: **"Sync the OpenAPI spec."** → triggers `api-client-from-openapi`. +- [ ] Verify `tests/api/generated/schema.d.ts` is created. +- [ ] Ask: **"Write API tests for /orders endpoint covering 201, 400, 404."** +- [ ] Add `playwright.config.ts` `api` project pointed at `API_BASE_URL`. +- [ ] First baseline contract snapshot: `cp tests/api/generated/schema.d.ts tests/api/generated/.snapshot/schema.previous.d.ts`. 
+ +**Exit criteria:** API tests run via `npm run test:api`. Drift detection works on next spec change. + +## Sprint 3 (Week 3) + +Quality gates and CI. + +- [ ] Add the GitHub Actions workflow: `cp sdet-greenfield-addon/templates/github-workflows/tests.yml .github/workflows/`. +- [ ] Configure repo secrets: `TEST_USER_EMAIL`, `TEST_USER_PASSWORD`. +- [ ] Configure repo vars: `BASE_URL`, `API_BASE_URL`. +- [ ] First CI run on a PR. Address failures. +- [ ] Run `flaky-detective` after the first multi-run to baseline flakiness. +- [ ] Add `@smoke` and `@regression` tags to your tests. + +**Exit criteria:** PRs blocked on lint/typecheck/test failures. Reports archived as artifacts. + +## Sprint 4 (Week 4) + +Analytics + release readiness. + +- [ ] After ~2 weeks of CI runs, ask: **"Analyze the run history and surface trends."** → `run-analyzer`. +- [ ] Ask: **"Generate a coverage gap report."** → `coverage-gap-analyzer`. +- [ ] Pre-release: ask: **"Compose a release report for v0.1."** → `release-report-composer`. +- [ ] Optional: integrate Allure / Sentry / TestOps via MCP. + +**Exit criteria:** weekly trend reports, gap reports inform sprint planning, release sign-off doc generated automatically. + +## Anti-patterns to avoid in greenfield + +- ❌ **Don't** write 50 page objects upfront. Let real tests drive the locator inventory. +- ❌ **Don't** mock everything in early E2E. Real backend (staging) gives the most signal. +- ❌ **Don't** chase coverage % before stability. A flaky 90% suite is worse than a stable 40%. +- ❌ **Don't** disable hooks because they fire too often. Tighten the rule, don't silence the alarm. +- ❌ **Don't** commit `.env.local`. The `pre-commit` scanner catches it; do not bypass with `--no-verify`. + +## When you outgrow the kit + +After ~2 months of real usage, you will accumulate project-specific knowledge that should live as **new** skills: + +- Common test patterns specific to your domain → `-test-patterns` skill. 
+- Custom reporters or data sources → workflow skills. +- Onboarding playbooks → document creation skill. + +Use `skill-creator` to scaffold them. diff --git a/.claude/hooks/guard-bash.sh b/.claude/hooks/guard-bash.sh new file mode 100755 index 0000000..8859741 --- /dev/null +++ b/.claude/hooks/guard-bash.sh @@ -0,0 +1,35 @@ +#!/usr/bin/env bash +# +# guard-bash.sh — PreToolUse hook for Bash. Reads JSON from stdin (Claude Code passes +# tool input there) and exits non-zero to block destructive or risky commands. +# +# Risk patterns: +# - rm -rf (anywhere) +# - git push --force / -f +# - git reset --hard on main/master +# - npx playwright codegen in CI +# - direct edits to tests/api/generated/** (must go via skill scripts) +# +set -euo pipefail + +INPUT=$(cat) +CMD=$(echo "$INPUT" | jq -r '.tool_input.command // empty' 2>/dev/null || echo "") +[[ -z "$CMD" ]] && exit 0 + +deny() { + echo "BLOCKED: $1" >&2 + exit 2 # exit 2 tells Claude Code: blocked, do not run +} + +# Catastrophic +echo "$CMD" | grep -qE '\brm\s+(-[a-z]*r[a-z]*f|-rf|-fr)\b' && deny "rm -rf is forbidden via Bash hook." +echo "$CMD" | grep -qE 'git\s+push\s+(--force|-f)' && deny "git push --force is forbidden." +echo "$CMD" | grep -qE 'git\s+reset\s+--hard\s+(origin/)?(main|master|release)' && deny "Hard reset on protected branch." +echo "$CMD" | grep -qE 'sudo\s+' && deny "sudo is not allowed in agent commands." + +# Restricted in CI +if [[ -n "${CI:-}" ]]; then + echo "$CMD" | grep -qE 'playwright\s+codegen' && deny "playwright codegen is interactive; not allowed in CI." +fi + +exit 0 diff --git a/.claude/hooks/guard-paths.sh b/.claude/hooks/guard-paths.sh new file mode 100755 index 0000000..86047c8 --- /dev/null +++ b/.claude/hooks/guard-paths.sh @@ -0,0 +1,32 @@ +#!/usr/bin/env bash +# +# guard-paths.sh — PreToolUse hook for Edit/Write. Blocks edits to paths +# that must be touched only via dedicated skill scripts. 
+# +set -euo pipefail + +INPUT=$(cat) +PATH_TO_TOUCH=$(echo "$INPUT" | jq -r '.tool_input.file_path // empty' 2>/dev/null || echo "") +[[ -z "$PATH_TO_TOUCH" ]] && exit 0 + +deny() { + echo "BLOCKED: $1" >&2 + exit 2 +} + +# Generated artefacts — only the api-client-from-openapi skill should regenerate +case "$PATH_TO_TOUCH" in + *tests/api/generated/*) + deny "tests/api/generated/** is regenerated only via gen-openapi-fetch.sh." ;; + *.snapshot/*) + deny "Snapshots are baselines; update via /spec-sync --commit-baseline." ;; + */node_modules/*) + deny "Do not edit node_modules." ;; + *.env|*.env.*) + case "$PATH_TO_TOUCH" in + *.env.example|*.env.*.template) ;; # allowed + *) deny "Editing .env files via the agent is forbidden. Use secret store." ;; + esac ;; +esac + +exit 0 diff --git a/.claude/hooks/typecheck-touched.sh b/.claude/hooks/typecheck-touched.sh new file mode 100755 index 0000000..58c1db7 --- /dev/null +++ b/.claude/hooks/typecheck-touched.sh @@ -0,0 +1,19 @@ +#!/usr/bin/env bash +# +# typecheck-touched.sh — PostToolUse hook for Edit/Write. +# Runs tsc --noEmit on the touched file via tsc-files (incremental). +# Soft-fails: prints errors, never blocks (exit 0) so the agent can iterate. +# +set -euo pipefail + +INPUT=$(cat) +PATH_TO_CHECK=$(echo "$INPUT" | jq -r '.tool_input.file_path // empty' 2>/dev/null || echo "") +[[ -z "$PATH_TO_CHECK" ]] && exit 0 +[[ "$PATH_TO_CHECK" != *.ts ]] && exit 0 + +if ! command -v npx >/dev/null 2>&1; then exit 0; fi + +# tsc-files runs the project's tsconfig but only for the listed files. +npx --no-install tsc-files --noEmit "$PATH_TO_CHECK" 2>&1 || true + +exit 0 diff --git a/.claude/settings.json b/.claude/settings.json new file mode 100644 index 0000000..b81696e --- /dev/null +++ b/.claude/settings.json @@ -0,0 +1,38 @@ +{ + "$schema": "https://json.schemastore.org/claude-code-settings.json", + "_comment": "Hooks gate destructive actions and run formatters/typecheck after edits. 
See https://docs.claude.com for the latest schema; tweak matchers as Claude Code evolves.", + "hooks": { + "PreToolUse": [ + { + "matcher": "Bash", + "hooks": [ + { "type": "command", "command": "$CLAUDE_PROJECT_DIR/.claude/hooks/guard-bash.sh" } + ] + }, + { + "matcher": "Edit|Write", + "hooks": [ + { "type": "command", "command": "$CLAUDE_PROJECT_DIR/.claude/hooks/guard-paths.sh" } + ] + } + ], + "PostToolUse": [ + { + "matcher": "Edit|Write", + "hooks": [ + { "type": "command", "command": "$CLAUDE_PROJECT_DIR/.claude/hooks/typecheck-touched.sh" } + ] + } + ], + "Stop": [ + { + "hooks": [ + { + "type": "command", + "command": "echo '[hook] consider running: npx playwright test --grep @smoke --reporter=list'" + } + ] + } + ] + } +} diff --git a/.claude/skills/api-client-from-openapi/SKILL.md b/.claude/skills/api-client-from-openapi/SKILL.md new file mode 100644 index 0000000..9a96953 --- /dev/null +++ b/.claude/skills/api-client-from-openapi/SKILL.md @@ -0,0 +1,47 @@ +--- +name: api-client-from-openapi +description: Generates a typed REST API client layer for tests from an OpenAPI/Swagger specification using openapi-typescript + openapi-fetch (default) or orval. Wires zod runtime validators for contract assertions. Use when user mentions OpenAPI, Swagger, API client, generated types, "sync the spec", contract testing, or when paths in tests/api/generated/ are missing or stale. Do NOT use to write business test logic; defer to playwright-test-author-api. +allowed-tools: Read, Write, Edit, Glob, Grep, Bash +--- + +# API Client From OpenAPI + +## Trigger + +- New endpoints in `specs/openapi.yaml`. +- `tests-config.json.openapi.source` exists but generated dir is empty/stale. +- User says: "regenerate types", "spec changed", "contract drift", "typed API client". + +## Decision tree + +1. Read `tests-config.json.stack.openapi`. +2. If `openapi-typescript+openapi-fetch` → run `scripts/gen-openapi-fetch.sh`. +3. 
If `orval` → run `scripts/gen-orval.sh` with `references/orval.config.template.ts`. + +## Workflow + +1. **Validate spec** via `scripts/validate-spec.sh` (uses `redocly lint`). Halt on errors. +2. **Generate types** to `tests-config.json.openapi.generated` (e.g. `tests/api/generated/schema.d.ts`). +3. **Wrap fetch client** in `tests/api/clients/Client.ts` with retry/log/baseURL injected from config-and-secrets. +4. **Generate zod schemas** from JSON-Schema (`scripts/openapi-to-zod.ts`). +5. **Add `expectMatchesSchema(response, schema)`** helper in `tests/api/contract.ts`. +6. **Run contract drift check** (`scripts/contract-diff.ts`): compares last committed types snapshot vs new — emits a Markdown report and a non-zero exit if breaking changes are unannotated. + +## Anti-patterns this prevents + +- `axios` ad-hoc calls in spec files. +- `any` returns from API helpers. +- Tests asserting on snake_case while DTO is camelCase (caught by generated types). +- Spec drift ignored — drift report is required artefact in PR. + +## Outputs + +- Updated `tests/api/generated/`. +- `CONTRACT_DRIFT.md` if there are diffs. +- Re-export aggregator `tests/api/index.ts`. + +## References + +- `references/tooling-decision.md` (openapi-typescript vs orval vs swagger-typescript-api) +- `references/zod-from-openapi.md` +- `references/contract-drift-policy.md` diff --git a/.claude/skills/api-client-from-openapi/assets/README.md b/.claude/skills/api-client-from-openapi/assets/README.md new file mode 100644 index 0000000..433cdeb --- /dev/null +++ b/.claude/skills/api-client-from-openapi/assets/README.md @@ -0,0 +1 @@ +Reserved for binary assets (templates, icons, snapshots) used by this skill. 
diff --git a/.claude/skills/api-client-from-openapi/references/contract-drift-policy.md b/.claude/skills/api-client-from-openapi/references/contract-drift-policy.md new file mode 100644 index 0000000..77be9e9 --- /dev/null +++ b/.claude/skills/api-client-from-openapi/references/contract-drift-policy.md @@ -0,0 +1,46 @@ +# Contract drift policy + +## What counts as drift + +| Change | Severity | Action | +| ----------------------------------------------- | ---------------------- | ---------------------------------- | +| New optional field added | non-breaking | log only | +| New required field added | breaking | block PR; update factories + tests | +| Field renamed | breaking | block PR; coordinate with backend | +| Field removed | breaking | block PR | +| Type widened (e.g. `string` → `string \| null`) | non-breaking but risky | warn | +| Type narrowed | breaking | block PR | +| Endpoint removed | breaking | block PR | +| New endpoint | non-breaking | log; consider new tests | +| Status code changed | breaking | block PR | +| Auth scheme changed | breaking | block PR | + +## Process + +1. On every spec change, `scripts/contract-diff.ts` runs. +2. Diff is rendered as `CONTRACT_DRIFT.md` and committed alongside the PR. +3. `contract-drift-watch` subagent classifies each diff and proposes test patches for breaking changes. +4. PR cannot be merged with an unresolved breaking-change marker. + +## Snapshot baseline + +The "previous" baseline is `tests/api/generated/.snapshot/schema.previous.d.ts`, the path read by `scripts/contract-diff.ts`. It is updated by an explicit human-confirmed step (`/spec-sync --commit-baseline`). + +## Reporting layout + +```markdown +# Contract drift report (vs baseline @ ) + +## Breaking changes (must address) + +- POST /orders: response schema renamed `customer_id` → `customerId`. +- DELETE /orders/{id}: status `204` → `200` with body. + +## Non-breaking + +- GET /users: new optional field `nickname`. + +## Test impact + +- tests/specs/api/orders/\*.spec.ts (3 files): need camelCase update. 
+``` diff --git a/.claude/skills/api-client-from-openapi/references/tooling-decision.md b/.claude/skills/api-client-from-openapi/references/tooling-decision.md new file mode 100644 index 0000000..ff3f737 --- /dev/null +++ b/.claude/skills/api-client-from-openapi/references/tooling-decision.md @@ -0,0 +1,39 @@ +# OpenAPI tooling decision + +## TL;DR + +For TypeScript+Playwright **test** clients we default to `openapi-typescript` + `openapi-fetch` + `zod`. The combo is the lightest, type-only at compile time, and lets us bolt on runtime validation per-endpoint. + +## Comparison + +| Tool | Output | Bundle size | Strengths | Weaknesses | +| -------------------------------------- | --------------------------------------------- | -------------- | ------------------------------------------------------- | --------------------------------------------------- | +| `openapi-typescript` + `openapi-fetch` | `.d.ts` types + tiny fetch wrapper | ≈ 6 KB runtime | Treeshakable, idiomatic, no boilerplate, fastest builds | No runtime validation built-in; we add `zod` | +| `orval` | Full TS clients (axios/fetch), Zod, MSW mocks | Larger | One-stop: types + clients + mocks + validators | Heavier config, slower codegen, harder tree-shaking | +| `swagger-typescript-api` | Class-style clients | Medium | Good for legacy Swagger 2.x | Less active, opinionated class-style API | + +## When to choose orval over the default + +- You need MSW mocks generated alongside types (UI dev mode). +- You want runtime Zod validators auto-generated (no manual step). +- The team is already on orval in the production codebase and unification matters. + +## When to choose `openapi-typescript` + +- Test-only consumer; minimal blast radius. +- You want explicit control over which endpoints produce zod validators (avoid mass schema generation). +- Build performance is a constraint (large mono-spec). 
+ +## Folder layout + +``` +tests/api/ +├── generated/ +│ ├── schema.d.ts # openapi-typescript output +│ └── zod/ # openapi-to-zod generated (per resource) +├── clients/ +│ ├── ordersClient.ts # typed wrapper using openapi-fetch +│ └── usersClient.ts +├── contract.ts # expectMatchesSchema() helper +└── index.ts # re-exports +``` diff --git a/.claude/skills/api-client-from-openapi/references/zod-from-openapi.md b/.claude/skills/api-client-from-openapi/references/zod-from-openapi.md new file mode 100644 index 0000000..6789dfb --- /dev/null +++ b/.claude/skills/api-client-from-openapi/references/zod-from-openapi.md @@ -0,0 +1,48 @@ +# Zod schemas from OpenAPI + +## Rationale + +`openapi-typescript` gives us **compile-time** types. For real contract testing we also need **runtime** validators that fail loudly when the live API drifts from the spec. Zod is the de-facto choice in TS test code: composable, narrowable, and integrates with `expect`. + +## Generation strategy + +1. Walk JSON-Schema components in `specs/openapi.yaml` (`components.schemas.*`). +2. For each schema, emit a Zod definition into `tests/api/generated/zod/.ts`. +3. Re-export from `tests/api/generated/zod/index.ts`. + +We do not generate validators for every operation — only for response shapes referenced by `components.schemas`. Inline schemas inside `responses` are flagged as a spec smell. 
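The generation strategy above can be sketched as a small emitter that turns one `components.schemas` entry into Zod source text. This is only an illustration under stated assumptions, not the kit's actual `scripts/openapi-to-zod.ts`: the `JsonSchema` type and the `emitZod` / `emitModule` functions are hypothetical names, only a handful of JSON-Schema keywords are handled, and imports for `$ref` targets are left out.

```typescript
// Hypothetical sketch: emit Zod source text for one components.schemas entry.
// Covers string/number/integer/boolean/array/object and $ref only.
type JsonSchema = {
  type?: 'string' | 'number' | 'integer' | 'boolean' | 'array' | 'object';
  properties?: Record<string, JsonSchema>;
  required?: string[];
  items?: JsonSchema;
  $ref?: string;
};

function emitZod(schema: JsonSchema): string {
  if (schema.$ref) {
    // '#/components/schemas/Order' becomes a reference to 'OrderSchema'.
    return `${schema.$ref.split('/').pop()}Schema`;
  }
  switch (schema.type) {
    case 'string':
      return 'z.string()';
    case 'number':
      return 'z.number()';
    case 'integer':
      return 'z.number().int()';
    case 'boolean':
      return 'z.boolean()';
    case 'array':
      return `z.array(${emitZod(schema.items ?? {})})`;
    case 'object': {
      const required = new Set(schema.required ?? []);
      const fields = Object.entries(schema.properties ?? {}).map(
        ([key, value]) =>
          `${JSON.stringify(key)}: ${emitZod(value)}${required.has(key) ? '' : '.optional()'}`,
      );
      return `z.object({ ${fields.join(', ')} })`;
    }
    default:
      // Unknown or missing type: fall back to the loosest validator.
      return 'z.unknown()';
  }
}

// Wraps the expression in a module body (imports for $ref targets omitted).
function emitModule(name: string, schema: JsonSchema): string {
  return `import { z } from 'zod';\n\nexport const ${name}Schema = ${emitZod(schema)};\n`;
}
```

Emitting source text rather than building validators at runtime keeps the generated files reviewable in a PR, which fits the drift-report workflow; a production generator would also need to handle `enum`, `nullable`, `oneOf` and similar keywords.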
+ +## Helper + +```ts +// tests/api/contract.ts +import { expect } from '@playwright/test'; +import type { ZodSchema } from 'zod'; + +export function expectMatchesSchema<T>(value: unknown, schema: ZodSchema<T>): asserts value is T { + const result = schema.safeParse(value); + if (!result.success) { + expect.soft(result.success, `Schema mismatch:\n${JSON.stringify(result.error.format(), null, 2)}`).toBe(true); + throw result.error; + } +} +``` + +## Usage in API spec + +```ts +import { OrderSchema } from '@api/generated/zod'; +import { expectMatchesSchema } from '@api/contract'; + +test('GET /orders/:id matches schema', async ({ ordersClient }) => { + const { data, response } = await ordersClient.getById('order-1'); + expect(response.status).toBe(200); + expectMatchesSchema(data, OrderSchema); +}); +``` + +## Tools + +- `openapi-zod-client` — most popular community generator. +- `ts-to-zod` — for cases where types are already TS first. +- Hand-written for hot paths — sometimes faster and more readable. diff --git a/.claude/skills/api-client-from-openapi/scripts/contract-diff.ts b/.claude/skills/api-client-from-openapi/scripts/contract-diff.ts new file mode 100755 index 0000000..4e1c6ab --- /dev/null +++ b/.claude/skills/api-client-from-openapi/scripts/contract-diff.ts @@ -0,0 +1,56 @@ +#!/usr/bin/env -S npx tsx +/** + * contract-diff.ts — diffs the current OpenAPI-generated schema.d.ts against + * the committed baseline and writes CONTRACT_DRIFT.md. + * Exits 1 when breaking changes are detected and not annotated. 
+ * + * Usage: tsx contract-diff.ts + */ +import { execSync } from 'node:child_process'; +import { readFileSync, writeFileSync, existsSync } from 'node:fs'; +import { join } from 'node:path'; + +const root = process.cwd(); +const cfg = JSON.parse(readFileSync(join(root, 'tests-config.json'), 'utf8')); +const generated = join(root, cfg.openapi.generated, 'schema.d.ts'); +const snapshot = join(root, cfg.openapi.generated, '.snapshot', 'schema.previous.d.ts'); + +if (!existsSync(generated)) { + console.error(`FAIL: ${generated} missing. Run gen-openapi-fetch.sh first.`); + process.exit(2); +} +if (!existsSync(snapshot)) { + console.log('NOTE: no baseline snapshot yet. Treating current as baseline.'); + writeFileSync(join(root, 'CONTRACT_DRIFT.md'), '# Contract drift report\n\nNo baseline yet.\n'); + process.exit(0); +} + +const diff = execSync(`diff -u "${snapshot}" "${generated}" || true`).toString(); + +const breakingMarkers = [/^-(?!--)\s*[A-Za-z_$][\w$]*\??:/gm]; // removed or changed property lines, counted once; skips the '---' file header +let breaking = 0; +for (const re of breakingMarkers) { + const m = diff.match(re); + if (m) breaking += m.length; +} + +const md = [ + '# Contract drift report', + '', + `Generated: ${new Date().toISOString()}`, + '', + '## Raw diff', + '```diff', + diff || '(no changes)', + '```', + '', + `Heuristic breaking signals: **${breaking}**`, +].join('\n'); + +writeFileSync(join(root, 'CONTRACT_DRIFT.md'), md); + +if (breaking > 0) { + console.error(`Detected ${breaking} potential breaking change(s). See CONTRACT_DRIFT.md`); + process.exit(1); +} +console.log('OK: no breaking signals.'); diff --git a/.claude/skills/api-client-from-openapi/scripts/gen-openapi-fetch.sh b/.claude/skills/api-client-from-openapi/scripts/gen-openapi-fetch.sh new file mode 100755 index 0000000..0c20360 --- /dev/null +++ b/.claude/skills/api-client-from-openapi/scripts/gen-openapi-fetch.sh @@ -0,0 +1,29 @@ +#!/usr/bin/env bash +# +# gen-openapi-fetch.sh — regenerate types via openapi-typescript. 
+# Reads tests-config.json for source/target paths. +# +set -euo pipefail + +ROOT="${1:-$(pwd)}" +CFG="$ROOT/tests-config.json" + +if ! command -v jq >/dev/null 2>&1; then + echo "FAIL: jq is required." + exit 2 +fi + +SRC=$(jq -r '.openapi.source' "$CFG") +DST_DIR=$(jq -r '.openapi.generated' "$CFG") +mkdir -p "$ROOT/$DST_DIR" + +OUT="$ROOT/$DST_DIR/schema.d.ts" + +# Snapshot before regeneration for diffing +if [[ -f "$OUT" ]]; then + mkdir -p "$ROOT/$DST_DIR/.snapshot" + cp "$OUT" "$ROOT/$DST_DIR/.snapshot/schema.previous.d.ts" +fi + +npx openapi-typescript "$ROOT/$SRC" -o "$OUT" +echo "OK: types written to $OUT" diff --git a/.claude/skills/api-client-from-openapi/scripts/validate-spec.sh b/.claude/skills/api-client-from-openapi/scripts/validate-spec.sh new file mode 100755 index 0000000..edcfe2f --- /dev/null +++ b/.claude/skills/api-client-from-openapi/scripts/validate-spec.sh @@ -0,0 +1,15 @@ +#!/usr/bin/env bash +# +# validate-spec.sh — lint OpenAPI spec via redocly. +# +set -euo pipefail +ROOT="${1:-$(pwd)}" +CFG="$ROOT/tests-config.json" +SRC=$(jq -r '.openapi.source' "$CFG") + +if ! command -v npx >/dev/null 2>&1; then + echo "FAIL: npx required." + exit 2 +fi + +npx --yes @redocly/cli@latest lint "$ROOT/$SRC" diff --git a/.claude/skills/config-and-secrets/SKILL.md b/.claude/skills/config-and-secrets/SKILL.md new file mode 100644 index 0000000..fffdccd --- /dev/null +++ b/.claude/skills/config-and-secrets/SKILL.md @@ -0,0 +1,31 @@ +--- +name: config-and-secrets +description: Manages environment-specific Playwright configuration, secrets, and test data isolation. Wires dotenv-flow / @playwright/test projects per env, enforces no-secrets-in-repo, and validates baseURL injection. Use when user mentions "env config", "staging vs prod", "secrets", "process.env", or when a baseURL/secret is hardcoded in a test or page object. Do NOT use to scaffold the whole framework; that is playwright-framework-bootstrap. 
+allowed-tools: Read, Write, Edit, Glob, Grep, Bash +--- + +# Config & Secrets + +## Trigger + +- Hardcoded URL/credentials detected by `scripts/scan-hardcoded.sh`. +- User mentions: env, dotenv, secrets, staging, prod, .env. + +## Workflow + +1. Establish `.env.example` with documented keys, NEVER `.env` itself in repo (verify in `.gitignore`). +2. Use `dotenv-flow` or `dotenv` + `process.env` strict accessor (`tests/infra/env.ts` with `zod` schema). +3. Per environment: define a Playwright `project` with overridden `use.baseURL`. +4. Secrets: integrate with the host secret store (1Password/AWS SM/GH Actions secrets) via `scripts/load-secrets.sh`. +5. Forbid `console.log(process.env...)` via ESLint custom rule. + +## Validators + +- `scripts/scan-hardcoded.sh`: greps for `https?://`, `password|token|secret|apiKey` literals in `tests/`. +- Returns non-zero with file:line list. + +## References + +- `references/env.template.ts` (zod-validated env loader) +- `references/multi-env-projects.md` +- `references/secret-stores.md` diff --git a/.claude/skills/config-and-secrets/assets/README.md b/.claude/skills/config-and-secrets/assets/README.md new file mode 100644 index 0000000..433cdeb --- /dev/null +++ b/.claude/skills/config-and-secrets/assets/README.md @@ -0,0 +1 @@ +Reserved for binary assets (templates, icons, snapshots) used by this skill. diff --git a/.claude/skills/config-and-secrets/references/env.template.ts b/.claude/skills/config-and-secrets/references/env.template.ts new file mode 100644 index 0000000..d7c108b --- /dev/null +++ b/.claude/skills/config-and-secrets/references/env.template.ts @@ -0,0 +1,27 @@ +/** + * tests/infra/env.ts — strict, validated env access. + * Import as `import { env } from '@infra/env';` everywhere instead of `process.env.X`. 
+ */ +import 'dotenv/config'; +import { z } from 'zod'; + +const schema = z.object({ + NODE_ENV: z.enum(['development', 'test', 'staging', 'production']).default('test'), + BASE_URL: z.string().url(), + API_BASE_URL: z.string().url(), + TEST_USER_EMAIL: z.string().email(), + TEST_USER_PASSWORD: z.string().min(1), + TEST_ADMIN_EMAIL: z.string().email().optional(), + TEST_ADMIN_PASSWORD: z.string().min(1).optional(), + SEED: z.coerce.number().int().nonnegative().optional(), + CI: z.string().optional(), +}); + +const parsed = schema.safeParse(process.env); +if (!parsed.success) { + // Hard fail at import — better than running 1000 broken tests + console.error('Invalid environment:', parsed.error.format()); + process.exit(2); +} + +export const env = parsed.data; diff --git a/.claude/skills/config-and-secrets/references/multi-env-projects.md b/.claude/skills/config-and-secrets/references/multi-env-projects.md new file mode 100644 index 0000000..d6d2d27 --- /dev/null +++ b/.claude/skills/config-and-secrets/references/multi-env-projects.md @@ -0,0 +1,53 @@ +# Multi-environment via Playwright projects + +## Pattern + +A single `playwright.config.ts` declares all environments. Run a specific one with `--project=staging-chromium`. 
+ +```ts +const baseUseFor = (env: 'local' | 'staging' | 'prod') => ({ + baseURL: { + local: 'http://localhost:3000', + staging: 'https://staging.example.com', + prod: 'https://example.com', + }[env], +}); + +export default defineConfig({ + projects: [ + { name: 'setup', testMatch: /.*\.setup\.ts/ }, + { + name: 'local-chromium', + use: { ...devices['Desktop Chrome'], ...baseUseFor('local') }, + dependencies: ['setup'], + }, + { + name: 'staging-chromium', + use: { ...devices['Desktop Chrome'], ...baseUseFor('staging') }, + dependencies: ['setup'], + }, + { + name: 'prod-smoke', + use: { ...devices['Desktop Chrome'], ...baseUseFor('prod') }, + grep: /@smoke/, + }, + ], +}); +``` + +## Secrets per env + +`.env` files layered by `dotenv-flow`: + +- `.env` — committed defaults (no secrets) +- `.env.local` — local-only (gitignored) +- `.env.staging` — pulled via `load-secrets.sh` +- `.env.production` — pulled via `load-secrets.sh`, never written to disk in CI; injected directly + +## Tagging strategy + +- `@smoke` — runs against every env, including prod +- `@regression`— runs against staging +- `@dangerous` — never runs against prod (a CI gate enforces this) + +Prefer `grep:`/`grepInvert:` in projects over `test.skip()` to avoid 1000 skipped lines in reports. diff --git a/.claude/skills/config-and-secrets/references/secret-stores.md b/.claude/skills/config-and-secrets/references/secret-stores.md new file mode 100644 index 0000000..4f5b0b1 --- /dev/null +++ b/.claude/skills/config-and-secrets/references/secret-stores.md @@ -0,0 +1,47 @@ +# Secret stores + +Never commit secrets. Pull them at runtime from a real secret store. 
+ +## GitHub Actions + +```yaml +- name: Run tests + env: + BASE_URL: ${{ vars.BASE_URL }} + TEST_USER_EMAIL: ${{ secrets.TEST_USER_EMAIL }} + TEST_USER_PASSWORD: ${{ secrets.TEST_USER_PASSWORD }} + run: npx playwright test +``` + +## 1Password CLI (local) + +```bash +# scripts/load-secrets.sh +op signin +op run --env-file=.env.staging.template -- npx playwright test --project=staging-chromium +``` + +`.env.staging.template`: + +``` +BASE_URL=https://staging.example.com +TEST_USER_EMAIL=op://Engineering/test-user/email +TEST_USER_PASSWORD=op://Engineering/test-user/password +``` + +## AWS Secrets Manager / Vault + +```ts +// tests/infra/load-secrets.ts (worker setup project) +import { SecretsManagerClient, GetSecretValueCommand } from '@aws-sdk/client-secrets-manager'; +const client = new SecretsManagerClient({}); +const { SecretString } = await client.send(new GetSecretValueCommand({ SecretId: 'qa/test-user' })); +process.env.TEST_USER_PASSWORD = JSON.parse(SecretString!).password; +``` + +## Hard rules + +- `.env*` (except `.env.example` and `.env.*.template`) → `.gitignore`. +- A pre-commit hook scans staged files for `password|secret|token|apiKey|AKIA[0-9A-Z]{16}`. +- CI logs are scrubbed (`echo "::add-mask::$SECRET"` in GitHub Actions). +- No secret enters Claude Code context. Reference by env var name only. diff --git a/.claude/skills/config-and-secrets/scripts/scan-hardcoded.sh b/.claude/skills/config-and-secrets/scripts/scan-hardcoded.sh new file mode 100755 index 0000000..f4f3646 --- /dev/null +++ b/.claude/skills/config-and-secrets/scripts/scan-hardcoded.sh @@ -0,0 +1,35 @@ +#!/usr/bin/env bash +# +# scan-hardcoded.sh — fail when hardcoded URLs / credentials slip into tests/. 
+# +set -euo pipefail +ROOT="${1:-$(pwd)}" +TARGET_DIRS=("tests") + +violations=0 + +for dir in "${TARGET_DIRS[@]}"; do + [[ -d "$ROOT/$dir" ]] || continue + + # Hardcoded https/http literals (allow localhost & .env.example references) + if grep -RInE "https?://[^'\"\` ]+" "$ROOT/$dir" \ + --include='*.ts' --include='*.js' \ + --exclude-dir=generated --exclude-dir=node_modules \ + | grep -vE "(localhost|127\.0\.0\.1|baseURL|@example\.com|process\.env|env\.[A-Z_]+|//\s*example|tests/infra/env\.ts)"; then + echo "FAIL: hardcoded URL(s) above. Move to env config." + violations=$((violations + 1)) + fi + + # Hardcoded credentials + if grep -RInE "(password|token|secret|apiKey|api_key)\s*[:=]\s*['\"][^'\"]{6,}" "$ROOT/$dir" \ + --include='*.ts' --include='*.js' \ + --exclude-dir=generated --exclude-dir=node_modules; then + echo "FAIL: hardcoded credential(s) above." + violations=$((violations + 1)) + fi +done + +if [[ $violations -gt 0 ]]; then + exit 1 +fi +echo "OK: no hardcoded secrets/URLs." diff --git a/.claude/skills/coverage-gap-analyzer/SKILL.md b/.claude/skills/coverage-gap-analyzer/SKILL.md new file mode 100644 index 0000000..8a80a24 --- /dev/null +++ b/.claude/skills/coverage-gap-analyzer/SKILL.md @@ -0,0 +1,30 @@ +--- +name: coverage-gap-analyzer +description: Cross-references OpenAPI endpoints + UI pages + Gherkin scenarios with executed tests to surface untested paths and uncovered acceptance criteria. Use when user asks "coverage gaps", "what's not tested", "which endpoints have no tests", before a release sign-off. Do NOT use for line/branch coverage of production code (that's a different tool). +allowed-tools: Read, Glob, Grep, Bash +--- + +# Coverage Gap Analyzer + +## Workflow + +1. Build inventory: + - Endpoints: parse `specs/openapi.yaml` for `paths × methods`. + - Pages: glob `tests-config.json.layout.pages/**/*Page.ts`. + - AC: parse `docs/test-design/*.md` for "AC-x" tags. +2. 
Build executed map: parse last `playwright-report/results.json` + spec annotations (`test('... @endpoint:GET /orders @ac:AC-3')`). +3. Diff inventory vs executed. +4. Output `COVERAGE_GAPS.md` with prioritised list (risk × frequency). + +## Tagging convention (must follow) + +```ts +test('cancel pending order @ac:AC-12 @endpoint:DELETE /orders/{id}', async ({ ... }) => { ... }); +``` + +The skill grep-extracts `@ac:`, `@endpoint:`, `@page:` tags and matches them to the inventory. Untagged tests do not contribute coverage. + +## References + +- `references/test-tagging-convention.md` +- `references/coverage-template.md` diff --git a/.claude/skills/coverage-gap-analyzer/assets/README.md b/.claude/skills/coverage-gap-analyzer/assets/README.md new file mode 100644 index 0000000..433cdeb --- /dev/null +++ b/.claude/skills/coverage-gap-analyzer/assets/README.md @@ -0,0 +1 @@ +Reserved for binary assets (templates, icons, snapshots) used by this skill. diff --git a/.claude/skills/coverage-gap-analyzer/references/coverage-template.md b/.claude/skills/coverage-gap-analyzer/references/coverage-template.md new file mode 100644 index 0000000..b3a2d5c --- /dev/null +++ b/.claude/skills/coverage-gap-analyzer/references/coverage-template.md @@ -0,0 +1,33 @@ +# Coverage gaps — + +> Generated by `coverage-gap-analyzer`. + +## Summary + +| Inventory | Total | Covered | Missing | % | +| ------------------- | ----: | ------: | ------: | --: | +| OpenAPI endpoints | 0 | 0 | 0 | — | +| Acceptance criteria | 0 | 0 | 0 | — | +| Page objects | 0 | 0 | 0 | — | + +## Critical gaps (P0) + +| Endpoint / AC / Page | Risk note | +| -------------------- | --------- | + +## Major gaps (P1) + +| ... | + +## Tag hygiene issues + +| File:Line | Issue | +| -------------------------- | --------------------------------------------- | +| tests/specs/foo.spec.ts:12 | Test missing @feature tag | +| tests/specs/bar.spec.ts:30 | @endpoint:DELETE /unknown does not match spec | + +## Recommendation + +1. 
Address P0 gaps before release sign-off. +2. Schedule P1 gaps in next sprint. +3. Fix tag hygiene before coverage numbers can be trusted. diff --git a/.claude/skills/coverage-gap-analyzer/references/test-tagging-convention.md b/.claude/skills/coverage-gap-analyzer/references/test-tagging-convention.md new file mode 100644 index 0000000..9a565cc --- /dev/null +++ b/.claude/skills/coverage-gap-analyzer/references/test-tagging-convention.md @@ -0,0 +1,51 @@ +# Test tagging convention + +Tags are how the coverage analyser knows what each test covers. Without tags, a test contributes nothing to coverage metrics. + +## Required tags + +| Tag | Format | Required when | +| --------------------------- | ------------------------------- | ----------------------------------------- | +| `@ac:` | `@ac:AC-12` | The test verifies an acceptance criterion | +| `@endpoint: ` | `@endpoint:DELETE /orders/{id}` | API spec | +| `@page:` | `@page:CheckoutPage` | UI spec | +| `@feature:` | `@feature:auth` | All specs (used for grouping) | +| `@owner:` | `@owner:qa-platform` | All specs | + +## Optional tags + +- `@smoke` — runs against every env including prod. +- `@regression` — runs in nightly suite. +- `@external` — depends on an external service; CI may skip when unreachable. +- `@quarantine` — unstable; allowed for ≤ 2 sprints. +- `@dangerous` — never runs against prod. +- `@fuzz` — uses randomized input. + +## How to tag + +In Playwright, embed in the test title (matched by the `grep` and parsers): + +```ts +test('cancel pending order @ac:AC-12 @endpoint:DELETE /orders/{id} @feature:orders @owner:qa-platform', async ({ ... }) => { + /* ... */ +}); +``` + +For `describe` blocks, place tags at the describe level — they apply to all child tests: + +```ts +test.describe('Order cancellation @feature:orders @owner:qa-platform', () => { + /* ... */ +}); +``` + +## Validation + +`scripts/validate-tags.ts` (in this skill) ensures: + +- Every test has at least `@feature` and `@owner`. 
+- `@endpoint:` tags reference real endpoints from the OpenAPI spec. +- `@ac:` tags reference IDs that exist in `docs/test-design/*.md`. +- `@page:` tags reference real page-object class names. + +Mismatches show up in the gap report. diff --git a/.claude/skills/coverage-gap-analyzer/scripts/analyze-coverage.ts b/.claude/skills/coverage-gap-analyzer/scripts/analyze-coverage.ts new file mode 100755 index 0000000..f1cdd86 --- /dev/null +++ b/.claude/skills/coverage-gap-analyzer/scripts/analyze-coverage.ts @@ -0,0 +1,104 @@ +#!/usr/bin/env -S npx tsx +/** + * analyze-coverage.ts — produce COVERAGE_GAPS.md. + * + * Strategy: parse OpenAPI for endpoints, parse test files for tags, + * intersect, emit gaps. AC tag matching is best-effort string match. + */ +import { readFileSync, readdirSync, statSync, existsSync, writeFileSync } from 'node:fs'; +import { join } from 'node:path'; + +const root = process.cwd(); +const cfg = JSON.parse(readFileSync(join(root, 'tests-config.json'), 'utf8')); +const specsDir = join(root, cfg.layout.specs); +const pagesDir = join(root, cfg.layout.pages); +const openapiPath = join(root, cfg.openapi.source); + +// --- Inventories --- + +function collect(dir: string, suffix = '.ts', acc: string[] = []): string[] { + if (!existsSync(dir)) return acc; + for (const e of readdirSync(dir)) { + const p = join(dir, e); + if (statSync(p).isDirectory()) collect(p, suffix, acc); + else if (p.endsWith(suffix)) acc.push(p); + } + return acc; +} + +// Endpoints from OpenAPI (very simple YAML scan, sufficient for a coverage gap heuristic) +function readEndpoints(): string[] { + if (!existsSync(openapiPath)) return []; + const raw = readFileSync(openapiPath, 'utf8'); + const endpoints: string[] = []; + const lines = raw.split('\n'); + let currentPath = ''; + for (const line of lines) { + const pm = /^\s{2}(\/[^:]+):\s*$/.exec(line); + if (pm) currentPath = pm[1].trim(); + const mm = /^\s{4}(get|post|put|patch|delete|head|options):\s*$/i.exec(line); + if (mm && 
currentPath) endpoints.push(`${mm[1].toUpperCase()} ${currentPath}`); + } + return endpoints; +} + +// Page objects +function readPages(): string[] { + return collect(pagesDir).map((p) => p.split('/').pop()!.replace(/\.ts$/, '')); +} + +// Tags in tests +const testTagRe = /@(\w+):([^\s'"]+)/g; +function readCoveredTags(): { endpoints: Set<string>; acs: Set<string>; pages: Set<string> } { + const endpoints = new Set<string>(); + const acs = new Set<string>(); + const pages = new Set<string>(); + for (const file of collect(specsDir, '.spec.ts')) { + const src = readFileSync(file, 'utf8'); + for (const m of src.matchAll(testTagRe)) { + const [, k, v] = m; + // '@endpoint:' is handled below; this pattern would capture only the HTTP method + if (k === 'ac') acs.add(v); + else if (k === 'page') pages.add(v); + } + // catch endpoint tags formatted as "@endpoint:METHOD /path" + for (const m of src.matchAll(/@endpoint:([A-Z]+)\s+([^\s'"]+)/g)) { + endpoints.add(`${m[1]} ${m[2]}`); + } + } + return { endpoints, acs, pages }; +} + +const inventoryEndpoints = readEndpoints(); +const inventoryPages = readPages(); +const covered = readCoveredTags(); + +const missingEndpoints = inventoryEndpoints.filter((e) => !covered.endpoints.has(e)); +const missingPages = inventoryPages.filter((p) => !covered.pages.has(p)); + +const md: string[] = []; +md.push(`# Coverage gaps — ${new Date().toISOString().slice(0, 10)}`); +md.push(''); +md.push('## Endpoints'); +md.push( + `Inventory: ${inventoryEndpoints.length} | Covered: ${covered.endpoints.size} | Missing: ${missingEndpoints.length}`, +); +md.push(''); +if (missingEndpoints.length) { + md.push('### Missing endpoints'); + for (const e of missingEndpoints) md.push(`- [ ] ${e}`); + md.push(''); +} + +md.push('## Pages'); +md.push( + `Inventory: ${inventoryPages.length} | Covered: ${covered.pages.size} | Missing: ${missingPages.length}`, +); +md.push(''); +if (missingPages.length) { + md.push('### Page objects without any test reference'); + for (const p of missingPages) md.push(`- [ ] ${p}`); +} + +writeFileSync(join(root, 
'COVERAGE_GAPS.md'), md.join('\n')); +console.log('OK: COVERAGE_GAPS.md written.'); diff --git a/.claude/skills/fixture-architect/SKILL.md b/.claude/skills/fixture-architect/SKILL.md new file mode 100644 index 0000000..03bf264 --- /dev/null +++ b/.claude/skills/fixture-architect/SKILL.md @@ -0,0 +1,46 @@ +--- +name: fixture-architect +description: Designs Playwright fixtures with proper scoping (test vs worker), composition, and dependency injection. Wires page objects, API clients, factories, authenticated storage state, and per-test cleanup. Use when user adds a new fixture, mentions "test.extend", "worker scope", "auth state", "shared context", "DI for tests", or when a spec instantiates page objects manually inside the test body. Do NOT use to write business steps; defer to playwright-test-author-ui. +allowed-tools: Read, Write, Edit, Glob, Grep, Bash +--- + +# Fixture Architect + +## Trigger + +- Spec contains `new SomePage(page)` inline. +- User: "add fixture", "worker fixture", "shared login", "extend test". + +## Decision tree + +| Need | Scope | Pattern | +| ---------------------------------------------- | -------- | ----------------------------------------------------- | +| Page object (per test) | `test` | `({ page }, use) => use(new XPage(page))` | +| API client | `test` | uses request-context fixture | +| Logged-in user (reused across tests in worker) | `worker` | `storageState` cached file | +| Test data lifecycle (create+delete) | `test` | factory + afterEach cleanup in fixture | +| Network mocks (route handlers) | `test` | `await use()` between `page.route` and `page.unroute` | + +## Workflow + +1. Read `tests-config.json.layout.fixtures`. +2. Compose into `tests/fixtures/index.ts` via `base.extend`. NEVER overwrite `page` without justification. +3. For worker-scope auth: use `references/auth.setup.template.ts` pattern (`storageState` saved by setup project, consumed by other projects). +4. 
Validate with `scripts/fixture-rules.ts`: + - No fixture without `await use(...)`. + - No fixture mutating global state outside its lifecycle. + - Fixture name does not collide with built-ins (`page`, `request`, `browser`, `context`). + - `worker`-scope fixtures must not depend on `test`-scope ones. + +## Anti-patterns prevented + +- Leaky fixtures (cleanup missing → next test inherits state). +- Implicit ordering between fixtures (rely on dependency graph, not file order). +- Mega-fixture file > 300 lines (skill splits per concern). + +## References + +- `references/scope-decision.md` +- `references/auth.setup.template.ts` +- `references/composing-fixtures.md` +- `references/anti-leaks-checklist.md` diff --git a/.claude/skills/fixture-architect/assets/README.md b/.claude/skills/fixture-architect/assets/README.md new file mode 100644 index 0000000..433cdeb --- /dev/null +++ b/.claude/skills/fixture-architect/assets/README.md @@ -0,0 +1 @@ +Reserved for binary assets (templates, icons, snapshots) used by this skill. diff --git a/.claude/skills/fixture-architect/references/anti-leaks-checklist.md b/.claude/skills/fixture-architect/references/anti-leaks-checklist.md new file mode 100644 index 0000000..2c0c928 --- /dev/null +++ b/.claude/skills/fixture-architect/references/anti-leaks-checklist.md @@ -0,0 +1,36 @@ +# Fixture leak checklist + +A "leak" = state created in one test surfaces in the next. Symptoms: order-dependent failures, ghost data, "passes alone, fails in suite". + +## Hard checks + +1. Every fixture body has exactly one `await use(...)` call. +2. Resources created before `use` are torn down after `use` (try/finally if necessary). +3. No fixture writes to a file path shared across tests without a per-test suffix. +4. No fixture mutates `process.env` without restoring. +5. No fixture mutates the imported factory (e.g. assigns `userFactory.someProp = ...`). +6. Worker-scope fixtures avoid mutable maps/sets that grow across tests. +7. 
`page.route` handlers are removed via `page.unroute` or live in a `test`-scope fixture that auto-cleans on teardown. + +## Soft checks (warn) + +- Fixture name shadows a built-in (`page`, `request`, `browser`, `context`, `browserName`). +- Two fixtures depend on each other in a cycle. +- A `test`-scope fixture is consumed by more than 5 other fixtures (refactor candidate). + +## Diagnosis when a leak is suspected + +```bash +npx playwright test --workers=1 # serial run: rules out cross-worker interference +npx playwright test --workers=1 --repeat-each=3 # repeat each test to expose state carried between runs +``` + +If the suite passes with `--workers=1` but fails in parallel, suspect concurrent state mutation. +If a test passes alone but fails inside a `--workers=1` run, suspect an order-dependent leak. + +## Common offenders + +- Login that writes a single `auth.json` shared across users. +- Test data factories that return the same `id` because the sequence resets. +- `beforeAll` in spec files that mutates a module-level singleton. +- Mock servers started in `worker` scope, configured in `test` scope, never reset. diff --git a/.claude/skills/fixture-architect/references/auth.setup.template.ts b/.claude/skills/fixture-architect/references/auth.setup.template.ts new file mode 100644 index 0000000..f2bc021 --- /dev/null +++ b/.claude/skills/fixture-architect/references/auth.setup.template.ts @@ -0,0 +1,32 @@ +/** + * tests/fixtures/auth.setup.ts — runs once per project (referenced by `dependencies: ['setup']`) + * to populate tests/.auth/.json. Other tests load this file via storageState in playwright.config.ts. + * + * Why this pattern: + * - One UI/API login per run, not per test. Often a large speed-up for E2E suites. + * - storageState is a pure JSON snapshot — safe to share read-only across workers. + * - Failures in setup mark the whole project as broken instead of producing 1000 cryptic test fails. 
+ */ +import { test as setup, expect } from '@playwright/test'; +import path from 'node:path'; + +const userAuthFile = path.join(__dirname, '../.auth/user.json'); +const adminAuthFile = path.join(__dirname, '../.auth/admin.json'); + +setup('authenticate as user', async ({ page }) => { + await page.goto('/login'); + await page.getByLabel('Email').fill(process.env.TEST_USER_EMAIL!); + await page.getByLabel('Password').fill(process.env.TEST_USER_PASSWORD!); + await page.getByRole('button', { name: 'Sign in' }).click(); + await expect(page.getByRole('heading', { name: /welcome/i })).toBeVisible(); + await page.context().storageState({ path: userAuthFile }); +}); + +setup('authenticate as admin', async ({ page }) => { + await page.goto('/login'); + await page.getByLabel('Email').fill(process.env.TEST_ADMIN_EMAIL!); + await page.getByLabel('Password').fill(process.env.TEST_ADMIN_PASSWORD!); + await page.getByRole('button', { name: 'Sign in' }).click(); + await expect(page.getByRole('navigation', { name: /admin/i })).toBeVisible(); + await page.context().storageState({ path: adminAuthFile }); +}); diff --git a/.claude/skills/fixture-architect/references/composing-fixtures.md b/.claude/skills/fixture-architect/references/composing-fixtures.md new file mode 100644 index 0000000..30b44a0 --- /dev/null +++ b/.claude/skills/fixture-architect/references/composing-fixtures.md @@ -0,0 +1,80 @@ +# Composing fixtures + +Compose using `base.extend()` and split per concern. Aggregate in `tests/fixtures/index.ts`. 
+
+## Pattern
+
+```ts
+// tests/fixtures/pages.ts
+import { test as base } from '@playwright/test';
+import { LoginPage, DashboardPage } from '@pages';
+
+type PageFixtures = {
+  loginPage: LoginPage;
+  dashboardPage: DashboardPage;
+};
+
+export const pagesTest = base.extend<PageFixtures>({
+  loginPage: async ({ page }, use) => {
+    await use(new LoginPage(page));
+  },
+  dashboardPage: async ({ page }, use) => {
+    await use(new DashboardPage(page));
+  },
+});
+```
+
+```ts
+// tests/fixtures/api.ts
+import { test as base } from '@playwright/test';
+import { OrdersClient } from '@api/clients/ordersClient';
+
+type ApiFixtures = {
+  ordersClient: OrdersClient;
+};
+
+export const apiTest = base.extend<ApiFixtures>({
+  ordersClient: async ({ request }, use) => {
+    await use(new OrdersClient(request));
+  },
+});
+```
+
+```ts
+// tests/fixtures/data.ts — manages lifecycle: create + cleanup
+import { apiTest } from './api'; // extend apiTest so `ordersClient` is a known fixture
+import { userFactory } from '@factories/user.factory';
+
+type DataFixtures = {
+  seededUser: { id: string; email: string };
+};
+
+export const dataTest = apiTest.extend<DataFixtures>({
+  seededUser: async ({ ordersClient: _ }, use) => {
+    const draft = userFactory.build();
+    // const user = await usersClient.create(draft);
+    const user = { id: draft.id, email: draft.email };
+    await use(user);
+    // teardown — guaranteed even on failure
+    // await usersClient.delete(user.id);
+  },
+});
+```
+
+```ts
+// tests/fixtures/index.ts
+import { mergeTests } from '@playwright/test';
+import { pagesTest } from './pages';
+import { apiTest } from './api';
+import { dataTest } from './data';
+
+export const test = mergeTests(pagesTest, apiTest, dataTest);
+export { expect } from '@playwright/test';
+```
+
+## Rules
+
+- Files in `fixtures/` ≤ 250 LOC.
+- One concern per file (pages / api / data / mocks / auth).
+- Aggregate via `mergeTests`, never manually re-extend.
+- A fixture body always uses `await use(value)` — even when teardown is empty.
diff --git a/.claude/skills/fixture-architect/references/scope-decision.md b/.claude/skills/fixture-architect/references/scope-decision.md
new file mode 100644
index 0000000..e2f103b
--- /dev/null
+++ b/.claude/skills/fixture-architect/references/scope-decision.md
@@ -0,0 +1,34 @@
+# Fixture scope decision
+
+Playwright supports two scopes:
+
+- `test` — default; fixture is created per test, cleaned up after.
+- `worker` — fixture is created once per worker process and reused across tests.
+
+## When to use worker scope
+
+Use it when **all** of the following hold:
+
+1. The fixture is read-only across tests (no mutation that could leak).
+2. Construction is expensive (auth flow, browser launch, large dataset preload).
+3. Tests using it do not rely on a fresh state (otherwise pin to `test`).
+
+## Common cases
+
+| Resource | Scope | Why |
+| ----------------------------------------------- | ----------------------------------- | -------------------------------------- |
+| Logged-in `storageState` for a stable test user | `worker` | Avoid logging in 1000 times; immutable |
+| API client with baseURL & auth token | `worker` | Stateless wrapper |
+| Page object | `test` | Bound to a `page` (which is per-test) |
+| User created via factory then deleted | `test` | Side effects must be cleaned up |
+| Mock server | `worker` (start), `test` (handlers) | Server up once; handlers per test |
+
+## Do NOT mix scopes incorrectly
+
+A `worker`-scope fixture cannot consume `test`-scope fixtures. Playwright will throw at runtime. The `fixture-rules.ts` validator catches this statically by checking the dependency graph.
+
+## Caveats
+
+- Worker-scope state lives as long as the worker process: it is not reset between tests in that worker, only when a new worker starts (for example for another project, or after a test failure).
+- If a worker-scope fixture mutates state during a test, later tests in the same worker inherit it. Tests within one worker run sequentially, so this shows up as order dependence rather than a race.
+- Prefer multiple small worker-scope fixtures over one fat one — Playwright reuses across the dependency graph anyway.
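
The scope rule above lends itself to a static check. What follows is a minimal sketch, assuming a hypothetical in-memory description of the fixture graph; `FixtureDef` and `findScopeViolations` are illustrative names and not the actual contents of `fixture-rules.ts`, which may work quite differently.

```typescript
// Hypothetical model of a fixture: its scope and the fixtures it consumes.
type Scope = 'test' | 'worker';

interface FixtureDef {
  scope: Scope;
  deps: string[];
}

// Collects every "worker -> test" dependency edge, the combination Playwright rejects.
function findScopeViolations(fixtures: Record<string, FixtureDef>): string[] {
  const violations: string[] = [];
  for (const [name, def] of Object.entries(fixtures)) {
    if (def.scope !== 'worker') continue;
    for (const dep of def.deps) {
      // Dependencies missing from the map are skipped in this sketch.
      if (fixtures[dep]?.scope === 'test') {
        violations.push(`${name} (worker) -> ${dep} (test)`);
      }
    }
  }
  return violations;
}

// Illustrative graph: `mockServer` wrongly depends on the per-test `page`.
const graph: Record<string, FixtureDef> = {
  page: { scope: 'test', deps: [] },
  authState: { scope: 'worker', deps: [] },
  mockServer: { scope: 'worker', deps: ['page'] },
  dashboardPage: { scope: 'test', deps: ['page', 'authState'] },
};

console.log(findScopeViolations(graph));
```

Only direct edges need checking here: a worker-scope fixture that reaches a test-scope one through another worker-scope fixture is reported at the inner edge.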
diff --git a/scripts/fixture-rules.ts b/.claude/skills/fixture-architect/scripts/fixture-rules.ts similarity index 100% rename from scripts/fixture-rules.ts rename to .claude/skills/fixture-architect/scripts/fixture-rules.ts diff --git a/.claude/skills/flaky-triage/SKILL.md b/.claude/skills/flaky-triage/SKILL.md new file mode 100644 index 0000000..5d7c353 --- /dev/null +++ b/.claude/skills/flaky-triage/SKILL.md @@ -0,0 +1,35 @@ +--- +name: flaky-triage +description: Detects, classifies, and proposes fixes for flaky Playwright tests by analysing test results JSON and traces across multiple runs. Distinguishes timing flakes, data flakes, environment flakes, and real bugs. Use when user reports "flaky", "intermittent failure", "this test passes locally", or after a CI report shows reruns. Do NOT use to write new tests. +allowed-tools: Read, Glob, Grep, Bash +--- + +# Flaky Triage + +## Inputs + +- `playwright-report/` (HTML + JSON). +- N×raw runs in `test-results/`. +- Optional: CI history exported via `scripts/export-ci-runs.sh`. + +## Workflow + +1. Aggregate pass/fail per test across runs (`scripts/flake-rate.ts`). +2. For each test with rate ∈ (0, 1): pull traces, diff failing vs passing step. +3. Classify: + - **Timing**: assertion fired before stable state → fix with `expect.poll` / web-first. + - **Data**: shared seed, sequence collision → fix with `factory.sequence` + per-test namespace. + - **Env**: external dep down/slow → quarantine + reproduce in isolation. + - **Locator**: dynamic ID/class → switch to role/test-id. + - **Real bug**: app race condition → file ticket, do NOT mark flaky. +4. Output `FLAKE_REPORT.md` + suggested patch. + +## Quarantine policy + +A flaky test may be tagged `@quarantine` for ≤ 2 sprints with an open issue link. After that, delete or fix. 
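
The two-sprint limit can be checked mechanically. This is a rough sketch, assuming a 14-day sprint and a hypothetical record of when each test was tagged; neither is specified by this skill, and `QuarantinedTest` and `expiredQuarantines` are illustrative names only.

```typescript
// Hypothetical record for a quarantined test; the issue id format mirrors the tag convention.
interface QuarantinedTest {
  title: string;
  issue: string;
  taggedAt: Date;
}

const SPRINT_DAYS = 14; // assumption: two-week sprints
const MAX_SPRINTS = 2;
const DAY_MS = 24 * 60 * 60 * 1000;

// Returns the tests whose quarantine has outlived the allowed window.
function expiredQuarantines(tests: QuarantinedTest[], now: Date): QuarantinedTest[] {
  const maxAgeMs = MAX_SPRINTS * SPRINT_DAYS * DAY_MS;
  return tests.filter((t) => now.getTime() - t.taggedAt.getTime() > maxAgeMs);
}

// Sample data for illustration only.
const sampleQuarantine: QuarantinedTest[] = [
  { title: 'checkout total', issue: 'PROJ-1234', taggedAt: new Date('2024-01-01T00:00:00Z') },
  { title: 'profile upload', issue: 'PROJ-1301', taggedAt: new Date('2024-01-20T00:00:00Z') },
];

const overdue = expiredQuarantines(sampleQuarantine, new Date('2024-02-01T00:00:00Z'));
console.log(overdue.map((t) => `${t.title} (${t.issue})`));
```

Keeping the sprint length in one constant makes it easy to express the limit in days if the team prefers that.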
+
+## References
+
+- `references/flake-taxonomy.md`
+- `references/quarantine-policy.md`
+- `references/repro-strategy.md`
diff --git a/.claude/skills/flaky-triage/assets/README.md b/.claude/skills/flaky-triage/assets/README.md
new file mode 100644
index 0000000..433cdeb
--- /dev/null
+++ b/.claude/skills/flaky-triage/assets/README.md
@@ -0,0 +1 @@
+Reserved for binary assets (templates, icons, snapshots) used by this skill.
diff --git a/.claude/skills/flaky-triage/references/flake-taxonomy.md b/.claude/skills/flaky-triage/references/flake-taxonomy.md
new file mode 100644
index 0000000..fd28b83
--- /dev/null
+++ b/.claude/skills/flaky-triage/references/flake-taxonomy.md
@@ -0,0 +1,28 @@
+# Flake taxonomy
+
+Each flake belongs to exactly one class. Class drives the fix.
+
+| Class | Hallmarks | Fix |
+| ---------------------- | ----------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------ |
+| **Timing** | Same step fails fast on retry; trace shows assertion before DOM stable. | Web-first; `expect.poll` for value-based polling; remove arbitrary `actionTimeout`. |
+| **Animation** | Headed passes, headless fails; element fades in/out. | Disable animations via CSS (`animation: none !important`) in test fixture; or assert with `toHaveClass(/active/)`. |
+| **Network** | 429/5xx in trace; passes in isolation. | Mark `@external`; mock or run only against staging. |
+| **Data leak** | Fails in suite, passes alone; depends on prior test. | Per-test cleanup; per-test factory namespace; repro by running alone vs. the full suite with `--workers=1`. |
+| **Sequence collision** | Two parallel workers create the same id. | Factory uses `sequence` + worker-index suffix. |
+| **Time-dependent** | Fails near midnight, daylight-saving, end-of-month. | `page.clock.install()` and pin a timestamp. |
+| **Locator drift** | Trace shows multiple elements match; element re-rendered.
| Switch to `getByRole`; scope inside ancestor; never `nth(N)`. | +| **Env** | Specific worker / browser / OS only. | Reproduce on the matching env; if irreproducible, classify "haunted" and quarantine. | +| **Real bug** | Reproduces via API client without UI; consistent under load. | File `BUG.md`; do not modify the test. | + +## Heuristic: probability x reproducibility + +| Reproducibility | Action | +| -------------------- | ---------------------------------------------------------- | +| 100% (deterministic) | Not flake; debug as failure. | +| 30–95% | Triage; classify. | +| < 30% | Repeat with `--repeat-each=20`; if still rare, quarantine. | +| 0% | Already fixed or environment artifact; close. | + +## Why care about classification + +Untyped flakes accumulate. After 6 months of "rerun" culture you have 200 tests with hidden bugs and no signal worth trusting. The skill enforces a class per case so the team can plot the distribution and target the worst class first. diff --git a/.claude/skills/flaky-triage/references/quarantine-policy.md b/.claude/skills/flaky-triage/references/quarantine-policy.md new file mode 100644 index 0000000..e0a1596 --- /dev/null +++ b/.claude/skills/flaky-triage/references/quarantine-policy.md @@ -0,0 +1,34 @@ +# Quarantine policy + +Quarantine is a last resort, not a coping mechanism. + +## Tag + +```ts +test('quarantine candidate @quarantine #PROJ-1234', async ({}) => { + /* ... */ +}); +``` + +## Rules + +1. Every `@quarantine` MUST have an open issue link in the title or annotation. +2. CI runs quarantined tests but does NOT fail the build on them. Failures are reported separately. +3. Maximum age: **2 sprints**. After that, the test is either fixed or deleted. +4. Quarantine cap per repo: **2%** of total tests. If exceeded, all engineering work pauses on new features until under cap. + +## Enforcement + +`scripts/check-quarantine-age.sh` runs in CI: + +- Greps for `@quarantine` tags. 
+- Cross-references with the issue tracker via `GH_TOKEN` and the issue id parsed from the title.
+- Fails the build if any quarantined test is older than 2 sprints (configurable in days).
+
+## Ownership
+
+A quarantined test has an owner (the SDET who tagged it). The owner gets a weekly reminder until the test is removed from quarantine or the team explicitly accepts the cost.
+
+## Anti-pattern
+
+`test.describe.skip(...)` mass-disabling — the worst version of quarantine because it disappears from reports. Prefer the tag.
diff --git a/.claude/skills/flaky-triage/references/repro-strategy.md b/.claude/skills/flaky-triage/references/repro-strategy.md
new file mode 100644
index 0000000..a029d7a
--- /dev/null
+++ b/.claude/skills/flaky-triage/references/repro-strategy.md
@@ -0,0 +1,30 @@
+# Reproducing flakes
+
+The single most useful trick is to make the flake reliably reproducible.
+
+## Step ladder
+
+1. **Repeat-each**: `npx playwright test path/spec --repeat-each=20`. Catches simple races.
+2. **Alone vs. suite**: run the spec on its own, then the full suite with `--workers=1`. A spec that fails only in the suite points to an order-dependent leak (Playwright Test has no built-in shuffle flag).
+3. **Workers=1 + repeat-each=20**: isolates from parallelism.
+4. **Workers=4 + repeat-each=20**: surfaces parallelism issues.
+5. **Slow CI sim**: set `launchOptions: { slowMo: 200 }` under `use` in `playwright.config.ts` and pin the run to two cores (`taskset -c 0,1 npx playwright test`) to mimic constrained CI agents.
+6. **Different timezone**: `TZ=America/Los_Angeles npx playwright test`.
+7. **Different locale**: `LANG=de_DE.UTF-8 LANGUAGE=de`.
+8. **Throttled network**: per-context `await context.route('**', route => /* delay */)` or use Chrome DevTools throttling profile.
+9. **Different browser engine**: rerun on firefox + webkit.
+
+## Capture maximum data on first reliable repro
+
+```bash
+# video has no CLI flag; set use.video to 'on' in playwright.config.ts
+npx playwright test path/spec \
+  --workers=1 \
+  --repeat-each=20 \
+  --trace=on \
+  --reporter=list,html,json
+```
+
+## When you cannot reproduce
+
+Don't fight it. Quarantine, file an issue with the failing trace, and move on.
Reproducible flakes get fixed; irreproducible "haunted" tests get deleted after 2 sprints — the cost of looking flaky outweighs the small probability of catching a real bug. diff --git a/.claude/skills/flaky-triage/scripts/flake-rate.ts b/.claude/skills/flaky-triage/scripts/flake-rate.ts new file mode 100755 index 0000000..a1eaa51 --- /dev/null +++ b/.claude/skills/flaky-triage/scripts/flake-rate.ts @@ -0,0 +1,60 @@ +#!/usr/bin/env -S npx tsx +/** + * flake-rate.ts — given a directory of Playwright JSON reports, computes + * pass/fail rate per test and emits a Markdown table sorted by flakiness. + * + * Usage: + * tsx flake-rate.ts ./runs > FLAKE_REPORT.md + */ +import { readFileSync, readdirSync, statSync } from 'node:fs'; +import { join } from 'node:path'; + +const dir = process.argv[2] ?? './runs'; + +interface Stat { + passed: number; + failed: number; + flaky: number; + total: number; +} +const stats = new Map(); + +function bump(key: string, status: string) { + const s = stats.get(key) ?? { passed: 0, failed: 0, flaky: 0, total: 0 }; + s.total++; + if (status === 'passed' || status === 'expected') s.passed++; + else if (status === 'failed' || status === 'unexpected' || status === 'timedOut') s.failed++; + else if (status === 'flaky') s.flaky++; + stats.set(key, s); +} + +function walk(node: any) { + for (const spec of node.specs ?? []) { + for (const t of spec.tests ?? []) { + const key = `${spec.file ?? '?'} > ${spec.title}`; + for (const r of t.results ?? []) bump(key, r.status); + } + } + for (const child of node.suites ?? 
[]) walk(child); +} + +const files = readdirSync(dir).filter((f) => f.endsWith('.json')); +for (const f of files) { + const json = JSON.parse(readFileSync(join(dir, f), 'utf8')); + walk(json); +} + +const rows = [...stats.entries()] + .map(([title, s]) => ({ title, ...s, rate: s.failed / Math.max(s.total, 1) })) + .filter((r) => r.rate > 0 && r.rate < 1) + .sort((a, b) => b.rate - a.rate); + +console.log('# Flake report\n'); +console.log(`Runs analysed: ${files.length}\n`); +console.log('| Test | Total | Pass | Fail | Flake | Failure rate |'); +console.log('|---|---:|---:|---:|---:|---:|'); +for (const r of rows) { + console.log( + `| ${r.title} | ${r.total} | ${r.passed} | ${r.failed} | ${r.flaky} | ${(r.rate * 100).toFixed(0)}% |`, + ); +} diff --git a/.claude/skills/gherkin-test-case-author/SKILL.md b/.claude/skills/gherkin-test-case-author/SKILL.md new file mode 100644 index 0000000..69686a8 --- /dev/null +++ b/.claude/skills/gherkin-test-case-author/SKILL.md @@ -0,0 +1,27 @@ +--- +name: gherkin-test-case-author +description: Authors Gherkin (Given-When-Then) or structured test cases from an existing test design document. Enforces declarative style, single-purpose scenarios, ubiquitous domain language, and correct level of abstraction. Use when a docs/test-design/*.md exists and user says "write test cases", "gherkin", "feature file", "BDD", "scenarios". Do NOT use to author Playwright TypeScript code; defer to playwright-test-author-*. +allowed-tools: Read, Write, Glob, Grep +--- + +# Gherkin Test Case Author + +## Hard rules + +- One scenario = one behaviour. If you write `And ... And ... And` more than 3 times, split. +- Use **declarative** language ("user submits the form"), not **imperative** ("user clicks #submit"). +- No UI selectors in `.feature`. Selectors live in step definitions / page objects. +- Use `Background` only for setup shared by ALL scenarios in the file. Otherwise factor to fixtures. 
+- Scenario Outline only when the test logic is identical and only data changes. + +## Workflow + +1. Read `docs/test-design/.md`. +2. Generate `tests/specs/.feature` (or structured TC json if BDD is not used). +3. Run `scripts/gherkin-lint.sh` (gherkin-lint with strict ruleset). + +## References + +- `references/declarative-vs-imperative.md` +- `references/gherkin-smells.md` (UI leak, "And" abuse, conjunctions, abstract verbs) +- `references/feature.template.feature` diff --git a/.claude/skills/gherkin-test-case-author/assets/README.md b/.claude/skills/gherkin-test-case-author/assets/README.md new file mode 100644 index 0000000..433cdeb --- /dev/null +++ b/.claude/skills/gherkin-test-case-author/assets/README.md @@ -0,0 +1 @@ +Reserved for binary assets (templates, icons, snapshots) used by this skill. diff --git a/.claude/skills/gherkin-test-case-author/references/declarative-vs-imperative.md b/.claude/skills/gherkin-test-case-author/references/declarative-vs-imperative.md new file mode 100644 index 0000000..41e49fe --- /dev/null +++ b/.claude/skills/gherkin-test-case-author/references/declarative-vs-imperative.md @@ -0,0 +1,46 @@ +# Declarative vs imperative Gherkin + +## Imperative (avoid) + +Reads like a UI script. Brittle, ages poorly, hides intent. + +```gherkin +Scenario: Login + Given I open "https://app.example.com/login" + When I type "user@example.com" into the field with id "email" + And I type "Pa$$w0rd" into the field with id "password" + And I click the button with class "btn-primary" + Then I see the text "Welcome" on the page +``` + +## Declarative (prefer) + +Describes user intent in domain language; details live in step definitions. + +```gherkin +Scenario: Authenticated user reaches the dashboard + Given a registered user + When the user signs in with valid credentials + Then the dashboard is displayed +``` + +## How to choose verbs + +- Domain verbs ("submits", "approves", "cancels") rather than UI verbs ("clicks", "types"). 
+- Reference roles ("the user", "an admin"), not selectors.
+- Talk about outcomes ("the order is rejected"), not implementation ("the API returns 422").
+
+## Where the detail goes
+
+- Step definitions translate intent to API/UI actions.
+- Page objects encapsulate selectors.
+- Factories provide data ("a registered user" → `userFactory.build()` + API seed).
+
+## Smell triggers
+
+| Smell | Fix |
+| -------------------------------------- | ------------------------------------- |
+| `id "..."` / `class "..."` in scenario | Move to step definition / page object |
+| `wait 2 seconds` | Use web-first assertions inside step |
+| Multi-paragraph `Then` | Split scenario |
+| Same `Given` in 5+ scenarios | Promote to `Background` or fixture |
diff --git a/.claude/skills/gherkin-test-case-author/references/feature.template.feature b/.claude/skills/gherkin-test-case-author/references/feature.template.feature
new file mode 100644
index 0000000..ce079cd
--- /dev/null
+++ b/.claude/skills/gherkin-test-case-author/references/feature.template.feature
@@ -0,0 +1,34 @@
+@feature:auth @owner:qa-platform
+Feature: User authentication
+
+  As a registered user
+  I want to sign in with my credentials
+  So that I can access my private dashboard
+
+  Background:
+    Given the application is reachable
+
+  @smoke @ac:AC-1
+  Scenario: Authenticated user reaches the dashboard
+    Given a registered user
+    When the user signs in with valid credentials
+    Then the dashboard is displayed
+
+  @ac:AC-2
+  Scenario: Invalid password is rejected with a clear message
+    Given a registered user
+    When the user signs in with the wrong password
+    Then a generic invalid-credentials message is displayed
+    And the user remains on the sign-in page
+
+  @ac:AC-3
+  Scenario Outline: Locked accounts cannot sign in
+    Given a registered user whose account is "<status>"
+    When the user signs in with valid credentials
+    Then the sign-in is denied with reason "<reason>"
+
+    Examples:
+      | status | reason |
+      | suspended |
account-suspended | + | deleted | account-not-found | + | unverified | email-not-verified | diff --git a/.claude/skills/gherkin-test-case-author/references/gherkin-smells.md b/.claude/skills/gherkin-test-case-author/references/gherkin-smells.md new file mode 100644 index 0000000..b8ac2d5 --- /dev/null +++ b/.claude/skills/gherkin-test-case-author/references/gherkin-smells.md @@ -0,0 +1,30 @@ +# Gherkin smells + +A non-exhaustive list flagged by the reviewer. + +## Structure smells + +- **And-abuse**: `When ... And ... And ... And ...` — collapse to one declarative step or split scenarios. +- **Conjunction step**: `Given the user is logged in and has 3 items in cart` — split with `And`. +- **Outline-as-data-dump**: `Examples:` table with 50 rows — move to a unit/contract test. +- **Background bloat**: `Background` containing setup specific to one scenario. +- **Tag soup**: 7+ tags on a single scenario — usually a sign two scenarios are conflated. + +## Language smells + +- **Imperative verbs**: clicks, types, scrolls — replace with domain verbs. +- **UI selectors leaking**: `#submit`, `.btn-primary` — never in a feature file. +- **Implementation leaking**: "the database row is updated" — talk about outcomes the user observes. +- **Vague verbs**: "checks", "verifies", "ensures" — say what is observed. +- **Plural assertions**: "Then the user sees the dashboard and the welcome banner and the menu" — split. + +## Scope smells + +- A single scenario covering happy path + edge case + error. +- Two scenarios that differ only by environment (lift to projects/tags instead). +- Scenario outline whose logic actually differs across rows. + +## Maintainability smells + +- Step definitions duplicated across feature files — promote to a shared step library. +- A step that takes 6+ parameters — wrap parameters in a builder/factory call. 
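
Several of these smells are mechanical enough to detect with a short script. The sketch below assumes a simple line-based scan of step lines; `findSmells`, the verb list, and the threshold of three consecutive `And` steps are illustrative choices, not the reviewer's or gherkin-lint's actual rules.

```typescript
type SmellKind = 'and-abuse' | 'ui-selector' | 'imperative-verb';

interface Smell {
  line: number; // 1-based line number in the feature text
  kind: SmellKind;
  text: string;
}

// UI verbs that usually signal imperative style (illustrative, not exhaustive).
const IMPERATIVE = /\b(clicks?|types?|scrolls?)\b/i;
// CSS-like selectors such as #submit or .btn-primary, optionally quoted.
const SELECTOR = /(^|[\s"'])[#.][A-Za-z][\w-]*/;

function findSmells(feature: string): Smell[] {
  const smells: Smell[] = [];
  let consecutiveAnds = 0;
  feature.split('\n').forEach((raw, index) => {
    const text = raw.trim();
    const step = /^(Given|When|Then|And|But)\b/.exec(text);
    if (!step) return; // only step lines are inspected
    const line = index + 1;
    consecutiveAnds = step[1] === 'And' ? consecutiveAnds + 1 : 0;
    // Report once, at the fourth consecutive And.
    if (consecutiveAnds === 4) smells.push({ line, kind: 'and-abuse', text });
    if (SELECTOR.test(text)) smells.push({ line, kind: 'ui-selector', text });
    if (IMPERATIVE.test(text)) smells.push({ line, kind: 'imperative-verb', text });
  });
  return smells;
}

const imperativeSample = [
  'Scenario: Login',
  '  Given I open the login page',
  '  When I type "user@example.com" into the field "#email"',
  '  And I click the button ".btn-primary"',
  '  Then the dashboard is displayed',
].join('\n');

for (const smell of findSmells(imperativeSample)) {
  console.log(`${smell.line}: ${smell.kind}: ${smell.text}`);
}
```

A regex scan like this will produce false positives (for example "types" used as a noun), so its output is better treated as review hints than as hard failures.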
diff --git a/.claude/skills/gherkin-test-case-author/scripts/gherkin-lint.sh b/.claude/skills/gherkin-test-case-author/scripts/gherkin-lint.sh
new file mode 100755
index 0000000..fcfe4e0
--- /dev/null
+++ b/.claude/skills/gherkin-test-case-author/scripts/gherkin-lint.sh
@@ -0,0 +1,17 @@
+#!/usr/bin/env bash
+#
+# gherkin-lint.sh — runs gherkin-lint over feature files; exits non-zero on findings.
+#
+set -euo pipefail
+ROOT="${1:-$(pwd)}"
+
+if ! find "$ROOT/tests" -name '*.feature' -print -quit 2>/dev/null | grep -q .; then
+  echo "OK: no .feature files."
+  exit 0
+fi
+
+# Use npx with --yes to avoid interactive prompts; lint the same tree that was searched above
+npx --yes gherkin-lint "$ROOT/tests/**/*.feature" || {
+  echo "Gherkin lint reported issues. Address smells in references/gherkin-smells.md."
+  exit 1
+}
diff --git a/.claude/skills/playwright-debug-conductor/SKILL.md b/.claude/skills/playwright-debug-conductor/SKILL.md
new file mode 100644
index 0000000..7ad640d
--- /dev/null
+++ b/.claude/skills/playwright-debug-conductor/SKILL.md
@@ -0,0 +1,36 @@
+---
+name: playwright-debug-conductor
+description: Conducts an interactive debugging session for a failing Playwright test using Trace Viewer, headed mode, --debug, slow-mo, and UI mode. Produces a root-cause hypothesis with evidence (screenshot, trace event, network log). Use when a test fails repeatedly, user asks "why is this flaky", "debug this test", "trace viewer says...", or after playwright-test-author-* hits 2+ retries. Do NOT use to write new tests.
+allowed-tools: Read, Edit, Bash, Glob, Grep
+---
+
+# Playwright Debug Conductor
+
+## Workflow
+
+1. Re-run the failing spec with `--trace on --reporter=list --workers=1`; add `--headed` (a boolean flag) when a display is available.
+2. Open `test-results/.../trace.zip` and run `scripts/extract-failing-step.ts` to summarise:
+   - failing locator + last DOM snapshot
+   - last 5 network calls + status
+   - last 20 console messages
+   - duration of each step
+3.
Classify failure: `selector`, `timing`, `data`, `env`, `app-bug`, `flake`.
+4. Propose ONE fix at a time. Validate by re-running.
+5. If app bug: stop, file `BUG.md` with reproducer.
+
+## Failure classifier
+
+| Signal | Class | First action |
+| ------------------------------------------------ | -------------- | ---------------------------------------------------- |
+| `Locator expected to be visible` after long wait | timing | switch to web-first assertions; raise the per-action `timeout` only if the wait is legitimate |
+| `strict mode violation: resolved to N elements` | selector | scope locator inside an ancestor; use `getByRole` |
+| 401/403 in last network call | env / auth | verify storageState; rotate test creds |
+| 5xx in last network call | env / app-bug | reproduce via API client; if reproducible → BUG.md |
+| Order-dependent failure | data leak | run alone, then in the full suite with `--workers=1`; suspect a fixture |
+| Passes headed, fails headless | timing / focus | check animation, autofocus, `prefers-reduced-motion` |
+
+## References
+
+- `references/failure-taxonomy.md`
+- `references/trace-cheatsheet.md`
+- `references/ui-mode-vs-debug.md`
diff --git a/.claude/skills/playwright-debug-conductor/assets/README.md b/.claude/skills/playwright-debug-conductor/assets/README.md
new file mode 100644
index 0000000..433cdeb
--- /dev/null
+++ b/.claude/skills/playwright-debug-conductor/assets/README.md
@@ -0,0 +1 @@
+Reserved for binary assets (templates, icons, snapshots) used by this skill.
diff --git a/.claude/skills/playwright-debug-conductor/references/failure-taxonomy.md b/.claude/skills/playwright-debug-conductor/references/failure-taxonomy.md
new file mode 100644
index 0000000..e319f03
--- /dev/null
+++ b/.claude/skills/playwright-debug-conductor/references/failure-taxonomy.md
@@ -0,0 +1,32 @@
+# Failure taxonomy
+
+Every failed test must end up in exactly one bucket. The bucket determines the fix path.
+ +| Class | Description | Fix owner | +| ------------ | --------------------------------------------------------------- | --------------------------- | +| **selector** | Locator misses, matches multiple, or matches the wrong element. | SDET | +| **timing** | Assertion fired before stable state. | SDET | +| **data** | Test data wrong/missing/leaked. | SDET (factory or fixture) | +| **env** | Wrong baseURL, expired secret, broken setup. | SDET / DevOps | +| **app-bug** | The product genuinely misbehaves. | Dev | +| **flake** | Intermittent, root cause unknown. | Hand off to flaky-detective | +| **harness** | Playwright/Node/runner misconfig. | SDET / Platform | + +## Decision flow + +``` +Failure + ├─ Reproducible alone? ── no ──> flake → flaky-detective + ├─ yes + │ ├─ Same on multiple browsers? + │ │ yes ── reproducible via API client only? ── yes ──> app-bug → BUG.md + │ │ no ──> timing/selector + │ │ no ──> harness / browser-specific + │ ├─ Wrong data? ──> data + │ ├─ Wrong baseURL/secret? ──> env + │ └─ otherwise ──> selector / timing per signals +``` + +## Anti-pattern: "fix" by retry + +Retrying does NOT close any failure class. If the only "fix" you can propose is "increase retries / timeout", reclassify as flake and triage properly. diff --git a/.claude/skills/playwright-debug-conductor/references/trace-cheatsheet.md b/.claude/skills/playwright-debug-conductor/references/trace-cheatsheet.md new file mode 100644 index 0000000..a87694a --- /dev/null +++ b/.claude/skills/playwright-debug-conductor/references/trace-cheatsheet.md @@ -0,0 +1,39 @@ +# Trace Viewer cheatsheet + +## Open a trace + +```bash +npx playwright show-trace test-results//trace.zip +# or open the HTML report and click any failed test → "Trace" +``` + +## What to look at, in order + +1. **Last action before failure** — pick the red step. Its DOM snapshot is on the right. Compare expected locator vs actual DOM. +2. **Action timing** — was the action queued long before resolution? 
Likely actionability gate (visible/enabled/stable). +3. **Network tab** — last few requests. 4xx/5xx, slow responses, missing CORS. +4. **Console tab** — uncaught errors, deprecation warnings. +5. **Source tab** — the line that called the failing API. +6. **Snapshots side-by-side** — toggle "Before" and "After" of the failing step. + +## Generate a richer trace + +```bash +npx playwright test path/to/spec --trace on --video on --headed --workers=1 +``` + +## Targeted trace for an investigation + +```ts +test('investigate intermittent submit', async ({ page }, testInfo) => { + await testInfo.attach('initial-html', { body: await page.content(), contentType: 'text/html' }); + // ... actions ... +}); +``` + +## Common findings + +- Element exists but is covered by a sticky header → `scrollIntoViewIfNeeded` or assert on `toBeInViewport`. +- Element re-renders mid-action → use `getByRole` (re-resolves) instead of `locator(...).nth(0)`. +- Animation hides the element briefly → `prefers-reduced-motion` env or wait for `transition-end`. +- Element resolves to 2 because a dialog and the main page share the same role/text → scope inside `getByRole('dialog')`. diff --git a/.claude/skills/playwright-debug-conductor/references/ui-mode-vs-debug.md b/.claude/skills/playwright-debug-conductor/references/ui-mode-vs-debug.md new file mode 100644 index 0000000..97ee570 --- /dev/null +++ b/.claude/skills/playwright-debug-conductor/references/ui-mode-vs-debug.md @@ -0,0 +1,48 @@ +# UI mode vs --debug vs codegen + +A quick map of when to reach for which tool. + +## UI mode (`npx playwright test --ui`) + +Best default for interactive debugging. + +- Time-travel through actions with DOM snapshots. +- Live watch: edits to spec/page-object re-run automatically. +- Filter by test name / project / tags / status. +- Pick locator interactively against the live page. + +Use when: iterating on a single test, choosing locators, exploring why a step misbehaves. 
+
+## `--debug` (`npx playwright test --debug`)
+
+Drops you into the Playwright Inspector with `pause()` semantics.
+
+- Step over each action.
+- Edit locators on the fly via the Inspector.
+- Useful when UI mode is too "watchy" and you need full control.
+
+Use when: pinpoint debugging deep inside a long test.
+
+## `codegen` (`npx playwright codegen `)
+
+Records actions and emits suggestion-quality scaffolding.
+
+- Fastest way to get a candidate locator.
+- Output is NOT production-ready — it goes through `playwright-test-author-ui` to be reformulated.
+
+Use when: bootstrapping a new page object, exploring a new flow.
+
+## Headed + slow-mo
+
+```bash
+PWDEBUG=1 npx playwright test path/spec --headed --workers=1
+# or
+npx playwright test --headed --workers=1 # slow-mo: set use.launchOptions.slowMo (e.g. 300) in playwright.config.ts
+```
+
+Use when: visually verifying a flow, or when a test passes headless and fails headed (or vice versa).
+
+## When NOT to use these
+
+- In CI. CI runs headless with `--reporter=list,html`. Trace Viewer post-mortem is the right tool there.
+- For flake triage on a wide scale — that's `flaky-triage` running over JSON results, not interactive.
diff --git a/.claude/skills/playwright-debug-conductor/scripts/extract-failing-step.ts b/.claude/skills/playwright-debug-conductor/scripts/extract-failing-step.ts
new file mode 100755
index 0000000..89b6244
--- /dev/null
+++ b/.claude/skills/playwright-debug-conductor/scripts/extract-failing-step.ts
@@ -0,0 +1,57 @@
+#!/usr/bin/env -S npx tsx
+/**
+ * extract-failing-step.ts — produce a compact JSON summary of the failure.
+ * Reads playwright-report/results.json (Playwright JSON reporter) and emits per-failed-test:
+ *   { file, title, errorMessage, durationMs, projectName, attachments }
+ *
+ * Usage:
+ *   tsx extract-failing-step.ts > FAILURES.json
+ */
+import { readFileSync, existsSync } from 'node:fs';
+import { join } from 'node:path';
+
+const root = process.cwd();
+const resultsPath = join(root, 'playwright-report', 'results.json');
+
+if (!existsSync(resultsPath)) {
+  console.error(`FAIL: ${resultsPath} not found. Run with --reporter=json,list,html.`);
+  process.exit(2);
+}
+
+const results = JSON.parse(readFileSync(resultsPath, 'utf8'));
+
+interface Out {
+  file: string;
+  title: string;
+  projectName: string;
+  durationMs: number;
+  errorMessage: string;
+  attachments: { name: string; path?: string }[];
+}
+
+const failures: Out[] = [];
+
+function walk(suite: any, file: string) {
+  if (suite.specs) {
+    for (const spec of suite.specs) {
+      for (const t of spec.tests ?? []) {
+        for (const r of t.results ?? []) {
+          if (r.status === 'failed' || r.status === 'timedOut' || r.status === 'unexpected') {
+            failures.push({
+              file: spec.file ?? file,
+              title: spec.title,
+              projectName: t.projectName,
+              durationMs: r.duration,
+              errorMessage: (r.errors?.[0]?.message ?? r.error?.message ?? '').slice(0, 2000),
+              attachments: (r.attachments ?? []).map((a: any) => ({ name: a.name, path: a.path })),
+            });
+          }
+        }
+      }
+    }
+  }
+  for (const child of suite.suites ??
[]) walk(child, file || child.file || ''); +} + +walk(results, ''); +process.stdout.write(JSON.stringify(failures, null, 2)); diff --git a/.claude/skills/playwright-framework-bootstrap/SKILL.md b/.claude/skills/playwright-framework-bootstrap/SKILL.md new file mode 100644 index 0000000..36ffb54 --- /dev/null +++ b/.claude/skills/playwright-framework-bootstrap/SKILL.md @@ -0,0 +1,52 @@ +--- +name: playwright-framework-bootstrap +description: Bootstraps or refactors a TypeScript+Playwright repository into a canonical layered architecture (UI/API/Data/Infra/Test). Creates page-objects/, components/, api/, fixtures/, factories/, data/, specs/ with strict boundaries and writes tests-config.json. Use when user starts a new automation project, asks to "set up a framework", "scaffold tests", "refactor folder structure", or when tests-config.json is missing. Do NOT use when only writing a single test or feature; use playwright-test-author-* instead. +allowed-tools: Read, Write, Edit, Glob, Grep, Bash +--- + +# Playwright Framework Bootstrap + +## When to trigger + +- New empty repo or one without `tests-config.json`. +- User says "scaffold", "bootstrap", "set up framework", "structure tests folder". +- Migration from a flat `tests/*.spec.ts` layout. + +## When NOT to trigger + +- A single test/page-object change → defer to `playwright-test-author-*`. +- Config-only edits → defer to `config-and-secrets`. + +## Workflow + +1. **Detect state.** Read `package.json`, `playwright.config.ts`, `tsconfig.json`, existing `tests/`. If `tests-config.json` exists, only verify and gap-fill. +2. **Pin layout.** Generate `tests-config.json` from `references/layout.template.json`. +3. **Create directories** matching the config (idempotent). +4. **Install minimal deps** (only if user opts in): `@playwright/test`, `fishery`, `openapi-typescript`, `openapi-fetch`, `zod`, `dotenv`. Confirm versions against `references/deps.lock.md`. +5. 
**Write `playwright.config.ts`** from `references/playwright.config.template.ts` with: projects per browser, `fullyParallel: true`, `retries: process.env.CI ? 2 : 0`, `reporter: [['list'], ['html', { open: 'never' }]]`, traces `on-first-retry`, screenshots `only-on-failure`. +6. **Write `BasePage.ts`** (locators-only abstract class, NO assertions, NO navigation logic in base). +7. **Validate** with `scripts/validate-layout.sh` (exit 0 required). + +## Required outputs + +- `tests-config.json` committed. +- `playwright.config.ts`, `tsconfig.json` paths aligned (`@pages/*`, `@api/*`, `@fixtures/*`). +- README diff snippet appended explaining layout. + +## Hard rules (enforced by `scripts/validate-layout.sh`) + +- No file in `specs/` may import from `pages/*/locators` directly — only via the page class. +- `BasePage` must not import `expect`. +- No two classes named `*Page` outside `pages/`. + +## References + +- `references/layout.template.json` +- `references/playwright.config.template.ts` +- `references/base-page.template.ts` +- `references/folder-rationale.md` (why each layer exists) + +## Composability + +- Pairs with `config-and-secrets`, `fixture-architect`, `api-client-from-openapi`. +- Always run BEFORE any `playwright-test-author-*` skill. diff --git a/.claude/skills/playwright-framework-bootstrap/assets/README.md b/.claude/skills/playwright-framework-bootstrap/assets/README.md new file mode 100644 index 0000000..433cdeb --- /dev/null +++ b/.claude/skills/playwright-framework-bootstrap/assets/README.md @@ -0,0 +1 @@ +Reserved for binary assets (templates, icons, snapshots) used by this skill. 
diff --git a/.claude/skills/playwright-framework-bootstrap/references/base-page.template.ts b/.claude/skills/playwright-framework-bootstrap/references/base-page.template.ts new file mode 100644 index 0000000..df75492 --- /dev/null +++ b/.claude/skills/playwright-framework-bootstrap/references/base-page.template.ts @@ -0,0 +1,25 @@ +import type { Page } from '@playwright/test'; + +/** + * BasePage — abstract anchor for all page objects. + * Hard rules (enforced by lint-page-object.ts): + * - NO `expect` imports here. Assertions belong to specs. + * - NO navigation logic. Each subclass declares its own `path` and a public `goto()`. + * - NO shared mutable state. + * - Only locator builders and stateless helpers. + */ +export abstract class BasePage { + protected constructor(protected readonly page: Page) {} + + /** Sub-classes declare a relative path; goto() composes with the configured baseURL. */ + protected abstract readonly path: string; + + async goto(): Promise<void> { + await this.page.goto(this.path); + } + + /** Returns the human-readable URL for diagnostics; never used for assertions. */ + url(): string { + return this.page.url(); + } +} diff --git a/.claude/skills/playwright-framework-bootstrap/references/deps.lock.md b/.claude/skills/playwright-framework-bootstrap/references/deps.lock.md new file mode 100644 index 0000000..d1ad423 --- /dev/null +++ b/.claude/skills/playwright-framework-bootstrap/references/deps.lock.md @@ -0,0 +1,16 @@ +# Recommended dependency versions + +Update with care; pin in `package.json` to `^` minor. 
+ +| Package | Min version | Reason | +| ------------------------ | -------------------- | ------------------------------------------ | +| `@playwright/test` | `^1.50.0` | web-first assertions, UI mode, traces | +| `typescript` | `^5.4.0` | satisfies operator, const type params | +| `fishery` | `^2.2.2` | sequence + transient params + associations | +| `@faker-js/faker` | `^9.0.0` | deterministic seeding | +| `openapi-typescript` | `^7.0.0` | spec → types | +| `openapi-fetch` | `^0.13.0` | tiny typed client | +| `zod` | `^3.23.0` | runtime contract validation | +| `dotenv` / `dotenv-flow` | `^16.0.0` / `^4.0.0` | env layering | +| `eslint` | `^9.0.0` | flat config | +| `husky` + `lint-staged` | `^9.0.0` / `^15.0.0` | pre-commit gates | diff --git a/.claude/skills/playwright-framework-bootstrap/references/folder-rationale.md b/.claude/skills/playwright-framework-bootstrap/references/folder-rationale.md new file mode 100644 index 0000000..47753ef --- /dev/null +++ b/.claude/skills/playwright-framework-bootstrap/references/folder-rationale.md @@ -0,0 +1,36 @@ +# Folder rationale + +The layered layout exists to enforce SRP and DIP at the file-system level. Skills validate that imports cross layers in one direction only. + +| Folder | Responsibility | Forbidden inside | +| ------------------- | ------------------------------------------------------------------------------------ | ----------------------------------------------------- | +| `tests/pages/` | Page objects: locators + behaviours per page. One class per page. | `expect`, hardcoded URLs, business assertions | +| `tests/components/` | Component objects shared across multiple pages (header, footer, modal). | Cross-page navigation logic | +| `tests/api/` | API client layer. `generated/` for OpenAPI artefacts, `clients/` for typed wrappers. | `axios`, raw `fetch` outside generated/clients | +| `tests/fixtures/` | Playwright `test.extend` composition. DI for tests. 
| Domain logic, factories | +| `tests/factories/` | Test data factories (Fishery + Faker). One file per entity. | Network calls, side effects | +| `tests/specs/` | The actual tests. AAA / Given-When-Then. | Locators (must come from pages), inline data literals | +| `tests/data/` | Static JSON fixtures (golden files, schema samples). | Generated code | +| `tests/infra/` | env loader, logger, base helpers. | Domain entities | + +## Import direction (allowed) + +``` +specs/ ──> fixtures/ ──> pages/, components/, factories/, api/clients/ + \─> infra/ +factories/ ──> api/generated (types only) +api/clients/ ──> api/generated, infra/ +``` + +Reverse imports are forbidden. The `validate-layout.sh` script greps imports and exits non-zero on violations. + +## Why so many layers? + +- Page objects encode "where" — locators and primitive interactions. +- Components encode reuse across pages. +- Fixtures encode "what is available in a test" — DI container. +- Factories encode "what data the test needs" — independent of how it's persisted. +- API clients encode "how to talk to the system" — typed and contract-aware. +- Specs encode "what behaviour we verify" — declarative, AAA, no mechanics. + +Mixing these is the #1 source of brittleness in real-world Playwright suites. 
diff --git a/.claude/skills/playwright-framework-bootstrap/references/layout.template.json b/.claude/skills/playwright-framework-bootstrap/references/layout.template.json new file mode 100644 index 0000000..e16f3ab --- /dev/null +++ b/.claude/skills/playwright-framework-bootstrap/references/layout.template.json @@ -0,0 +1,32 @@ +{ + "layout": { + "pages": "tests/pages", + "components": "tests/components", + "apiClients": "tests/api", + "fixtures": "tests/fixtures", + "factories": "tests/factories", + "specs": "tests/specs", + "data": "tests/data", + "infra": "tests/infra" + }, + "openapi": { + "source": "specs/openapi.yaml", + "generated": "tests/api/generated" + }, + "naming": { + "pageClass": "PascalCase", + "specFile": "kebab-case.spec.ts", + "factoryFile": "kebab-case.factory.ts" + }, + "selectors": { + "preferred": ["getByRole", "getByLabel", "getByTestId"], + "testIdAttr": "data-testid" + }, + "stack": { + "factories": "fishery", + "openapi": "openapi-typescript+openapi-fetch", + "schemaValidation": "zod" + }, + "envs": ["local", "staging", "prod"], + "version": 1 +} diff --git a/.claude/skills/playwright-framework-bootstrap/references/playwright.config.template.ts b/.claude/skills/playwright-framework-bootstrap/references/playwright.config.template.ts new file mode 100644 index 0000000..902c8e4 --- /dev/null +++ b/.claude/skills/playwright-framework-bootstrap/references/playwright.config.template.ts @@ -0,0 +1,45 @@ +import { defineConfig, devices } from '@playwright/test'; +import 'dotenv/config'; + +const isCI = !!process.env.CI; + +export default defineConfig({ + testDir: 'tests/specs', + fullyParallel: true, + forbidOnly: isCI, + retries: isCI ? 2 : 0, + workers: isCI ? '50%' : undefined, + reporter: [ + ['list'], + ['html', { open: 'never', outputFolder: 'playwright-report' }], + ['json', { outputFile: 'playwright-report/results.json' }], + ['junit', { outputFile: 'playwright-report/junit.xml' }], + ], + use: { + baseURL: process.env.BASE_URL ?? 
'http://localhost:3000', + trace: 'on-first-retry', + screenshot: 'only-on-failure', + video: 'retain-on-failure', + actionTimeout: 10_000, + navigationTimeout: 30_000, + testIdAttribute: 'data-testid', + }, + projects: [ + { name: 'setup', testMatch: /.*\.setup\.ts/ }, + { + name: 'chromium', + use: { ...devices['Desktop Chrome'], storageState: 'tests/.auth/user.json' }, + dependencies: ['setup'], + }, + { + name: 'firefox', + use: { ...devices['Desktop Firefox'], storageState: 'tests/.auth/user.json' }, + dependencies: ['setup'], + }, + { + name: 'api', + testDir: 'tests/specs/api', + use: { baseURL: process.env.API_BASE_URL ?? 'http://localhost:8080' }, + }, + ], +}); diff --git a/scripts/validate-layout.sh b/.claude/skills/playwright-framework-bootstrap/scripts/validate-layout.sh similarity index 100% rename from scripts/validate-layout.sh rename to .claude/skills/playwright-framework-bootstrap/scripts/validate-layout.sh diff --git a/.claude/skills/playwright-test-author-api/SKILL.md b/.claude/skills/playwright-test-author-api/SKILL.md new file mode 100644 index 0000000..859a09f --- /dev/null +++ b/.claude/skills/playwright-test-author-api/SKILL.md @@ -0,0 +1,39 @@ +--- +name: playwright-test-author-api +description: Writes Playwright API/contract tests using the typed OpenAPI client and zod schemas. Enforces request/response schema validation, idempotent setup, status-code + body assertions, and contract-driven negative tests. Use when user asks for an API test, "test the endpoint", "verify response schema", "contract test", "negative test for API". Do NOT use for UI flows; defer to playwright-test-author-ui. +allowed-tools: Read, Write, Edit, Glob, Grep, Bash +--- + +# Playwright Test Author — API + +## Hard rules + +- Use generated client (`tests/api/clients/*`) — no `fetch()`/`axios` directly in spec. +- ALWAYS validate response body via the corresponding `zod` schema (auto-generated from OpenAPI). 
+- Cover: happy path, schema integrity, auth boundary, validation errors per AC, idempotency where applicable. +- Each test must clean up its data (or use a per-test factory tag). +- 5xx → flaky → mark `test.fail()` with TODO ticket, do NOT silently retry. + +## Workflow + +1. Read endpoint definition in `specs/openapi.yaml` (path, methods, params, responses). +2. Read corresponding test design — pick scenarios tagged for `api` layer. +3. Author test in `tests/specs/api//.spec.ts` using `references/api-spec.template.ts`. +4. Run `scripts/lint-api-spec.ts` until exit 0. +5. Run target spec: `npx playwright test tests/specs/api/ --project=api`. + +## Test matrix per endpoint (minimum) + +- Happy path: valid request → success status + schema. +- Auth boundary: missing/invalid token → 401/403. +- Validation: each required field individually omitted → 4xx. +- Boundary values: numeric/length min/max ± 1. +- Idempotency: repeating the same request (where applicable) yields the same result. +- Negative: forbidden role, deleted resource, version conflict. + +## References + +- `references/contract-test-patterns.md` +- `references/idempotency-strategies.md` +- `references/error-taxonomy.md` +- `references/api-spec.template.ts` diff --git a/.claude/skills/playwright-test-author-api/assets/README.md b/.claude/skills/playwright-test-author-api/assets/README.md new file mode 100644 index 0000000..433cdeb --- /dev/null +++ b/.claude/skills/playwright-test-author-api/assets/README.md @@ -0,0 +1 @@ +Reserved for binary assets (templates, icons, snapshots) used by this skill. diff --git a/.claude/skills/playwright-test-author-api/references/api-spec.template.ts b/.claude/skills/playwright-test-author-api/references/api-spec.template.ts new file mode 100644 index 0000000..b89cd5f --- /dev/null +++ b/.claude/skills/playwright-test-author-api/references/api-spec.template.ts @@ -0,0 +1,66 @@ +/** + * Template for an API/contract spec. 
+ * Demonstrates the canonical structure: client + factory + schema validator. + */ +import { test, expect } from '@fixtures'; +import { OrderSchema, OrderListSchema } from '@api/generated/zod'; +import { expectMatchesSchema } from '@api/contract'; +import { orderFactory } from '@factories/order.factory'; + +test.describe('POST /orders', () => { + test('@smoke creates an order and returns it matching the schema', async ({ + ordersClient, + seededUser, + }) => { + const draft = orderFactory.build({ userId: seededUser.id }); + + const { data, response } = await ordersClient.create(draft); + + expect(response.status).toBe(201); + expectMatchesSchema(data, OrderSchema); + expect(data.userId).toBe(seededUser.id); + }); + + test('rejects request without required field with a 422 and structured error', async ({ + ordersClient, + seededUser, + }) => { + const invalid = orderFactory.build({ userId: seededUser.id, items: [] }); + + const { data, response } = await ordersClient.create(invalid); + + expect(response.status).toBe(422); + expect(data).toMatchObject({ + code: 'validation_error', + fields: expect.arrayContaining(['items']), + }); + }); + + test('returns 401 when called anonymously', async ({ anonClient }) => { + const draft = orderFactory.build(); + const { response } = await anonClient.orders.create(draft); + expect(response.status).toBe(401); + }); +}); + +test.describe('GET /orders', () => { + test('paginates correctly with limit and cursor', async ({ ordersClient, seededUser }) => { + // Arrange — seed 5 orders deterministically + for (const draft of orderFactory.buildList(5, { userId: seededUser.id })) { + await ordersClient.create(draft); + } + + const page1 = await ordersClient.list({ userId: seededUser.id, limit: 2 }); + expect(page1.response.status).toBe(200); + expectMatchesSchema(page1.data, OrderListSchema); + expect(page1.data.items).toHaveLength(2); + + const page2 = await ordersClient.list({ + userId: seededUser.id, + limit: 2, + cursor: 
page1.data.nextCursor, + }); + expect(page2.data.items).toHaveLength(2); + expect(page2.data.items[0].id).not.toBe(page1.data.items[0].id); + }); +}); diff --git a/.claude/skills/playwright-test-author-api/references/contract-test-patterns.md b/.claude/skills/playwright-test-author-api/references/contract-test-patterns.md new file mode 100644 index 0000000..5f2e995 --- /dev/null +++ b/.claude/skills/playwright-test-author-api/references/contract-test-patterns.md @@ -0,0 +1,56 @@ +# Contract test patterns + +## 1. Schema integrity (every successful response) + +```ts +expect(response.status).toBe(200); +expectMatchesSchema(data, OrderSchema); +``` + +Catches: drift, accidental field renames, type widening, removed fields. If the spec is the source of truth and the live API drifts, this assertion fires before any business logic test. + +## 2. Required-field validation (per AC) + +For each required field declared in the OpenAPI schema, assert that omitting it yields a 4xx with a structured error. Generate these tests programmatically when feasible: + +```ts +for (const required of ['userId', 'items']) { + test(`returns 422 when ${required} is missing`, async ({ ordersClient }) => { + const { [required]: _, ...partial } = orderFactory.build(); + const { data, response } = await ordersClient.create(partial as never); + expect(response.status).toBe(422); + expect(data.fields).toContain(required); + }); +} +``` + +## 3. Auth boundary + +Always include 401 (anonymous) and 403 (wrong role) tests. Do NOT collapse them into a single parameterised case — they exercise different code paths in the backend. + +## 4. Idempotency + +For POST endpoints declaring `Idempotency-Key`: + +```ts +const key = crypto.randomUUID(); +const r1 = await ordersClient.create(draft, { idempotencyKey: key }); +const r2 = await ordersClient.create(draft, { idempotencyKey: key }); +expect(r1.data.id).toBe(r2.data.id); +expect(r2.response.status).toBe(200); // not 201 on the second call +``` + +## 5. 
Pagination boundaries + +- limit = 0 (if allowed) or limit = 1. +- limit = max-1, max, max+1 (spec must say what max is). +- cursor pointing past the end → empty list, valid schema. +- cursor that has been deleted → graceful behaviour per AC. + +## 6. Long polling / async + +If an endpoint kicks off async work, the test polls a status endpoint with `expect.poll`, never `setInterval`. + +## 7. Pact-style consumer tests + +For services consumed by other services, contract tests are bi-directional. We integrate via Pact files in CI; this is out of scope for skill-driven authoring but the spec format we produce is compatible. diff --git a/.claude/skills/playwright-test-author-api/references/error-taxonomy.md b/.claude/skills/playwright-test-author-api/references/error-taxonomy.md new file mode 100644 index 0000000..523c75c --- /dev/null +++ b/.claude/skills/playwright-test-author-api/references/error-taxonomy.md @@ -0,0 +1,32 @@ +# Error response taxonomy + +A consistent error envelope makes contract tests trivial. The skill assumes the following shape (RFC 7807-flavoured): + +```jsonc +{ + "code": "validation_error", // machine-readable, snake_case + "message": "Items must not be empty.", + "fields": ["items"], // optional, present for 4xx field errors + "traceId": "...", // for cross-system debugging +} +``` + +## Canonical buckets + +| Status | code namespace | When | +| ------ | ------------------ | --------------------------------------- | +| 400 | `bad_request` | Malformed JSON, missing required header | +| 401 | `unauthenticated` | No / invalid token | +| 403 | `forbidden` | Token valid but role insufficient | +| 404 | `not_found` | Resource missing | +| 409 | `conflict` | Duplicate, idempotency mismatch | +| 422 | `validation_error` | Field-level violations | +| 429 | `rate_limited` | Throttling hit | +| 500 | `internal_error` | Server fault — should be rare | + +## Test rules + +- Never assert on `message` string verbatim — copy changes constantly. 
Assert on `code`. +- Always assert on `fields` array for 422. +- Always check that `traceId` is non-empty (helps support). +- Reject 5xx outright in the test — if a 5xx is expected, the AC is broken. diff --git a/.claude/skills/playwright-test-author-api/references/idempotency-strategies.md b/.claude/skills/playwright-test-author-api/references/idempotency-strategies.md new file mode 100644 index 0000000..58d3ac7 --- /dev/null +++ b/.claude/skills/playwright-test-author-api/references/idempotency-strategies.md @@ -0,0 +1,50 @@ +# Idempotency strategies + +## Why we test it + +POST endpoints that mutate state must be idempotent under retries. Otherwise a network blip duplicates orders, charges, or messages. + +## Strategies + +| Strategy | Where applicable | Test approach | +| ------------------------ | --------------------------------------------- | ----------------------------------------------------------------------------- | +| `Idempotency-Key` header | Stripe-style APIs, payment / order creation | Send same key twice; assert same id, second response 200 not 201. | +| Natural unique key | Resource has a unique constraint (email, sku) | Send same payload twice; second returns 409 Conflict OR 200 with existing id. | +| Server-side dedup window | Messaging / events | Send same payload twice within window; assert one record. | +| At-most-once via locking | Cron-triggered jobs | Out of scope for API tests. 
| + +## Test template + +```ts +test('retrying with the same Idempotency-Key returns the same resource', async ({ + ordersClient, + seededUser, +}) => { + const draft = orderFactory.build({ userId: seededUser.id }); + const key = crypto.randomUUID(); + + const first = await ordersClient.create(draft, { idempotencyKey: key }); + expect(first.response.status).toBe(201); + + const retry = await ordersClient.create(draft, { idempotencyKey: key }); + expect(retry.response.status).toBe(200); // not a new resource + expect(retry.data.id).toBe(first.data.id); +}); +``` + +## Negative case + +Different payload + same key → must NOT silently overwrite. + +```ts +test('different payload with same key fails', async ({ ordersClient, seededUser }) => { + const key = crypto.randomUUID(); + const a = orderFactory.build({ userId: seededUser.id }); + const b = orderFactory.build({ userId: seededUser.id }); + + await ordersClient.create(a, { idempotencyKey: key }); + const conflict = await ordersClient.create(b, { idempotencyKey: key }); + + expect(conflict.response.status).toBeGreaterThanOrEqual(400); +}); +``` diff --git a/.claude/skills/playwright-test-author-api/scripts/lint-api-spec.ts b/.claude/skills/playwright-test-author-api/scripts/lint-api-spec.ts new file mode 100755 index 0000000..999c53b --- /dev/null +++ b/.claude/skills/playwright-test-author-api/scripts/lint-api-spec.ts @@ -0,0 +1,88 @@ +#!/usr/bin/env -S npx tsx +/** + * lint-api-spec.ts — checks API spec files for forbidden patterns. 
+ */ +import { readFileSync, readdirSync, statSync, existsSync } from 'node:fs'; +import { join } from 'node:path'; + +const root = process.cwd(); +const cfg = JSON.parse(readFileSync(join(root, 'tests-config.json'), 'utf8')); +const apiSpecsDir = join(root, cfg.layout.specs, 'api'); + +if (!existsSync(apiSpecsDir)) { + console.log('OK: no API spec directory yet.'); + process.exit(0); +} + +const rules: { + id: string; + severity: 'blocker' | 'major' | 'minor'; + re: RegExp; + message: string; +}[] = [ + { + id: 'no-axios', + severity: 'major', + re: /from\s+['"]axios['"]/, + message: 'Direct axios import. Use generated client.', + }, + { + id: 'no-fetch-url', + severity: 'major', + re: /\bfetch\s*\(\s*['"]https?:\/\//, + message: 'Inline fetch with URL. Use API client.', + }, + { + id: 'no-schema-bypass', + severity: 'major', + re: /(?:\/\/.*disable.*schema|\.skipSchema\b)/, + message: 'Schema validation disabled.', + }, + { + id: 'no-status-only', + severity: 'minor', + re: /^\s*expect\(response\.status\)\.toBe\(2\d\d\);\s*$/m, + message: 'Only status asserted; add schema or body assertion.', + }, + { + id: 'no-message-assert', + severity: 'minor', + re: /\.message['"]\s*\)\.toBe\(['"]/, + message: 'Asserting on error message string. 
Assert on code instead.', + }, +]; + +function collect(dir: string, acc: string[] = []): string[] { + for (const e of readdirSync(dir)) { + const p = join(dir, e); + if (statSync(p).isDirectory()) collect(p, acc); + else if (p.endsWith('.spec.ts')) acc.push(p); + } + return acc; +} + +const files = collect(apiSpecsDir); +let blockers = 0, + majors = 0, + minors = 0; + +for (const file of files) { + const lines = readFileSync(file, 'utf8').split('\n'); + lines.forEach((line, idx) => { + for (const r of rules) { + if (r.re.test(line)) { + console.error(`${file}:${idx + 1} [${r.severity.toUpperCase()}][${r.id}] ${r.message}`); + if (r.severity === 'blocker') blockers++; + else if (r.severity === 'major') majors++; + else minors++; + } + } + }); +} + +console.error( + `\n${files.length} API spec(s). blockers=${blockers} majors=${majors} minors=${minors}`, +); +if (blockers) process.exit(2); +if (majors) process.exit(1); +process.exit(0); diff --git a/.claude/skills/playwright-test-author-ui/SKILL.md b/.claude/skills/playwright-test-author-ui/SKILL.md new file mode 100644 index 0000000..5afced4 --- /dev/null +++ b/.claude/skills/playwright-test-author-ui/SKILL.md @@ -0,0 +1,36 @@ +--- +name: playwright-test-author-ui +description: Writes Playwright UI tests in TypeScript using project page objects, fixtures, and factories — never inline locators. Enforces web-first assertions, getByRole/getByTestId, no hard waits, AAA structure, and explicit data setup via factories+API. Use when user asks for a new UI test, "automate this scenario in UI", "Playwright test for the login page", or after gherkin-test-case-author finishes. Do NOT use for pure API tests (playwright-test-author-api) or for fixture/page-object scaffolding (fixture-architect / playwright-framework-bootstrap). +allowed-tools: Read, Write, Edit, Glob, Grep, Bash +--- + +# Playwright Test Author — UI + +## Hard rules (validated by `scripts/lint-ui-spec.ts`) + +1. **No `page.waitForTimeout`**. 
Use `expect.poll`, web-first assertions, `toBeVisible({ timeout })`. +2. **No raw CSS/XPath selectors in spec**. Selectors live in page objects only. +3. **Selector preference**: `getByRole` > `getByLabel` > `getByPlaceholder` > `getByTestId` > others. CSS only as last resort and reviewed. +4. **One assertion per behaviour, AAA structure**. `expect.soft` for grouped assertions on the same state. +5. **Data setup via API + factory**, not by clicking through UI to seed. +6. **Authentication via storageState fixture**, not by performing UI login per test. +7. **No `test.use()` global mutations** unless the entire file requires it. + +## Workflow + +1. Read corresponding test design + Gherkin (if any) + page objects. +2. Map scenario steps → page-object methods. If a method is missing, add it to the page object (NOT to spec). +3. Wrap data setup in `test.beforeEach` via fixture; cleanup in `afterEach`. +4. Run `scripts/lint-ui-spec.ts`. Fix until exit 0. +5. Run target spec: `npx playwright test --reporter=list`. Iterate. + +## Auto-debug loop + +On failure: emit Trace Viewer link via `--trace on-first-retry`. If three retries fail, hand off to `playwright-debug-conductor`. + +## References + +- `references/web-first-assertions.md` +- `references/locator-priority.md` +- `references/aaa-template.ts` +- `references/anti-flake-checklist.md` diff --git a/.claude/skills/playwright-test-author-ui/assets/README.md b/.claude/skills/playwright-test-author-ui/assets/README.md new file mode 100644 index 0000000..433cdeb --- /dev/null +++ b/.claude/skills/playwright-test-author-ui/assets/README.md @@ -0,0 +1 @@ +Reserved for binary assets (templates, icons, snapshots) used by this skill. 
diff --git a/.claude/skills/playwright-test-author-ui/references/aaa-template.ts b/.claude/skills/playwright-test-author-ui/references/aaa-template.ts new file mode 100644 index 0000000..2eb7818 --- /dev/null +++ b/.claude/skills/playwright-test-author-ui/references/aaa-template.ts @@ -0,0 +1,29 @@ +/** + * AAA / Given-When-Then template for a Playwright UI spec. + * + * - Arrange: data + auth + navigation, all via fixtures. + * - Act: a single user-meaningful action. + * - Assert: outcome the user observes; web-first matchers. + */ +import { test, expect } from '@fixtures'; +import { orderFactory } from '@factories/order.factory'; + +test.describe('Order cancellation', () => { + test('user can cancel their own pending order', async ({ + seededUser, + ordersClient, + dashboardPage, + }) => { + // Arrange + const order = await ordersClient.create(orderFactory.build({ userId: seededUser.id })); + await dashboardPage.goto(); + + // Act + await dashboardPage.orders.cancelByReference(order.reference); + + // Assert + await expect(dashboardPage.orders.statusOf(order.reference)).toHaveText(/cancelled/i); + const refreshed = await ordersClient.getById(order.id); + expect(refreshed.data?.status).toBe('cancelled'); + }); +}); diff --git a/.claude/skills/playwright-test-author-ui/references/anti-flake-checklist.md b/.claude/skills/playwright-test-author-ui/references/anti-flake-checklist.md new file mode 100644 index 0000000..6e120a0 --- /dev/null +++ b/.claude/skills/playwright-test-author-ui/references/anti-flake-checklist.md @@ -0,0 +1,42 @@ +# Anti-flake checklist (UI) + +Run mentally before merging any UI test. If any answer is "no", fix it. + +## Determinism + +- [ ] No `waitForTimeout`. Use web-first or `expect.poll`. +- [ ] No `setTimeout`/`setInterval` inside specs. +- [ ] All faker-generated data is seeded. +- [ ] Test does not depend on system time. Use `await page.clock.install()` if it must. +- [ ] Test does not depend on order — runs alone and in suite. 
+ +## Isolation + +- [ ] Data setup via API + factory, not by clicking. +- [ ] Each test cleans up its own data (or fixture does). +- [ ] No global storage of `auth.json` mutated mid-suite. +- [ ] No shared mutable module state used across tests. + +## Locators + +- [ ] Selectors live in page objects, not specs. +- [ ] `getByRole` / `getByLabel` first. CSS/XPath only with justification. +- [ ] No `nth-child`, no `*:nth-of-type`. + +## Assertions + +- [ ] All assertions are web-first auto-retrying matchers. +- [ ] Soft assertions used to group related checks on the same state. +- [ ] No screenshot snapshots over dynamic content unless masked. + +## Network + +- [ ] External-service-dependent tests are mocked or marked `@external`. +- [ ] `page.route` handlers are unrouted before the test ends. +- [ ] No retry-on-5xx logic — let the test fail and surface the dependency issue. + +## CI parity + +- [ ] Test passes locally with `--workers=1` AND `--workers=4 --shuffle`. +- [ ] Test passes both headed and headless. +- [ ] Test passes on at least two browsers (chromium + firefox). diff --git a/.claude/skills/playwright-test-author-ui/references/locator-priority.md b/.claude/skills/playwright-test-author-ui/references/locator-priority.md new file mode 100644 index 0000000..dadf105 --- /dev/null +++ b/.claude/skills/playwright-test-author-ui/references/locator-priority.md @@ -0,0 +1,42 @@ +# Locator priority + +A strict ordering. Use the first one that fits. Drop only with peer-review. + +| Priority | Locator | When | +| -------- | ------------------------------------------ | ------------------------------------------------- | +| 1 | `getByRole('button', { name: 'Submit' })` | Almost everything users interact with has a role. | +| 2 | `getByLabel('Email')` | Form fields with associated `