A collection of Claude Skills, distilled from the canonical software-testing books, that
make Claude write and review tests well: one Skill per book, bundled as a plugin.
An optional short testing-rules.md snippet is also included, if you want an always-on
nudge without installing anything.
Independent distillation of each book's principles (in our own words). Not affiliated with or endorsed by the authors or publishers. Read the books — they're excellent — for the full treatment.
LLM- and junior-written test suites fail in the same predictable ways: they mock every collaborator and assert on the calls, pin down implementation details, mock the database, re-implement the production formula to compute the "expected" value, test trivial getters and setters, pick inputs ad hoc, and chase a coverage number. The resulting suite is expensive to maintain, fires false alarms, and discourages the very refactoring that tests are supposed to enable.
These skills exist to stop exactly those habits — each from the angle of one canonical book, reconciled into one consistent house style.
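As one illustration of the habits listed above, here is a minimal sketch of the "re-implement the production formula" smell and a fix. The function `apply_discount` and its rounding rule are hypothetical, invented for this example; they do not come from any of the books or skills.

```python
def apply_discount(price_cents: int, percent: int) -> int:
    """Hypothetical production code: percentage discount, rounded down to whole cents."""
    return price_cents - (price_cents * percent) // 100


def test_discount_mirrors_the_formula():
    # Smell: the "expected" value repeats the production arithmetic, so a bug
    # in the formula is copied into the test and the test still passes.
    price, percent = 1999, 15
    expected = price - (price * percent) // 100
    assert apply_discount(price, percent) == expected


def test_fifteen_percent_off_1999_cents_is_1700_cents():
    # Better: the expected value is worked out by hand and hard-coded,
    # so the test is an independent check on the formula.
    assert apply_discount(1999, 15) == 1700
```

The second test also names the behavior, so a failure says what broke without reading the body.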
```text
testing-canon/
├── skills/                          # the product: one Claude Skill per book (the plugin bundles these)
│   ├── khorikov-unit-testing/       # classical-school unit-test craft (the core)
│   ├── art-of-unit-testing/         # trustworthy/maintainable/readable fundamentals
│   ├── effective-software-testing/  # systematic test-case derivation
│   ├── xunit-test-patterns/         # test smells & the patterns that fix them
│   ├── legacy-code-testing/         # get untested code safely under test
│   ├── agile-testing-quadrants/     # whole-team test strategy
│   └── context-driven-testing/      # the investigative tester's mindset
├── testing-rules.md                 # optional: short drop-in snippet to APPEND to your own CLAUDE.md
├── EXAMPLES.md                      # worked before/after pairs, per book
├── CLAUDE.md                        # project/contributor guide (for working on this repo)
└── .claude-plugin/                  # marketplace + plugin manifests (bundles all skills)
```
Each skill is a full progressive-disclosure Skill: a SKILL.md decision workflow plus
focused references/ — installed via the plugin or copied as a folder. If you'd rather
have a lightweight always-on nudge without installing anything, testing-rules.md is a
short snippet you append to your project's CLAUDE.md — a taster, not a replacement for
the skills.
| I want to… | Use |
|---|---|
| Write clean unit tests | khorikov-unit-testing + art-of-unit-testing |
| Figure out which test cases to write (inputs, edge cases, boundaries) | effective-software-testing |
| Fix brittle / flaky / smelly / duplicated tests | xunit-test-patterns |
| Get legacy / untested code safely under test | legacy-code-testing |
| Plan a team's test strategy / decide what kinds of tests are needed | agile-testing-quadrants |
| Think like a tester — exploratory testing, better bug reports | context-driven-testing |
| Just want a short always-on nudge (no install) | append testing-rules.md to your CLAUDE.md |
The skills trigger automatically on the matching task even if you never name the book —
e.g. "write unit tests for this OrderService", "why does this test break on every
refactor", "how do I test this untested class", "what should our test strategy be".
| Skill | Book | Author(s) |
|---|---|---|
| khorikov-unit-testing | Unit Testing Principles, Practices, and Patterns | Vladimir Khorikov |
| art-of-unit-testing | The Art of Unit Testing | Roy Osherove |
| effective-software-testing | Effective Software Testing | Maurício Aniche |
| xunit-test-patterns | xUnit Test Patterns | Gerard Meszaros |
| legacy-code-testing | Working Effectively with Legacy Code | Michael Feathers |
| agile-testing-quadrants | Agile Testing | Lisa Crispin & Janet Gregory |
| context-driven-testing | Lessons Learned in Software Testing | Kaner, Bach & Pettichord |
Where they agree (the skills embody this; testing-rules.md is the short always-on version): test observable
behavior, not implementation; mock only at the real boundary (use real objects and
assert on state otherwise); test the code that carries risk and treat coverage as
a guide, not a target; one clear behavior per test, named for the behavior, with no
logic in the test; design for testability by separating decisions from side effects;
and prefer verification styles in the order output > state > communication.
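The preference order output > state > communication can be sketched as three small tests. This is a minimal illustration under stated assumptions: `order_total_cents`, `Cart`, `Checkout`, and the gateway method `send_receipt` are hypothetical names made up for this example. Only the email gateway, the out-of-process boundary, is replaced by a test double.

```python
from unittest.mock import Mock


def order_total_cents(line_prices_cents):
    """Pure function: the result depends only on the arguments."""
    return sum(line_prices_cents)


class Cart:
    def __init__(self):
        self.lines = []

    def add(self, sku, price_cents):
        self.lines.append((sku, price_cents))


class Checkout:
    """Sends a receipt through an email gateway (hypothetical boundary)."""

    def __init__(self, email_gateway):
        self._email_gateway = email_gateway

    def complete(self, cart, address):
        total = order_total_cents(price for _, price in cart.lines)
        self._email_gateway.send_receipt(address, total)
        return total


def test_total_is_the_sum_of_line_prices():
    # 1. Output-based: call a pure function and check the return value.
    assert order_total_cents([500, 1000]) == 1500


def test_adding_an_item_puts_it_in_the_cart():
    # 2. State-based: act on a real object, then check its observable state.
    cart = Cart()
    cart.add("sku-1", 500)
    assert cart.lines == [("sku-1", 500)]


def test_completing_checkout_emails_a_receipt_for_the_total():
    # 3. Communication-based: reserved for the boundary. The Cart is real;
    # only the gateway is a mock.
    gateway = Mock()
    cart = Cart()
    cart.add("sku-1", 500)
    cart.add("sku-2", 1000)
    Checkout(gateway).complete(cart, "a@example.com")
    gateway.send_receipt.assert_called_once_with("a@example.com", 1500)
```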
Where they genuinely disagree — left unreconciled, the skills would give Claude whiplash, so the repo picks a house style and each skill notes the alternative and links across:
| Topic | The disagreement | House style (what the skills default to) |
|---|---|---|
| Test naming | Osherove's Unit_Scenario_Expected (USE) vs. Khorikov's behavior sentence | Behavior-sentence naming. art-of-unit-testing presents USE as the book's convention, then defers to Khorikov. |
| Isolation school | Khorikov is classical (mock only unmanaged out-of-process deps); Osherove & GOOS (Growing Object-Oriented Software, Guided by Tests) lean more mockist (London) | Classical is the default; the mockist position is documented as an alternative, not the recommendation. |
| Test-double terms | Khorikov: stub/mock by direction; Meszaros: five (Dummy/Stub/Spy/Mock/Fake); Osherove: fake/stub/mock | Meszaros's five are the reference taxonomy (xunit-test-patterns); the others map onto it, and every skill gives the mapping. |
| Code coverage | All treat it as a poor target; Aniche uses it as a systematic guide | "Poor target, useful guide" — phrased consistently across effective-software-testing and khorikov-unit-testing. |
| Observability hooks | Aniche will add a getter / isValid() to make a class observable; Khorikov resists exposing state just to test | Prefer restructuring (extract a pure unit that returns the value); accept a small, honest observability hook only when restructuring is disproportionate. |
| Scope / altitude | Unit craft (Khorikov, Osherove, Meszaros) vs. team strategy (Crispin & Gregory) vs. mindset (Kaner et al.) | All valid at different altitudes. Navigate by intent (the table above), so they complement rather than compete. |
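To make the "observability hooks" and "test naming" rows concrete, here is a rough sketch of the restructuring option: the decision is extracted into a pure function that returns a value, so no getter is needed just for the test. `RefundDecision`, `decide_refund`, `RefundService`, and the 30-day rule are all hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RefundDecision:
    approved: bool
    reason: str


def decide_refund(days_since_purchase: int, opened: bool) -> RefundDecision:
    """Pure decision: no I/O, so the result itself is the observable."""
    if days_since_purchase > 30:
        return RefundDecision(False, "outside 30-day window")
    if opened:
        return RefundDecision(False, "item was opened")
    return RefundDecision(True, "ok")


class RefundService:
    """Thin shell: asks for a decision, then performs the side effect."""

    def __init__(self, payments):
        self._payments = payments  # hypothetical payment gateway

    def process(self, order_id, days_since_purchase, opened):
        decision = decide_refund(days_since_purchase, opened)
        if decision.approved:
            self._payments.refund(order_id)
        return decision


# Behavior-sentence naming (the house style). The USE equivalent would be
# something like DecideRefund_After30Days_Refused.
def test_refund_is_refused_after_the_30_day_window():
    assert decide_refund(days_since_purchase=31, opened=False).approved is False


def test_unopened_item_inside_the_window_is_refunded():
    assert decide_refund(days_since_purchase=30, opened=False) == RefundDecision(True, "ok")
```

Most of the tests can then target `decide_refund` directly, leaving only a few tests for the shell and its gateway.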
The plugin is hosted at arcboxlabs/testing-canon, and the marketplace and plugin are both named testing-canon. Set the owner/author in .claude-plugin/*.json and the LICENSE copyright holder if you fork it.
```
/plugin marketplace add arcboxlabs/testing-canon
/plugin install testing-canon@testing-canon
```
This bundles all seven skills (auto-discovered under skills/). They trigger on the
relevant task automatically.
Claude Code (per user) — copy just the skill(s) you want:
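```bash
# Create the per-user skills directory if it does not exist yet
mkdir -p ~/.claude/skills
cp -r skills/legacy-code-testing ~/.claude/skills/
```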
Claude.ai / Claude apps: zip a skills/<name>/ folder and upload it as a Skill in
Settings → Capabilities → Skills.
testing-rules.md is a short snippet of the non-negotiables. Append it to the
testing section of your own project's CLAUDE.md — it complements the skills, it
doesn't replace them:
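```bash
echo "" >> CLAUDE.md
# -f fails on HTTP errors instead of appending an error page; -sS is quiet but still reports errors; -L follows redirects
curl -fsSL https://raw.githubusercontent.com/arcboxlabs/testing-canon/main/testing-rules.md >> CLAUDE.md
```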
Install the skills, or copy testing-rules.md into a Cursor project rule
(.cursor/rules/testing.mdc), so the same rules apply in Cursor.
- Tests survive refactoring — green stays green when only internals change.
- Few or no mocks outside the email/bus/third-party boundary; integration tests use the real database and assert on state.
- No tests for getters/setters; effort concentrated on domain logic and risk.
- Test inputs are clearly derived (partitions/boundaries), not arbitrary.
- Test names read like behavior; a failing name tells you what broke.
- The suite is fast and trusted, so people actually run it.
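As an example of inputs that are "clearly derived", here is a small sketch of boundary-based test selection. The free-shipping rule, the 5000-cent threshold, the 499-cent fee, and the function `shipping_cents` are assumptions invented for this illustration. The inputs come from the partitions of the rule (invalid, below threshold, at or above threshold) and the points on either side of the boundary.

```python
FREE_SHIPPING_THRESHOLD_CENTS = 5000  # hypothetical business rule
FLAT_SHIPPING_CENTS = 499             # hypothetical flat fee


def shipping_cents(order_total_cents: int) -> int:
    if order_total_cents < 0:
        raise ValueError("order total cannot be negative")
    if order_total_cents >= FREE_SHIPPING_THRESHOLD_CENTS:
        return 0
    return FLAT_SHIPPING_CENTS


def test_order_just_below_the_threshold_pays_shipping():
    # Off-point: the closest value on the other side of the boundary.
    assert shipping_cents(4999) == 499


def test_order_exactly_at_the_threshold_ships_free():
    # On-point: the boundary value itself.
    assert shipping_cents(5000) == 0


def test_empty_order_pays_shipping():
    # Lower edge of the valid "below threshold" partition.
    assert shipping_cents(0) == 499


def test_negative_total_is_rejected():
    # Invalid partition: one representative is enough.
    try:
        shipping_cents(-1)
    except ValueError:
        return
    raise AssertionError("expected ValueError for a negative total")
```

Each test states which partition or boundary point it covers, so a reviewer can see at a glance whether a case is missing.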
These rules are language-agnostic. Pin them to your stack by adding to the testing
section of your CLAUDE.md:
```markdown
## Project-Specific Testing Guidelines
- Framework: pytest, parametrize for cases
- Integration tests use the real Postgres test container; never mock the repository
- Mock only the SendGrid and Stripe gateways
```
MIT — see LICENSE.