Context
The agentkeys-workflow-collection skill's recorder, together with src/scrapers/openrouter-cdp.ts and src/scrapers/openai-cdp.ts (shipped in #66ac92d), proved the architecture: hand-written production scrapers + recorder as iteration scaffolding. The current pipeline:
1. /agentkeys-workflow-collection drives the recorder → iterates until signup mints a real key → flows.ts accumulates proven fixes.
2. Human ports the working flow from flows.ts into a new src/scrapers/<service>-cdp.ts by hand.
Step 2 is the gap. We do NOT want to fix it by making the emitter magically produce production-ready scrapers (the string-template approach is fragile — every flows.ts change needs the template re-synced, and service-specific knowledge will always leak through). Instead, keep the hand-written scrapers as the source of truth and make the recorder produce artifacts + tooling that shrink the porting step to a near-mechanical transcription.
Scope
1. Generalize the manifest interface + helper functions
src/workflow-recorder/artifacts.ts::Manifest is currently cobbled together around the OpenRouter/OpenAI happy path. Make it general:
- Flow-shape agnostic: signup AND login both serializable to the same manifest (today the manifest has flow: "signup" | "login", but many selectors/outcomes are signup-specific).
- Generic step-outcome vocabulary ("fill-email", "click-continue", "wait-verification", "extract-key") instead of the current mix of flow-specific labels.
- Typed "detected" fields: regexes (from/subject/URL), selectors (email/password/TOS/Continue/Create), timings (per-step ms), captcha-kind encountered (turnstile / hcaptcha / none / PoW-custom).
manifest.json becomes the contract, not the debug dump. Any consumer (recorder, ship-scraper skill, drift-detector) reads the same shape.
Files: src/workflow-recorder/artifacts.ts, src/workflow-recorder/flows.ts, src/workflow-recorder/email-analyzer.ts (now in src/lib/).
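As a sketch, the generalized manifest shape could look like the following. All type and field names here are illustrative assumptions, not the current artifacts.ts definitions:

```typescript
// Hypothetical sketch of a flow-agnostic Manifest. Names are illustrative,
// not the shipped artifacts.ts types.
type StepOutcome =
  | "fill-email"
  | "fill-password"
  | "click-continue"
  | "wait-verification"
  | "extract-key";

type CaptchaKind = "turnstile" | "hcaptcha" | "pow-custom" | "none";

interface ManifestStep {
  outcome: StepOutcome;
  selector?: string;   // CSS selector proven during recording
  durationMs: number;  // per-step timing
  notes?: string;      // service-specific caveats for the human porter
}

interface Manifest {
  service: string;            // e.g. "openrouter"
  flow: "signup" | "login";   // both flows serialize to this one shape
  state: "completed" | "failed";
  steps: ManifestStep[];      // ordered step sequence
  detected: {
    emailRegexes: { from?: string; subject?: string; url?: string };
    captcha: CaptchaKind;
  };
}

// Example: a short login-flow manifest, shaped identically to signup.
const example: Manifest = {
  service: "openrouter",
  flow: "login",
  state: "completed",
  steps: [{ outcome: "fill-email", selector: "#email", durationMs: 1200 }],
  detected: { emailRegexes: {}, captcha: "none" },
};
```

The point of the single `Manifest` type is that every consumer (recorder, ship-scraper skill, drift-detector) can type-check against the same contract instead of sniffing flow-specific fields.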
2. Improve agentkeys-workflow-collection skill — emitter uses the manifest interface
The current emitDraftScraper string-templates a scraper inline. Replace it with a composition step: read the manifest, then build the scraper from:
- A stable shell (argv parsing, env-var read, CDP connect, JSON-event emit, exit-cleanly) — identical across services, copied verbatim.
- A service-specific body generated by walking the manifest's step sequence and emitting the corresponding lib/playwright-patterns calls.
- Inherited behavior from the lib (no inlining of humanType / clickOuterCreate / etc.) — matches the hand-written scraper structure.
Minimal changes per service: the emitter should produce a file that is >80% identical to a hand-written scraper. Service-specific helpers (OpenRouter's dismissOpenRouterOnboardingModals, OpenAI's completeOpenAIPostVerifyProfile) still require human input (recorded as "notes" fields in the manifest).
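The composition step above could be sketched roughly like this. The step-to-code table and helper names (humanType, waitForVerificationEmail, extractKey) are hypothetical stand-ins for the real lib/playwright-patterns surface:

```typescript
// Hypothetical sketch: compose a scraper body by walking the manifest's
// step sequence and emitting calls into the shared pattern library.
interface Step {
  outcome: string;
  selector?: string;
  notes?: string;
}

// Illustrative step-to-code table; the real emitter would map onto the
// actual lib/playwright-patterns exports.
const STEP_TEMPLATES: Record<string, (s: Step) => string> = {
  "fill-email": s =>
    `await humanType(page, ${JSON.stringify(s.selector)}, email);`,
  "click-continue": s =>
    `await page.click(${JSON.stringify(s.selector)});`,
  "wait-verification": () =>
    `const link = await waitForVerificationEmail(inbox);`,
  "extract-key": s =>
    `const apiKey = await extractKey(page, ${JSON.stringify(s.selector)});`,
};

function emitBody(steps: Step[]): string {
  return steps
    .map(s => {
      const tmpl = STEP_TEMPLATES[s.outcome];
      // Unknown outcomes surface as TODOs for the human reviewer instead of
      // silently producing a broken scraper.
      return tmpl ? tmpl(s) : `// TODO (manual): ${s.outcome} (${s.notes ?? "no notes"})`;
    })
    .join("\n");
}
```

Anything the table cannot express (e.g. a service-specific modal-dismissal helper recorded as a manifest note) lands in the output as an explicit TODO, preserving the human-in-the-loop step rather than hiding it.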
Update ~/.claude/skills/agentkeys-workflow-collection/SKILL.md Phase 4 to document the new emitter output + how to review it.
3. New skill: /agentkeys-ship-scraper
Takes the last-successful recording for a service and ships a production scraper. Flow:
- `--service <slug>` argument.
- Find the most recent manifest with state: completed under provisioner-scripts/recordings/<slug>-*-reference/ (reference) or <slug>-<ts>/ (latest).
- Emit via the new manifest-driven emitter → src/scrapers/<slug>-cdp.ts.
- Run tsc --noEmit on the emitted file; surface errors as human decisions.
- Run the scraper once (live) to prove minting; write outcome into manifest.
- Stage the new file for PR.
Works for both login and signup flows:
- Signup flow: emits full create-account → email-verify → API-key-mint path.
- Login flow: emits login-with-credentials → API-key-mint (shorter; no email verify).
Skill lives at ~/.claude/skills/agentkeys-ship-scraper/SKILL.md.
Acceptance criteria
- Manifest interface + helper functions reviewed for flow-agnosticism; a login-path recording produces a manifest shaped identically to signup.
- emitDraftScraper rewritten to compose from lib calls (not string-templates). Output for the OpenRouter recording matches the hand-written openrouter-cdp.ts to within ~20 lines (docstring / ordering allowed to differ; behavior identical).
- /agentkeys-ship-scraper runs end-to-end: invoked on OpenRouter's reference recording → the emitted scraper mints a real key live → no regression vs the current src/scrapers/openrouter-cdp.ts.
Out of scope
Why this architecture
Keeping production scrapers hand-written means every service gets a clear, auditable file with service-specific logic visible. The emitter's job is to produce a starting point, not the final artifact. This matches the way we actually debug: when OpenRouter's modal chain changes, you edit openrouter-cdp.ts (not the emitter template), re-record, regenerate, diff, ship.
Related commit: #66ac92d `feat(scrapers): deterministic OpenRouter + OpenAI production scrapers`