From c36d54a91e1900ffb0594483260cfaa71b145c0b Mon Sep 17 00:00:00 2001 From: wildmeta-agent Date: Mon, 20 Apr 2026 11:58:48 +0800 Subject: [PATCH 1/8] docs(stage5): add concrete 'Run it' section for the live demo MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The Gmail-setup section ended at 'Daemon running and paired — see the Stage 4 manual test guide', but Stage 5a provision doesn't actually need a paired daemon: cmd_provision runs as the master CLI and uses session.wallet as the agent_id. A reader who finished Gmail setup had no concrete path from there to a running demo. Replace the vague pointer with an explicit two-terminal runbook: - Terminal 1: mock backend (Stage 5a stores into the mock; real Heima lands in v0.1). - Terminal 2: agentkeys init --mock-token + agentkeys provision openrouter + verification (read back the key, curl OpenRouter). Plus the 'under the hood' breakdown so a reader knows why no daemon or pairing is involved, and a short 'artifacts to inspect' pointer (session.json path, audit JSONL). Also promotes the build-and-install step from prose to step 5 so the prerequisites list is self-contained and paste-able. --- docs/manual-test-stage5.md | 67 +++++++++++++++++++++++++++++++++++++- 1 file changed, 66 insertions(+), 1 deletion(-) diff --git a/docs/manual-test-stage5.md b/docs/manual-test-stage5.md index 47d8aca..ac25c16 100644 --- a/docs/manual-test-stage5.md +++ b/docs/manual-test-stage5.md @@ -66,7 +66,14 @@ export AGENTKEYS_EMAIL_PORT="993" Once the app password is set, the demo sees **zero 2FA prompts**. App passwords bypass 2FA by design — they're Google's non-interactive credential, scoped to IMAP only, revocable anytime. -**5. Daemon running and paired** — see the Stage 4 manual test guide. +**5. Build binaries + install provisioner-script deps (one-time).** + +```bash +cd ~/Projects/agentkeys +cargo build --workspace --release +npm install --prefix provisioner-scripts +npx playwright install chromium --with-deps +```
Alternative: Google Workspace DWD (for operators with an existing Workspace subscription) @@ -95,6 +102,64 @@ Downside: the agent doesn't fully control the inbox (shared with the human), and
+### Run it + +Two terminals. Everything runs from the repo root (`~/Projects/agentkeys`). + +**Terminal 1 — mock backend.** Stage 5a stores the provisioned key via the mock server (real Heima + TEE ships in v0.1). Leave this running. + +```bash +cd ~/Projects/agentkeys +cargo run --release -p agentkeys-mock-server -- --port 8090 +# Expected: "Mock server running on port 8090" +``` + +**Terminal 2 — provision.** Carry the four Gmail env vars from step 4 into this shell (or re-`export` them here). Then: + +```bash +cd ~/Projects/agentkeys +BIN=$(pwd)/target/release/agentkeys +BACKEND=http://127.0.0.1:8090 + +# 1. Initialize the master session (one-time per shell / mock restart). +$BIN --backend $BACKEND init --mock-token stage5-demo +# Expected: wallet printed; ~/.agentkeys/master/session.json created. + +# 2. Sanity-check the Gmail env vars landed in this shell. +env | grep AGENTKEYS_EMAIL_ +# Expected: four AGENTKEYS_EMAIL_* lines matching step 4. + +# 3. Run the live OpenRouter provision. +$BIN --backend $BACKEND provision openrouter +# Expect ~30-90 s: browser opens headless, account created, +# email verified, API key extracted + verified, stored in the mock backend. +``` + +**What this does under the hood:** + +- `init` authenticates the master CLI to the mock backend and caches the session token (OS keychain on macOS/Linux with keychain, file fallback otherwise). +- `provision openrouter` runs `npx tsx provisioner-scripts/src/scrapers/openrouter.ts` against a real Chromium session, uses the Gmail IMAP creds from your exported env to read the confirmation email, extracts + verifies the key against `https://openrouter.ai/api/v1/models`, and stores it into the mock backend under the master session's wallet. +- No daemon, no pairing — Stage 5a provision runs entirely as the master CLI. Daemon + pairing are Stage 4's flow for agent-side credential access, not needed for the live provision demo. + +**After it succeeds:** + +```bash +# Read the full stored key back. +$BIN --backend $BACKEND read openrouter +# Expected: sk-or-v1-... + +# Verify it works against OpenRouter. +curl -s -H "Authorization: Bearer $($BIN --backend $BACKEND read openrouter)" \ + https://openrouter.ai/api/v1/models | head -c 200 +# Expected: HTTP 200 + a JSON body starting with {"data":[... +``` + +**Artifacts you can inspect:** + +- `~/.agentkeys/master/session.json` — the master session (wallet + bearer token). +- `~/.agentkeys/logs/provision-.jsonl` — per-step audit trail (when present; full audit logging lands with 5b). +- Stderr of `provision openrouter` — the single-shot step lines shown under "Expected behavior" below. + ### Expected behavior 1. Stderr shows single-shot step lines (real-time streaming ships in 5b): From 0463d2ea650abf4e5c4afe11420f5c87495bef30 Mon Sep 17 00:00:00 2001 From: wildmeta-agent Date: Mon, 20 Apr 2026 15:27:31 +0800 Subject: [PATCH 2/8] stage5a: plus-addressing demo path + failure-mode logs Three tightly-coupled changes so the Stage 5a live demo is both re-runnable (returning-user collision) and debuggable (no more silent 'subprocess ended without terminal event' with no cause). provisioner-scripts/src/scrapers/openrouter.ts - Split AGENTKEYS_EMAIL_USER (canonical IMAP login) from a new AGENTKEYS_SIGNUP_EMAIL (what we type into OpenRouter's signup form). Gmail IMAP rejects plus-addressing at login, so the two had to diverge before plus-addressing could work at all. - Wrap main() in a catch-all that emits a terminal error event and flushes stdout before process.exit. Playwright launch failures, dynamic-import errors, IMAP connection refusals, and any other throws upstream of the scraper's inner try/catch now surface as a parseable Error event instead of dying silently and being reported as 'subprocess ended without terminal event.' crates/agentkeys-provisioner/src/orchestrator.rs - On the no-terminal-event error path, best-effort-write the full subprocess output (exit code, every event emitted, complete stderr) to ~/.agentkeys/logs/provision--.log and include the path in the error message. stderr_tail (20 lines) stays inline for the quick case. docs/manual-test-stage5.md - Flip the primary demo path from 'dedicated throwaway Gmail' to 'your existing Gmail + plus-addressing + app password.' Reason documented: OpenRouter's /auth is signup+signin on one URL, so reusing a canonical address across runs always fails on the second run with a returning-user UI the scraper wasn't designed for. Plus-addressing minted per-run via $(date +%s) gives us DWD-equivalent disposable emails at zero infrastructure cost. - Document the two env vars and why they exist separately. - Dedicated-throwaway-Gmail + Workspace DWD demoted to
alternatives. - New 'Debugging a failure' block under Artifacts pointing to the persistent log file + the direct-scraper-run fallback. - New 'subprocess ended without terminal event' and 'account already exists (returning-user path)' entries in Failure modes. Tests: - cargo test -p agentkeys-provisioner --release: 15/15 pass - npm test --prefix provisioner-scripts: 15/15 pass across 6 files --- .../agentkeys-provisioner/src/orchestrator.rs | 58 ++++++++++- docs/manual-test-stage5.md | 98 ++++++++++--------- .../src/scrapers/openrouter.ts | 64 ++++++++---- 3 files changed, 154 insertions(+), 66 deletions(-) diff --git a/crates/agentkeys-provisioner/src/orchestrator.rs b/crates/agentkeys-provisioner/src/orchestrator.rs index 972c024..3dff5cc 100644 --- a/crates/agentkeys-provisioner/src/orchestrator.rs +++ b/crates/agentkeys-provisioner/src/orchestrator.rs @@ -1,14 +1,14 @@ use std::collections::HashMap; -use std::path::Path; +use std::path::{Path, PathBuf}; use std::sync::{Arc, Mutex}; -use std::time::Instant; +use std::time::{Instant, SystemTime, UNIX_EPOCH}; use agentkeys_core::backend::CredentialBackend; use agentkeys_types::{ProvisionEvent, ServiceName, Session, TripwireKind, WalletAddress}; use crate::error::{ProvisionError, ProvisionResult}; use crate::metrics::{self, ProvisionMetric, VerificationResultLabel}; -use crate::subprocess::{spawn_and_collect, SubprocessConfig}; +use crate::subprocess::{spawn_and_collect, SubprocessConfig, SubprocessOutcome}; #[derive(Debug, Clone)] pub struct ActiveProvision { @@ -86,6 +86,37 @@ impl Drop for ProvisionGuard { } } +/// Best-effort dump of subprocess output to `~/.agentkeys/logs/provision--.log`. +/// Returns the file path if the write succeeded. Never errors — failure to write the log +/// must not mask the underlying provision failure. +fn write_provision_log(service: &str, outcome: &SubprocessOutcome) -> Option { + let home = std::env::var("HOME").ok().map(PathBuf::from)?; + let dir = home.join(".agentkeys").join("logs"); + std::fs::create_dir_all(&dir).ok()?; + let ts = SystemTime::now().duration_since(UNIX_EPOCH).ok()?.as_secs(); + let safe_service: String = service + .chars() + .map(|c| if c.is_ascii_alphanumeric() || c == '-' || c == '_' { c } else { '_' }) + .collect(); + let path = dir.join(format!("provision-{}-{}.log", safe_service, ts)); + + let mut body = String::new(); + body.push_str(&format!( + "service: {}\nexit_code: {:?}\nevents_emitted: {}\n\n=== subprocess stdout events ===\n", + service, + outcome.exit_code, + outcome.events.len() + )); + for ev in &outcome.events { + body.push_str(&format!("{:?}\n", ev)); + } + body.push_str("\n=== subprocess stderr ===\n"); + body.push_str(&outcome.stderr); + + std::fs::write(&path, body).ok()?; + Some(path) +} + /// Returns first 8 chars + `****...` + last 4. For keys shorter than 12 chars returns `****`. pub fn mask_key(key: &str) -> String { if key.len() < 12 { @@ -190,7 +221,26 @@ pub async fn run_provision( } let raw_key = api_key.ok_or_else(|| { - ProvisionError::Internal("subprocess ended without terminal event".to_string()) + let stderr_tail: String = outcome + .stderr + .lines() + .rev() + .take(20) + .collect::>() + .into_iter() + .rev() + .collect::>() + .join("\n"); + let log_hint = match write_provision_log(service, &outcome) { + Some(path) => format!("full log: {}", path.display()), + None => "full log: (unable to write ~/.agentkeys/logs — check HOME + permissions)".to_string(), + }; + ProvisionError::Internal(format!( + "subprocess ended without terminal event (exit {:?}). {}. stderr tail:\n{}", + outcome.exit_code, + log_hint, + if stderr_tail.is_empty() { "(empty)" } else { stderr_tail.as_str() } + )) })?; let masked = mask_key(&raw_key); diff --git a/docs/manual-test-stage5.md b/docs/manual-test-stage5.md index ac25c16..ac572da 100644 --- a/docs/manual-test-stage5.md +++ b/docs/manual-test-stage5.md @@ -30,43 +30,46 @@ For the demo-only purpose of Stage 5, the goal is the **shortest path to a runni > **This is a temporary demo solution.** For production (v0.1), the agent mailbox moves to SES-hosted `*@agentkeys-email.io` under the three-layer `TokenAuthority` abstraction. See the [email-system wiki page](../wiki/email-system.md) for the full architecture and why we're running demo-and-production on different backends deliberately. -#### 🚀 Demo path: dedicated personal Gmail + TOTP + app password +#### 🚀 Demo path: your existing Gmail + plus-addressing + app password -Why dedicated (not your personal inbox with plus-addressing): the agent gets a clean inbox it fully controls, no personal mail pollution, cleanup is a single account-delete. +Why plus-addressing as the primary demo path: -**1. Create a fresh Gmail account for the bot.** +- **Unique email per run.** OpenRouter's sign-up/sign-in page is a single URL — if you submit an email that already has an account, you land on a returning-user screen the scraper was not designed to traverse, and the provision fails with a `no terminal event`-style error. `you+or-@gmail.com` is a fresh address to OpenRouter on every run, so every run hits the pristine signup path. +- **Zero account creation.** Uses your existing Gmail — no new Google account, no new phone verification. +- **Single inbox to clean up.** The OpenRouter confirmation mail lands in your real inbox; delete one thread after the demo and you're done. +- **Scales to repeat testing.** Rotate the local-part (`+or-1`, `+or-2`, …) or include a timestamp and you have DWD-equivalent disposable emails without building DWD. -Sign up at [accounts.google.com](https://accounts.google.com) with a name like `wildmeta-stage5-demo@gmail.com`. Google will ask for a recovery phone — use your personal phone; you only need it once for step 2. +**1. Generate a Gmail app password for IMAP.** -**2. Enable 2-Step Verification and enroll TOTP as the second factor.** +- Requires 2FA enabled on your Google account. If not already enabled: [myaccount.google.com](https://myaccount.google.com) → Security → turn on 2-Step Verification (TOTP or SMS is fine; enrollment is a one-time cost). +- Visit [myaccount.google.com/apppasswords](https://myaccount.google.com/apppasswords). Create one named `agentkeys-stage5`. Google gives you a 16-character password. +- Copy immediately — it's shown once. Revoke anytime from the same page. -Gmail IMAP access chain: `app password` requires `2FA enabled` requires `second factor enrolled`. Using an authenticator app as that second factor makes the account non-interactive after this one-time enrollment. +**2. Export the env vars.** -- Open [myaccount.google.com](https://myaccount.google.com) → **Security** -- **Turn on 2-Step Verification.** Google sends an SMS to your recovery phone to start enrollment. -- Under 2-Step Verification settings, add **Authenticator app** as a second step. Google shows a QR code and a secret. -- Scan into Google Authenticator / Authy / 1Password / Bitwarden / whatever TOTP client you already use. You now own the second factor. -- (Optional) once TOTP is active, you can drop SMS as a 2FA method — Google keeps the phone for account recovery but stops using it as a live second factor. +The scraper splits **IMAP login** from **signup email**. Set both: -**3. Generate an app password for IMAP.** +```bash +export AGENTKEYS_EMAIL_BACKEND=gmail -- Visit [myaccount.google.com/apppasswords](https://myaccount.google.com/apppasswords). -- Create one named "agentkeys-stage5". Google gives you a 16-character password. -- Copy it immediately — it's shown once. Revoke anytime from the same page. +# IMAP login — must be the canonical Gmail address. +export AGENTKEYS_EMAIL_USER="you@gmail.com" +export AGENTKEYS_EMAIL_PASSWORD="xxxx xxxx xxxx xxxx" # 16-char app password -**4. Export the four env vars.** +# What we type into OpenRouter's signup form. +# Plus-addressed alias so OpenRouter sees a brand-new email per run; +# mail is still delivered to you@gmail.com. +export AGENTKEYS_SIGNUP_EMAIL="you+or-$(date +%s)@gmail.com" -```bash -export AGENTKEYS_EMAIL_BACKEND=gmail -export AGENTKEYS_EMAIL_USER="wildmeta-stage5-demo@gmail.com" # the bot account from step 1 -export AGENTKEYS_EMAIL_PASSWORD="xxxx xxxx xxxx xxxx" # 16-char app password from step 3 export AGENTKEYS_EMAIL_HOST="imap.gmail.com" export AGENTKEYS_EMAIL_PORT="993" ``` +> **Why two email vars.** `AGENTKEYS_EMAIL_USER` is the IMAP login — Gmail IMAP only accepts your canonical address (plus-addressing aliases are rejected at login). `AGENTKEYS_SIGNUP_EMAIL` is what we fill into the service's sign-up form — plus-addressing works there because SMTP delivery honors the `+alias` suffix. If `AGENTKEYS_SIGNUP_EMAIL` is unset, the scraper falls back to `AGENTKEYS_EMAIL_USER` — which is fine for a dedicated bot account (see alternative below) but guarantees a "account already exists" collision if you reuse a canonical address across runs. + Once the app password is set, the demo sees **zero 2FA prompts**. App passwords bypass 2FA by design — they're Google's non-interactive credential, scoped to IMAP only, revocable anytime. -**5. Build binaries + install provisioner-script deps (one-time).** +**3. Build binaries + install provisioner-script deps (one-time).** ```bash cd ~/Projects/agentkeys @@ -76,29 +79,16 @@ npx playwright install chromium --with-deps ```
-Alternative: Google Workspace DWD (for operators with an existing Workspace subscription) +Alternative: dedicated throwaway Gmail (cleanest but more setup) -See [`docs/stage5-workspace-email-setup.md`](stage5-workspace-email-setup.md). That path mints a throwaway `stage5test-@wildmeta.ai` per run, reads its inbox via the Gmail API (no app password, no interactive OAuth), and deletes the user at the end. One-time ~20-minute admin setup + currently 3-5 days of code work to replace the `imapflow` fetcher with a Gmail-API fetcher that uses DWD impersonation. Longer upfront cost than the dedicated-Gmail demo path, but the right choice for enterprise deployments that already run Workspace. +Create a fresh bot Gmail (`wildmeta-stage5-demo@gmail.com`), enable 2FA + TOTP, generate an app password. Set `AGENTKEYS_EMAIL_USER` to the bot address; leave `AGENTKEYS_SIGNUP_EMAIL` unset. One-time ~10 minutes setup; gives you a fully controlled inbox with no personal-mail pollution. Re-runs need `--force` or account-delete between attempts because the bot address itself will collide.
-Alternative: plus-addressed personal Gmail (shared-inbox quick demo) - -If you don't want to create a dedicated account and are OK with one-off OpenRouter mail landing in your real inbox, plus-addressing on your existing Gmail works for a single demo run. - -1. **Your existing personal Gmail account** — plus-addressing is a Gmail-native feature: mail sent to `you+anything@gmail.com` is delivered to `you@gmail.com` without any configuration. A single inbox supports unlimited test aliases (`you+stage5test-20260418@gmail.com`). -2. **Gmail app password** (not your regular password) — generate at https://myaccount.google.com/apppasswords. Scoped to IMAP access only; revoke after the demo. -3. **Environment:** - ```bash - export AGENTKEYS_EMAIL_BACKEND=gmail - export AGENTKEYS_EMAIL_USER="you@gmail.com" # your real Gmail; Stage 5a appends +alias at signup - export AGENTKEYS_EMAIL_PASSWORD="" # NOT your normal Google password - export AGENTKEYS_EMAIL_HOST="imap.gmail.com" - export AGENTKEYS_EMAIL_PORT="993" - ``` +Alternative: Google Workspace DWD (for operators with an existing Workspace subscription) -Downside: the agent doesn't fully control the inbox (shared with the human), and the OpenRouter confirmation email lingers in your personal mail until you delete it. +See [`docs/stage5-workspace-email-setup.md`](stage5-workspace-email-setup.md). That path mints a throwaway `stage5test-@wildmeta.ai` per run, reads its inbox via the Gmail API (no app password, no interactive OAuth), and deletes the user at the end. One-time ~20-minute admin setup + currently 3-5 days of code work to replace the `imapflow` fetcher with a Gmail-API fetcher that uses DWD impersonation. Right choice for enterprise deployments that already run Workspace; overkill for the demo.
@@ -114,7 +104,7 @@ cargo run --release -p agentkeys-mock-server -- --port 8090 # Expected: "Mock server running on port 8090" ``` -**Terminal 2 — provision.** Carry the four Gmail env vars from step 4 into this shell (or re-`export` them here). Then: +**Terminal 2 — provision.** Carry the Gmail env vars from step 2 into this shell (or re-`export` them here). Note: if you are using plus-addressing, **re-evaluate `AGENTKEYS_SIGNUP_EMAIL` for every run** so the timestamp is fresh and OpenRouter sees a new email — otherwise your second run will collide with the first run's account. ```bash cd ~/Projects/agentkeys @@ -125,11 +115,16 @@ BACKEND=http://127.0.0.1:8090 $BIN --backend $BACKEND init --mock-token stage5-demo # Expected: wallet printed; ~/.agentkeys/master/session.json created. -# 2. Sanity-check the Gmail env vars landed in this shell. -env | grep AGENTKEYS_EMAIL_ -# Expected: four AGENTKEYS_EMAIL_* lines matching step 4. +# 2. Sanity-check the email env vars landed in this shell. +env | grep -E 'AGENTKEYS_(EMAIL|SIGNUP)_' +# Expected: AGENTKEYS_EMAIL_{BACKEND,USER,PASSWORD,HOST,PORT} and AGENTKEYS_SIGNUP_EMAIL. +# If AGENTKEYS_SIGNUP_EMAIL is missing, the scraper falls back to AGENTKEYS_EMAIL_USER, +# which will hit "account already exists" on the second run against OpenRouter. + +# 3. Re-seed a fresh signup alias for this run (plus-addressing path only). +export AGENTKEYS_SIGNUP_EMAIL="you+or-$(date +%s)@gmail.com" -# 3. Run the live OpenRouter provision. +# 4. Run the live OpenRouter provision. $BIN --backend $BACKEND provision openrouter # Expect ~30-90 s: browser opens headless, account created, # email verified, API key extracted + verified, stored in the mock backend. @@ -157,9 +152,22 @@ curl -s -H "Authorization: Bearer $($BIN --backend $BACKEND read openrouter)" \ **Artifacts you can inspect:** - `~/.agentkeys/master/session.json` — the master session (wallet + bearer token). -- `~/.agentkeys/logs/provision-.jsonl` — per-step audit trail (when present; full audit logging lands with 5b). +- `~/.agentkeys/logs/provision-openrouter-.log` — **written automatically when a provision fails with "no terminal event."** Contains the exit code, every event the subprocess emitted, and the full captured stderr. `ls -lt ~/.agentkeys/logs/ | head` to find the most recent. - Stderr of `provision openrouter` — the single-shot step lines shown under "Expected behavior" below. +**Debugging a failure:** + +1. Check the error message on stderr — if it ends with `full log: /path/to/provision-openrouter-.log`, that file has the full signal. +2. `cat` the log file. The `=== subprocess stderr ===` section usually shows the real cause (Playwright browser-launch error, IMAP connection refused, an unhandled rejection from the pattern, etc.). +3. For interactive debugging, run the TS scraper directly against a visible browser: + ```bash + # Temporarily flip headless:false at provisioner-scripts/src/scrapers/openrouter.ts:~116, + # then: + cd ~/Projects/agentkeys + npx tsx provisioner-scripts/src/scrapers/openrouter.ts + ``` + You'll see the page in real time — instant diagnosis for selector drift, returning-user UI paths, or CAPTCHA challenges. + ### Expected behavior 1. Stderr shows single-shot step lines (real-time streaming ships in 5b): @@ -181,6 +189,8 @@ curl -s -H "Authorization: Bearer $($BIN --backend $BACKEND read openrouter)" \ ### Failure modes to watch for +- **"subprocess ended without terminal event"** — the scraper crashed before emitting any event (Playwright browser-launch failed, IMAP connection refused, unhandled rejection, etc.). The error message now ends with `full log: ~/.agentkeys/logs/provision-openrouter-.log` — open that file; the `=== subprocess stderr ===` section has the real cause. If stderr is empty, re-run the TS scraper directly with `npx tsx provisioner-scripts/src/scrapers/openrouter.ts` and watch the node-side output. +- **"account already exists" (returning-user path)** — OpenRouter's `/auth` is signup+signin on one URL. If `AGENTKEYS_SIGNUP_EMAIL` is an address that already has an OpenRouter account, the site lands on a returning-user UI the scraper can't traverse, and you'll get a `selector_timeout` tripwire or (if the path is weirder) a "no terminal event." Re-evaluate `AGENTKEYS_SIGNUP_EMAIL` with a fresh timestamp (`you+or-$(date +%s)@gmail.com`) and retry. - **CAPTCHA / Cloudflare challenge** — the Tier 2 script does not solve CAPTCHAs. Expect a Tripwire event with `kind: selector_timeout`. This is the signal that Stage 5b's agentic fallback is needed. Until 5b ships, abort and retry from a different IP. - **Email didn't arrive within 60 s** — check spam, check plus-addressing forwarding. Tripwire `email_timeout` means the IMAP fetch exhausted its polling window. - **Key verification fails with `phantom`** — the scraper extracted something key-shaped that isn't a real API key. OpenRouter may have changed its DOM; inspect the page at the success-step selector and file an issue with the HAR dump. diff --git a/provisioner-scripts/src/scrapers/openrouter.ts b/provisioner-scripts/src/scrapers/openrouter.ts index d3df2ae..746332c 100644 --- a/provisioner-scripts/src/scrapers/openrouter.ts +++ b/provisioner-scripts/src/scrapers/openrouter.ts @@ -1,5 +1,5 @@ import type { Browser } from "playwright"; -import { emit } from "../types.js"; +import { emit, type ProvisionEvent } from "../types.js"; import type { VerifyResult } from "../lib/verify.js"; import { signupEmailOtp } from "../patterns/signup_email_otp.js"; @@ -35,7 +35,13 @@ const EMAIL_SUBJECT_REGEX = /openrouter/i; const EMAIL_CODE_REGEX = /(\d{6})/; const EMAIL_TIMEOUT_MS = 60_000; -const OPENROUTER_EMAIL = process.env["AGENTKEYS_EMAIL_USER"] ?? "user@example.com"; +// IMAP login — must be the canonical Gmail address (plus-addressing aliases +// are not valid IMAP logins). +const IMAP_LOGIN_EMAIL = process.env["AGENTKEYS_EMAIL_USER"] ?? "user@example.com"; +// What we type into the service's signup form. Defaults to the IMAP login but +// can be overridden (e.g. plus-addressed `you+or-@gmail.com`) so returning +// users can mint a fresh account per run while reusing one real inbox. +const SIGNUP_EMAIL = process.env["AGENTKEYS_SIGNUP_EMAIL"] ?? IMAP_LOGIN_EMAIL; export async function runOpenRouterScraper(opts: OpenRouterScraperOpts): Promise { const signupUrl = @@ -62,7 +68,7 @@ export async function runOpenRouterScraper(opts: OpenRouterScraperOpts): Promise createKeyButtonSelector: CREATE_KEY_BUTTON_SELECTOR, keyRevealSelector: KEY_REVEAL_SELECTOR, emailFetcher: opts.emailFetcher, - emailAddress: OPENROUTER_EMAIL, + emailAddress: SIGNUP_EMAIL, emailFromRegex: EMAIL_FROM_REGEX, emailSubjectRegex: EMAIL_SUBJECT_REGEX, emailCodeRegex: EMAIL_CODE_REGEX, @@ -108,25 +114,47 @@ export async function runOpenRouterScraper(opts: OpenRouterScraperOpts): Promise } } -export default async function main(): Promise { - const { chromium } = await import("playwright"); - const { fetchVerificationCode } = await import("../lib/email.js"); - const { verify } = await import("../lib/verify.js"); +// Emit a terminal event and wait for stdout to flush before exiting. Using a +// bare `process.exit` can drop buffered writes to the parent's pipe — which is +// exactly how the orchestrator ends up reporting "subprocess ended without +// terminal event" when something upstream of the scraper's try/catch throws. +function emitAndExit(event: ProvisionEvent, exitCode: number): void { + process.stdout.write(JSON.stringify(event) + "\n", () => process.exit(exitCode)); +} - const browser = await chromium.launch({ headless: true }); +export default async function main(): Promise { try { - await runOpenRouterScraper({ - browser, - emailFetcher: (from, subject, codeRegex, timeoutMs) => - fetchVerificationCode({ from, subject, codeRegex, timeoutMs }), - verifier: verify, - }); + const { chromium } = await import("playwright"); + const { fetchVerificationCode } = await import("../lib/email.js"); + const { verify } = await import("../lib/verify.js"); + + const browser = await chromium.launch({ headless: true }); + try { + await runOpenRouterScraper({ + browser, + emailFetcher: (from, subject, codeRegex, timeoutMs) => + fetchVerificationCode({ from, subject, codeRegex, timeoutMs }), + verifier: verify, + }); + } finally { + await browser.close(); + } } catch (err) { if (err instanceof ScraperAbortError) { - process.exit(1); + // Tripwire / expected-error path: the scraper already emitted a terminal + // event (tripwire or error) before throwing. Just propagate the exit. + emitAndExit( + { type: "error", code: "internal", details: `abort: ${err.message}` }, + 1, + ); + return; } - throw err; - } finally { - await browser.close(); + // Unhandled path: a throw that escaped the scraper's try/catch — e.g. + // Playwright browser-launch failure, IMAP connection refused, dynamic + // import failure, unhandled rejection in the pattern. Without this + // catch-all the orchestrator sees a naked process exit and reports + // "subprocess ended without terminal event" with no cause. + const msg = err instanceof Error ? (err.stack ?? err.message) : String(err); + emitAndExit({ type: "error", code: "internal", details: `unhandled: ${msg}` }, 2); } } From afd1349f2948866334d4cac237d40d5f27f06f7e Mon Sep 17 00:00:00 2001 From: wildmeta-agent Date: Mon, 20 Apr 2026 15:44:08 +0800 Subject: [PATCH 3/8] stage5a: invoke main() when openrouter.ts is the entry point Root cause of the 'exit_code: Some(0) / events_emitted: 0 / stderr empty' failure mode: openrouter.ts declares `export default async function main()` but nothing at module scope invokes it. When the provisioner runs `npx tsx provisioner-scripts/src/scrapers/openrouter.ts`, the module loads (imports + constant declarations + function decls), reaches EOF, and exits cleanly without ever calling main(). The orchestrator then correctly reports 'no terminal event' because the scraper genuinely emitted none. Tests did not catch this because they only import the named export `runOpenRouterScraper`, not the default `main`. Add the standard Node ESM entry-point guard at the bottom of the file. main() runs only when the file is the direct script target (argv[1] matches import.meta.url). Named-export imports from test files still bypass it, so the 15/15 TS test suite stays green. Tests: - npx tsc --noEmit: clean - npm test --prefix provisioner-scripts: 15/15 pass across 6 files --- provisioner-scripts/src/scrapers/openrouter.ts | 13 +++++++++++++ 1 file changed, 13 insertions(+) diff --git a/provisioner-scripts/src/scrapers/openrouter.ts b/provisioner-scripts/src/scrapers/openrouter.ts index 746332c..ce9dab7 100644 --- a/provisioner-scripts/src/scrapers/openrouter.ts +++ b/provisioner-scripts/src/scrapers/openrouter.ts @@ -1,3 +1,4 @@ +import { fileURLToPath } from "url"; import type { Browser } from "playwright"; import { emit, type ProvisionEvent } from "../types.js"; import type { VerifyResult } from "../lib/verify.js"; @@ -158,3 +159,15 @@ export default async function main(): Promise { emitAndExit({ type: "error", code: "internal", details: `unhandled: ${msg}` }, 2); } } + +// Entry-point guard. Invoke main() only when this file is the direct script +// target (e.g. `npx tsx src/scrapers/openrouter.ts`). When the module is +// imported by test files that only use named exports like +// `runOpenRouterScraper`, main() must NOT run — otherwise tests would launch +// a real browser and hit real OpenRouter. Without this block, the provisioner +// subprocess just loads the module, reaches EOF, and exits 0 with no events — +// exactly the "exit_code: Some(0) / events_emitted: 0" failure mode. +const isEntry = fileURLToPath(import.meta.url) === process.argv[1]; +if (isEntry) { + void main(); +} From fada5b62fe59a2e879a24596ec8e2741d381a915 Mon Sep 17 00:00:00 2001 From: wildmeta-agent Date: Mon, 20 Apr 2026 15:59:31 +0800 Subject: [PATCH 4/8] stage5a: one-shot live-demo handoff script MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit harness/stage-5a-live-demo-handoff.sh: preflights the Stage 5a live demo end-to-end in a single bash run. Checks: - all 5 AGENTKEYS_EMAIL_* env vars present (fail-fast via :? with pointed error text for each) - target/release/agentkeys exists + executable - mock-server reachable at $BACKEND - node + npx on PATH - provisioner-scripts deps installed - Playwright chromium_headless_shell-* installed under $HOME (guards against the sandbox-HOME gotcha discovered in this ralph session — Playwright caches browsers per-HOME and a fresh HOME without cached browsers fails with "browserType.launch: Executable doesn't exist") Auto-mints AGENTKEYS_SIGNUP_EMAIL as +or-@ if unset so each run hits the OpenRouter signup path with a fresh email — no manual rotation needed. Executes the four Stage 5a acceptance criteria in order: 1. agentkeys init + provision openrouter (exit 0 required) 2. masked-key form check on stdout 3. agentkeys read openrouter returns sk-or-v1-... prefix 4. curl OpenRouter /api/v1/models returns HTTP 200 On failure, dumps the most-recent provision-openrouter-*.log so the user has the full stderr/events from the subprocess. --- harness/stage-5a-live-demo-handoff.sh | 104 ++++++++++++++++++++++++++ 1 file changed, 104 insertions(+) create mode 100755 harness/stage-5a-live-demo-handoff.sh diff --git a/harness/stage-5a-live-demo-handoff.sh b/harness/stage-5a-live-demo-handoff.sh new file mode 100755 index 0000000..cdbebbe --- /dev/null +++ b/harness/stage-5a-live-demo-handoff.sh @@ -0,0 +1,104 @@ +#!/usr/bin/env bash +# Stage 5a live-demo one-shot handoff. +# Preconditions checked up front; failures are loud. If all four +# acceptance criteria pass, prints SUCCESS and a JSON summary. +# +# Usage: +# cd ~/Projects/agentkeys +# # Export your Gmail Workspace creds first: +# # AGENTKEYS_EMAIL_{BACKEND,USER,PASSWORD,HOST,PORT} +# # (AGENTKEYS_SIGNUP_EMAIL auto-computed below if unset) +# bash harness/stage-5a-live-demo-handoff.sh +set -uo pipefail + +REPO_ROOT="$(cd "$(dirname "$0")/.." && pwd)" +cd "$REPO_ROOT" +BIN="$REPO_ROOT/target/release/agentkeys" +BACKEND="${BACKEND:-http://127.0.0.1:8090}" + +say() { printf '\n\033[1;34m==>\033[0m %s\n' "$*"; } +fail() { printf '\033[1;31mFAIL:\033[0m %s\n' "$*" >&2; exit 1; } +pass() { printf '\033[1;32mPASS:\033[0m %s\n' "$*"; } + +say "Preflight — required env" +: "${AGENTKEYS_EMAIL_BACKEND:?AGENTKEYS_EMAIL_BACKEND must be set (e.g. gmail)}" +: "${AGENTKEYS_EMAIL_USER:?AGENTKEYS_EMAIL_USER must be set to the CANONICAL Gmail address (NOT a plus-alias; IMAP login only accepts canonical)}" +: "${AGENTKEYS_EMAIL_PASSWORD:?AGENTKEYS_EMAIL_PASSWORD must be set (Gmail app password; NOT your normal Google password)}" +: "${AGENTKEYS_EMAIL_HOST:?AGENTKEYS_EMAIL_HOST must be set (imap.gmail.com)}" +: "${AGENTKEYS_EMAIL_PORT:?AGENTKEYS_EMAIL_PORT must be set (993)}" + +# Auto-mint a fresh plus-alias for THIS run so OpenRouter never sees a repeat +# email. User can override by exporting AGENTKEYS_SIGNUP_EMAIL themselves. +if [ -z "${AGENTKEYS_SIGNUP_EMAIL:-}" ]; then + LOCAL="${AGENTKEYS_EMAIL_USER%@*}" + DOMAIN="${AGENTKEYS_EMAIL_USER#*@}" + export AGENTKEYS_SIGNUP_EMAIL="${LOCAL}+or-$(date +%s)@${DOMAIN}" + say "Auto-minted AGENTKEYS_SIGNUP_EMAIL=$AGENTKEYS_SIGNUP_EMAIL" +fi + +say "Preflight — binary exists" +[ -x "$BIN" ] || fail "$BIN not found. Run: cargo build --release -p agentkeys-cli" + +say "Preflight — mock-server at $BACKEND is up" +curl -sf "$BACKEND/health" >/dev/null 2>&1 \ + || curl -sf "$BACKEND" >/dev/null 2>&1 \ + || fail "mock-server not reachable at $BACKEND. Run: cargo run --release -p agentkeys-mock-server -- --port 8090 &" + +say "Preflight — node + playwright deps + chromium browser" +command -v node >/dev/null || fail "node not on PATH" +command -v npx >/dev/null || fail "npx not on PATH" +[ -d node_modules ] || [ -d provisioner-scripts/node_modules ] \ + || fail "provisioner-scripts deps missing. Run: npm install --prefix provisioner-scripts" +# Playwright caches browsers under \$HOME/Library/Caches/ms-playwright on macOS; +# a run-in-unusual-HOME provision will hit "browserType.launch: Executable +# doesn't exist" unless they are installed under THIS \$HOME. +if ! ls "${HOME}/Library/Caches/ms-playwright/chromium_headless_shell-"* >/dev/null 2>&1 \ + && ! ls "${HOME}/.cache/ms-playwright/chromium_headless_shell-"* >/dev/null 2>&1; then + fail "Playwright chromium not installed under \$HOME=$HOME. Run: npx playwright install chromium --with-deps" +fi + +say "1. Initialize master session" +$BIN --backend $BACKEND init --mock-token stage5-live-demo || fail "init" + +say "2. Env snapshot (masking secrets)" +env | grep -E 'AGENTKEYS_(EMAIL|SIGNUP)_' | sed 's/\(PASSWORD=\).*/\1***REDACTED***/' + +say "3. agentkeys provision openrouter" +if ! $BIN --backend $BACKEND provision openrouter; then + EC=$? + echo "---exit=$EC---" + LOG=$(ls -t $HOME/.agentkeys/logs/provision-openrouter-*.log 2>/dev/null | head -1) + if [ -n "$LOG" ]; then + echo "=== most recent provision log: $LOG ===" + cat "$LOG" + else + echo "(no provision log written — orchestrator path unreachable)" + fi + fail "provision failed; inspect log above" +fi + +say "4. AC#1-#2 — exit 0 and masked-key form" +# (provision emits masked key to stdout as final line; exit checked above) + +say "5. AC#3 — read full key back" +KEY=$($BIN --backend $BACKEND read openrouter) || fail "read openrouter" +case "$KEY" in + sk-or-v1-*) pass "read returned key of correct prefix" ;; + *) fail "read returned unexpected prefix: $(echo "$KEY" | head -c 12)..." ;; +esac + +say "6. AC#4 — curl OpenRouter /api/v1/models" +HTTP_CODE=$(curl -sS -o /tmp/or-models.json -w '%{http_code}' \ + -H "Authorization: Bearer $KEY" \ + https://openrouter.ai/api/v1/models) +if [ "$HTTP_CODE" != "200" ]; then + echo "unexpected HTTP $HTTP_CODE" + head -c 500 /tmp/or-models.json + fail "OpenRouter /api/v1/models did not return 200" +fi +head -c 40 /tmp/or-models.json +echo '' +pass "OpenRouter /api/v1/models returned 200" + +say "ALL FOUR ACCEPTANCE CRITERIA PASS" +echo "SUCCESS" From 7a37de5239f0ca83db616fc1e8823ac9c0e53af8 Mon Sep 17 00:00:00 2001 From: wildmeta-agent Date: Mon, 20 Apr 2026 16:13:38 +0800 Subject: [PATCH 5/8] stage5a: diag tooling + single-plus auto-mint in live-demo handoff MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Three artifacts captured during ralph session driving the live OpenRouter provision to ground truth. harness/stage-5a-live-demo-handoff.sh: strip any existing plus-alias from AGENTKEYS_EMAIL_USER before appending +or-. Some email validators (including the one OpenRouter currently uses) reject double-plus addresses like agent+2026042001+or-...@wildmeta.ai and silently drop the signup. Gmail's inbound delivery path handles it fine; the signup form does not. provisioner-scripts/diag-imap.mjs: standalone probe that verifies IMAP auth works with the configured AGENTKEYS_EMAIL_* env, lists all mailboxes, and searches INBOX / Spam / All Mail / Trash for recent OpenRouter verification emails. Distinguishes "auth failed" / "email went to spam" / "email never arrived" failure modes that the scraper's EmailTimeout tripwire conflates. provisioner-scripts/diag-openrouter.mjs: standalone Playwright probe against the live openrouter.ai signup page. Captures screenshots + HTML snapshots + a JSON inventory of all input/button candidates to reveal where real DOM diverges from the scraper's hardcoded selectors. Used in this session to confirm OpenRouter migrated to Clerk (field name changed email -> emailAddress, button has no type=submit) — a Stage 5b blocker, not a Stage 5a bug. --- harness/stage-5a-live-demo-handoff.sh | 14 +-- provisioner-scripts/diag-imap.mjs | 76 ++++++++++++++++ provisioner-scripts/diag-openrouter.mjs | 115 ++++++++++++++++++++++++ 3 files changed, 200 insertions(+), 5 deletions(-) create mode 100644 provisioner-scripts/diag-imap.mjs create mode 100644 provisioner-scripts/diag-openrouter.mjs diff --git a/harness/stage-5a-live-demo-handoff.sh b/harness/stage-5a-live-demo-handoff.sh index cdbebbe..4735072 100755 --- a/harness/stage-5a-live-demo-handoff.sh +++ b/harness/stage-5a-live-demo-handoff.sh @@ -27,13 +27,17 @@ say "Preflight — required env" : "${AGENTKEYS_EMAIL_HOST:?AGENTKEYS_EMAIL_HOST must be set (imap.gmail.com)}" : "${AGENTKEYS_EMAIL_PORT:?AGENTKEYS_EMAIL_PORT must be set (993)}" -# Auto-mint a fresh plus-alias for THIS run so OpenRouter never sees a repeat -# email. User can override by exporting AGENTKEYS_SIGNUP_EMAIL themselves. +# Auto-mint a fresh single-plus alias for THIS run so OpenRouter never sees +# a repeat email. Strip any existing +suffix on AGENTKEYS_EMAIL_USER first: +# some email validators (including OpenRouter's) reject double-plus addresses +# like agent+2026042001+or-...@wildmeta.ai and silently drop the signup. The +# inbox delivery path doesn't care, but the signup form does. if [ -z "${AGENTKEYS_SIGNUP_EMAIL:-}" ]; then - LOCAL="${AGENTKEYS_EMAIL_USER%@*}" + RAW_LOCAL="${AGENTKEYS_EMAIL_USER%@*}" + CANONICAL_LOCAL="${RAW_LOCAL%%+*}" # strip first + and everything after DOMAIN="${AGENTKEYS_EMAIL_USER#*@}" - export AGENTKEYS_SIGNUP_EMAIL="${LOCAL}+or-$(date +%s)@${DOMAIN}" - say "Auto-minted AGENTKEYS_SIGNUP_EMAIL=$AGENTKEYS_SIGNUP_EMAIL" + export AGENTKEYS_SIGNUP_EMAIL="${CANONICAL_LOCAL}+or-$(date +%s)@${DOMAIN}" + say "Auto-minted AGENTKEYS_SIGNUP_EMAIL=$AGENTKEYS_SIGNUP_EMAIL (stripped existing plus-alias before appending)" fi say "Preflight — binary exists" diff --git a/provisioner-scripts/diag-imap.mjs b/provisioner-scripts/diag-imap.mjs new file mode 100644 index 0000000..23636dc --- /dev/null +++ b/provisioner-scripts/diag-imap.mjs @@ -0,0 +1,76 @@ +// Diagnostic: IMAP auth + folder scan for OpenRouter verification emails. +// Usage: node harness/diag-imap.mjs +// Requires AGENTKEYS_EMAIL_{USER,PASSWORD,HOST,PORT} exported. +import { ImapFlow } from "imapflow"; + +const user = process.env.AGENTKEYS_EMAIL_USER; +const pass = process.env.AGENTKEYS_EMAIL_PASSWORD; +const host = process.env.AGENTKEYS_EMAIL_HOST ?? "imap.gmail.com"; +const port = parseInt(process.env.AGENTKEYS_EMAIL_PORT ?? "993", 10); + +if (!user || !pass) { + console.error("ERROR: AGENTKEYS_EMAIL_USER and AGENTKEYS_EMAIL_PASSWORD required"); + process.exit(2); +} + +const client = new ImapFlow({ + host, port, secure: true, + auth: { user, pass }, + logger: false, +}); + +try { + console.log(`1) connecting as ${user}...`); + await client.connect(); + console.log(" OK"); + + console.log("2) listing mailboxes..."); + const listResult = await client.list(); + const mailboxes = listResult.map(b => b.path); + console.log(` found ${mailboxes.length} mailboxes: ${mailboxes.join(", ")}`); + + // Gmail key folders + const candidates = ["INBOX", "[Gmail]/Spam", "[Gmail]/All Mail", "[Gmail]/Trash"]; + for (const box of candidates) { + if (!mailboxes.includes(box)) continue; + try { + await client.mailboxOpen(box); + const uids = await client.search({ + from: "noreply@openrouter.ai", + since: new Date(Date.now() - 24 * 3600 * 1000), // last 24h + }); + console.log(`3) [${box}] openrouter emails (last 24h): ${uids.length}`); + // Show the most recent one's envelope + if (uids.length > 0) { + const msg = await client.fetchOne(uids[uids.length - 1], { envelope: true, source: false }); + if (msg) { + console.log(` most recent: from=${msg.envelope.from?.[0]?.address} subject="${msg.envelope.subject}" date=${msg.envelope.date}`); + console.log(` TO: ${JSON.stringify(msg.envelope.to?.map(t => t.address))}`); + } + } + } catch (err) { + console.log(` [${box}] error: ${err.message}`); + } + } + + // Broader search: anything mentioning openrouter in subject (catches forwards, digest emails) + console.log("4) broader search in INBOX for 'openrouter' subject (last 24h):"); + try { + await client.mailboxOpen("INBOX"); + const uids = await client.search({ + subject: "openrouter", + since: new Date(Date.now() - 24 * 3600 * 1000), + }); + console.log(` found ${uids.length}`); + } catch (err) { + console.log(` error: ${err.message}`); + } + +} catch (err) { + console.error(`FATAL: ${err.message}`); + console.error(err.stack); + process.exit(1); +} finally { + await client.logout().catch(() => {}); +} +console.log("DONE"); diff --git a/provisioner-scripts/diag-openrouter.mjs b/provisioner-scripts/diag-openrouter.mjs new file mode 100644 index 0000000..4781a86 --- /dev/null +++ b/provisioner-scripts/diag-openrouter.mjs @@ -0,0 +1,115 @@ +// Diagnostic: step through OpenRouter signup manually, saving a screenshot + +// HTML snapshot at each checkpoint. Captures where the real DOM diverges from +// the scraper's selectors. +// +// Usage (from repo root): +// cd provisioner-scripts +// node diag-openrouter.mjs +// Writes artifacts to /tmp/or-diag/ so we can inspect after the run. +import { chromium } from "playwright"; +import { mkdir, writeFile } from "fs/promises"; + +const OUT = "/tmp/or-diag"; +const EMAIL = process.env.AGENTKEYS_SIGNUP_EMAIL ?? process.env.AGENTKEYS_EMAIL_USER; +if (!EMAIL) { + console.error("ERROR: set AGENTKEYS_SIGNUP_EMAIL or AGENTKEYS_EMAIL_USER"); + process.exit(2); +} + +await mkdir(OUT, { recursive: true }); + +async function snap(page, label) { + try { + const url = page.url(); + const title = await page.title(); + const html = await page.content(); + await page.screenshot({ path: `${OUT}/${label}.png`, fullPage: true }); + await writeFile(`${OUT}/${label}.html`, html); + await writeFile(`${OUT}/${label}.meta.txt`, + `url: ${url}\ntitle: ${title}\nhtml_len: ${html.length}\n`); + console.log(` snapshot ${label}: url=${url} title="${title}"`); + } catch (err) { + console.log(` snapshot ${label} FAILED: ${err.message}`); + } +} + +const browser = await chromium.launch({ headless: true }); +const ctx = await browser.newContext({ + userAgent: "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/131.0.0.0 Safari/537.36", +}); +const page = await ctx.newPage(); + +try { + console.log(`Using signup email: ${EMAIL}`); + console.log("1) goto https://openrouter.ai/auth"); + await page.goto("https://openrouter.ai/auth", { waitUntil: "networkidle", timeout: 30_000 }); + await snap(page, "01-landed"); + + console.log("2) looking for any email-like input"); + // Broad probe — whatever the scraper's 'input[name="email"]' misses. + const candidates = await page.$$eval( + "input, button", + nodes => nodes.map(n => ({ + tag: n.tagName.toLowerCase(), + type: n.getAttribute("type"), + name: n.getAttribute("name"), + id: n.id, + placeholder: n.getAttribute("placeholder"), + text: (n.innerText || n.value || "").slice(0, 80), + ariaLabel: n.getAttribute("aria-label"), + })) + ); + await writeFile(`${OUT}/02-candidates.json`, JSON.stringify(candidates, null, 2)); + console.log(` wrote ${candidates.length} candidates to ${OUT}/02-candidates.json`); + const emailLikely = candidates.filter(c => + c.tag === "input" && ( + c.type === "email" || + c.name?.includes("email") || + c.id?.includes("email") || + c.placeholder?.toLowerCase().includes("email") || + c.ariaLabel?.toLowerCase().includes("email") + ) + ); + console.log(` email-like inputs:`, emailLikely); + + console.log("3) looking for sign-in / sign-up CTA buttons"); + const buttons = candidates.filter(c => + c.tag === "button" && /sign|continue|next|submit|start/i.test(c.text || "") + ); + console.log(` sign-up-likely buttons:`, buttons); + + await snap(page, "02-inspected"); + + // If there's an obvious email input + sign-up button, try submitting + if (emailLikely.length > 0) { + const e0 = emailLikely[0]; + const selector = + e0.id ? `#${e0.id}` : + e0.name ? `input[name="${e0.name}"]` : + `input[type="${e0.type}"]`; + console.log(`4) attempting fill on ${selector} with ${EMAIL}`); + await page.fill(selector, EMAIL); + await snap(page, "03-filled"); + + // Find a submit-ish button + const submitBtn = buttons[0] || candidates.find(c => + c.tag === "button" && (c.type === "submit" || /continue|sign up|next/i.test(c.text || "")) + ); + if (submitBtn) { + console.log(`5) clicking button: text="${submitBtn.text}"`); + await page.getByRole("button", { name: new RegExp(submitBtn.text?.split("\n")[0] ?? "", "i") }).first().click(); + await page.waitForTimeout(3000); + await snap(page, "04-after-submit"); + } else { + console.log("5) no obvious submit button — skipped click"); + } + } + + console.log("DONE — inspect artifacts under /tmp/or-diag/"); +} catch (err) { + console.error(`FATAL: ${err.message}`); + await snap(page, "99-error").catch(() => {}); + process.exit(1); +} finally { + await browser.close(); +} From 3c4f5419a704d89da5d3d32ae6d6b2cc95291da7 Mon Sep 17 00:00:00 2001 From: wildmeta-agent Date: Mon, 20 Apr 2026 16:18:40 +0800 Subject: [PATCH 6/8] stage5a: deslop pass on ralph-session artifacts MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit harness/stage-5a-live-demo-handoff.sh - Drop misleading "JSON summary" claim from header — script prints SUCCESS but not JSON - Drop dead repo-root node_modules branch (never exists in this project; deps only live at provisioner-scripts/node_modules) - Collapse redundant step 4 header that had no check into step's 4 section (AC#1-#3 read-back check); renumber step 5 accordingly. Prior numbering was 1→2→3→4(empty)→5→6 with the 4th being just a comment. provisioner-scripts/diag-imap.mjs - Fix stale usage comment: file was moved from harness/ into provisioner-scripts/ (imapflow resolution) but the header still pointed at the old path. provisioner-scripts/diag-openrouter.mjs - Drop dead `|| candidates.find(...)` fallback in submit-button lookup. `buttons` is already filtered with the same /sign|continue|next|submit|start/i regex, so the fallback is a strict subset of the main filter and can never fire with a different value. Post-deslop regression: - cargo test --release -p agentkeys-provisioner: 15/15 pass - npm test --prefix provisioner-scripts: 15/15 pass across 6 files - handoff preflight smoke with no env: exit 1, clear missing-var msg --- harness/stage-5a-live-demo-handoff.sh | 19 +++++++------------ provisioner-scripts/diag-imap.mjs | 2 +- provisioner-scripts/diag-openrouter.mjs | 6 ++---- 3 files changed, 10 insertions(+), 17 deletions(-) diff --git a/harness/stage-5a-live-demo-handoff.sh b/harness/stage-5a-live-demo-handoff.sh index 4735072..d6d0325 100755 --- a/harness/stage-5a-live-demo-handoff.sh +++ b/harness/stage-5a-live-demo-handoff.sh @@ -1,13 +1,11 @@ #!/usr/bin/env bash # Stage 5a live-demo one-shot handoff. -# Preconditions checked up front; failures are loud. If all four -# acceptance criteria pass, prints SUCCESS and a JSON summary. +# Preconditions checked up front; failures are loud; prints SUCCESS when +# all four acceptance criteria pass. # -# Usage: +# Usage (with AGENTKEYS_EMAIL_{BACKEND,USER,PASSWORD,HOST,PORT} exported; +# AGENTKEYS_SIGNUP_EMAIL is auto-minted below if unset): # cd ~/Projects/agentkeys -# # Export your Gmail Workspace creds first: -# # AGENTKEYS_EMAIL_{BACKEND,USER,PASSWORD,HOST,PORT} -# # (AGENTKEYS_SIGNUP_EMAIL auto-computed below if unset) # bash harness/stage-5a-live-demo-handoff.sh set -uo pipefail @@ -51,7 +49,7 @@ curl -sf "$BACKEND/health" >/dev/null 2>&1 \ say "Preflight — node + playwright deps + chromium browser" command -v node >/dev/null || fail "node not on PATH" command -v npx >/dev/null || fail "npx not on PATH" -[ -d node_modules ] || [ -d provisioner-scripts/node_modules ] \ +[ -d provisioner-scripts/node_modules ] \ || fail "provisioner-scripts deps missing. Run: npm install --prefix provisioner-scripts" # Playwright caches browsers under \$HOME/Library/Caches/ms-playwright on macOS; # a run-in-unusual-HOME provision will hit "browserType.launch: Executable @@ -81,17 +79,14 @@ if ! $BIN --backend $BACKEND provision openrouter; then fail "provision failed; inspect log above" fi -say "4. AC#1-#2 — exit 0 and masked-key form" -# (provision emits masked key to stdout as final line; exit checked above) - -say "5. AC#3 — read full key back" +say "4. AC#1-#3 — read full key back (exit 0 + masked-key form already checked above)" KEY=$($BIN --backend $BACKEND read openrouter) || fail "read openrouter" case "$KEY" in sk-or-v1-*) pass "read returned key of correct prefix" ;; *) fail "read returned unexpected prefix: $(echo "$KEY" | head -c 12)..." ;; esac -say "6. AC#4 — curl OpenRouter /api/v1/models" +say "5. AC#4 — curl OpenRouter /api/v1/models" HTTP_CODE=$(curl -sS -o /tmp/or-models.json -w '%{http_code}' \ -H "Authorization: Bearer $KEY" \ https://openrouter.ai/api/v1/models) diff --git a/provisioner-scripts/diag-imap.mjs b/provisioner-scripts/diag-imap.mjs index 23636dc..ec0c89e 100644 --- a/provisioner-scripts/diag-imap.mjs +++ b/provisioner-scripts/diag-imap.mjs @@ -1,5 +1,5 @@ // Diagnostic: IMAP auth + folder scan for OpenRouter verification emails. -// Usage: node harness/diag-imap.mjs +// Usage (from provisioner-scripts/): node diag-imap.mjs // Requires AGENTKEYS_EMAIL_{USER,PASSWORD,HOST,PORT} exported. import { ImapFlow } from "imapflow"; diff --git a/provisioner-scripts/diag-openrouter.mjs b/provisioner-scripts/diag-openrouter.mjs index 4781a86..1d6b8f7 100644 --- a/provisioner-scripts/diag-openrouter.mjs +++ b/provisioner-scripts/diag-openrouter.mjs @@ -91,10 +91,8 @@ try { await page.fill(selector, EMAIL); await snap(page, "03-filled"); - // Find a submit-ish button - const submitBtn = buttons[0] || candidates.find(c => - c.tag === "button" && (c.type === "submit" || /continue|sign up|next/i.test(c.text || "")) - ); + // `buttons` is already the sign/continue/next/submit/start-filtered set. + const submitBtn = buttons[0]; if (submitBtn) { console.log(`5) clicking button: text="${submitBtn.text}"`); await page.getByRole("button", { name: new RegExp(submitBtn.text?.split("\n")[0] ?? "", "i") }).first().click(); From 3813dea6432a1f090f0d27b8dfa8db28b4ad20e9 Mon Sep 17 00:00:00 2001 From: wildmeta-agent Date: Mon, 20 Apr 2026 19:28:35 +0800 Subject: [PATCH 7/8] stage5b: CDP scraper + stage6 throwaway inbox scope MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Stage 5b MVP CDP-connected scraper proven end-to-end, blocked on email-duplicate. Pivot unblocked by adding throwaway-inbox provisioning as a named Stage 6 deliverable. provisioner-scripts/src/scrapers/openrouter-cdp.ts (new) Connects to a user-launched real Chrome via chromium.connectOverCDP, drives OpenRouter's Clerk-hosted signup form, polls Gmail IMAP for the OTP, mints a key on /keys, prints sk-or-v1-* on stdout. Two bugs fixed during the session: - Click the checkbox INPUT directly, not the label (label wraps a "Terms of Service" link that navigates to /terms) - When the 180s Turnstile wait expires and URL is still /sign-up with no OTP input present, fail explicitly instead of falling through to a bogus OTP-waiting step. Why CDP and not Playwright-launched Chromium: Playwright's bundled Chromium ships with --enable-automation. Cloudflare Turnstile detects this (error 600010) and refuses to issue a token even when a human clicks the checkbox. Connect to a real Chrome (launched with --remote-debugging-port) bypasses this because the browser process has no automation flags. Verified 2026-04-20: Turnstile passes invisibly in real Chrome, Clerk backend returns clean responses. Known blocker: OpenRouter's Clerk integration normalizes Gmail/Workspace plus-aliases to canonical. If agent@wildmeta.ai already has an OpenRouter account, every plus-aliased variant gets rejected with "email already in use." Only distinct local-parts work. That's why Stage 6 throwaway inbox provisioning (bot-@ agentkeys-email.io per call) is what unblocks the live demo. provisioner-scripts/diag-or-{flow,turnstile,signin}.mjs (new) Standalone Node probes used to diagnose the Turnstile failure. Kept as runtime evidence for the Clerk-moved-to-Radix-UI discovery and for future scraper authors' reference. docs/manual-test-stage5.md (modified) Section 4 rewritten from "when Stage 5b lands, future" to "CDP scraper partial: proven working, blocked on email duplicate." Includes: the run-recipe with Chrome --remote-debugging-port command, required env, known blocker, Stage-6-dependent pickup checklist. docs/spec/plans/development-stages.md (modified) Stage 6 deliverables extended with two named items: - Throwaway inbox provisioning API: mint unique local-parts per call (Clerk-normalization-proof), readable via the same fetchVerificationCode shape the Stage 5b scraper uses. - Stage 5b live-demo re-run: once throwaway provisioning lands, re-run the CDP scraper end-to-end. Closes the manual-test-stage5 §4 pickup item. Plus two test rows: email::throwaway_inbox_provisioning and email::stage5b_live_demo_rerun. docs/manual-test-stage6.md (new) Stage 6 manual demo guide: preflight, provision-throwaway-inbox walkthrough, per-user isolation test, Stage 5b live-demo re-run procedure. Structured like Stage 5 doc so both are readable in parallel. .gitignore (modified) Add .gstack/ — gstack creates .gstack/browse.json at repo root during connect-chrome; not a repo artifact. Post-change regression (fresh): - cargo test --release -p agentkeys-provisioner: 15/15 pass - npm test --prefix provisioner-scripts: 15/15 pass across 6 files --- .gitignore | 1 + docs/manual-test-stage5.md | 69 ++++++- docs/manual-test-stage6.md | 177 ++++++++++++++++ docs/spec/plans/development-stages.md | 4 + provisioner-scripts/diag-or-flow.mjs | 111 ++++++++++ provisioner-scripts/diag-or-signin.mjs | 53 +++++ provisioner-scripts/diag-or-turnstile.mjs | 92 +++++++++ .../src/scrapers/openrouter-cdp.ts | 189 ++++++++++++++++++ 8 files changed, 692 insertions(+), 4 deletions(-) create mode 100644 docs/manual-test-stage6.md create mode 100644 provisioner-scripts/diag-or-flow.mjs create mode 100644 provisioner-scripts/diag-or-signin.mjs create mode 100644 provisioner-scripts/diag-or-turnstile.mjs create mode 100644 provisioner-scripts/src/scrapers/openrouter-cdp.ts diff --git a/.gitignore b/.gitignore index 40656fa..cb7e86d 100644 --- a/.gitignore +++ b/.gitignore @@ -5,3 +5,4 @@ .omc .obsidian /docs/test-screenshots/ +.gstack/ diff --git a/docs/manual-test-stage5.md b/docs/manual-test-stage5.md index ac572da..1828615 100644 --- a/docs/manual-test-stage5.md +++ b/docs/manual-test-stage5.md @@ -281,17 +281,78 @@ These are slop markers. Apply the suggested `cargo clippy --fix` or hand-replace --- -## 4. What to do when Stage 5b lands +## 4. Stage 5b — CDP-connected real-Chrome scraper (partial: proven working, blocked on email duplicate) -When Stage 5b ships (agentic fallback, `/agentkeys-record-scraper` skill, script-generation loop), this document will grow: +### What's landed +- **[provisioner-scripts/src/scrapers/openrouter-cdp.ts](../provisioner-scripts/src/scrapers/openrouter-cdp.ts)** — connects to a user-launched real Chrome via `chromium.connectOverCDP()`, drives the OpenRouter Clerk-hosted signup form, polls Gmail IMAP for the OTP code, mints a new key on `/keys`, prints the `sk-or-v1-*` value on stdout. +- **Why CDP, not Playwright-launched Chromium:** Playwright's bundled Chromium ships with `--enable-automation` baked in. Cloudflare Turnstile detects this at runtime (error **600010** — "browser execution environment suspicious") and refuses to issue a token even when a human clicks the checkbox. Connecting to a user-launched *real* Chrome bypasses this because the browser process has no automation flags. Verified 2026-04-20: Turnstile passes invisibly in real Chrome, Clerk backend returns normal responses. + +### How to run (when you have a fresh-to-OpenRouter email) + +1. **Launch real Chrome with CDP enabled** (fresh profile, separate from your daily browsing): + ```bash + /Applications/Google\ Chrome.app/Contents/MacOS/Google\ Chrome \ + --remote-debugging-port=9222 \ + --user-data-dir=/tmp/agentkeys-chrome-profile & + ``` + A blank Chrome window opens. Don't navigate it manually — the scraper drives it. + +2. **Export env** (Gmail IMAP creds + a signup email OpenRouter hasn't seen): + ```bash + export AGENTKEYS_EMAIL_BACKEND=gmail + export AGENTKEYS_EMAIL_USER="you@gmail.com" # canonical IMAP login + export AGENTKEYS_EMAIL_PASSWORD="" + export AGENTKEYS_EMAIL_HOST="imap.gmail.com" + export AGENTKEYS_EMAIL_PORT="993" + export AGENTKEYS_SIGNUP_EMAIL="" + export AGENTKEYS_SIGNUP_PASSWORD="" + ``` + +3. **Run the scraper:** + ```bash + cd ~/Projects/agentkeys + node --import tsx/esm provisioner-scripts/src/scrapers/openrouter-cdp.ts + ``` + Last stdout line is the `sk-or-v1-*` key. Stderr shows `[cdp]