Add init --agent-instructions for managed agent-facing snippets #54
Merged
pengfei-threemoonslab merged 3 commits into main on May 8, 2026
Turns the snippet copy in docs/target-repo-agent-snippets.md into a CLI output: `agents-shipgate init --agent-instructions=all|<csv>|none` renders or writes AGENTS.md, CLAUDE.md, .cursor/rules/agents-shipgate.mdc, and .github/pull_request_template.md via managed `<!-- agents-shipgate:start -->` markers.

Idempotent (safe to rerun, byte-equal on no-op) and advisory only: no strict CI or baseline scaffolding. Composes with --write and --ci as a third orthogonal action; per-target status enum + structured agent-mode error on skip.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
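For readers unfamiliar with managed-marker blocks, here is a minimal sketch of how an idempotent upsert can work. It is not the code from this PR: the function name `upsert_managed_block` is hypothetical, and the end marker is an assumption (only the start marker appears in the commit message).

```python
START = "<!-- agents-shipgate:start -->"
END = "<!-- agents-shipgate:end -->"  # assumed end marker; not confirmed by the PR text


def upsert_managed_block(existing: str, body: str) -> str:
    """Insert or replace the managed block, leaving surrounding text untouched."""
    block = f"{START}\n{body.strip()}\n{END}"
    start = existing.find(START)
    end = existing.find(END, start + len(START)) if start != -1 else -1
    if start != -1 and end != -1:
        # Replace only the span between the markers (inclusive).
        return existing[:start] + block + existing[end + len(END):]
    if not existing.strip():
        # Empty or whitespace-only file: the block becomes the whole file.
        return block + "\n"
    # No complete managed block yet: append one, separated by a blank line.
    return existing.rstrip("\n") + "\n\n" + block + "\n"
```

Running the function a second time with the same body returns the same string, which is the "byte-equal on no-op" property the commit message describes.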
Address review on #54:
- Refuse symlinks at managed-block and cursor target paths (new `skipped_symlink` status). `apply.py` no longer calls `.resolve()` on the joined target path, so a symlinked AGENTS.md cannot route writes outside the workspace.
- Collapse PR template casings via `Path.samefile()` so case-insensitive filesystems (macOS APFS, Windows NTFS) no longer report `skipped_ambiguous` when only one file actually exists on disk.
- Make `init --write --agent-instructions=...` idempotent at the process level: when the flag is set and shipgate.yaml already exists, the manifest skip is treated as informational (exit 0, no agent-mode error emit). Plain `init --write` still exits 2 for backwards compatibility.
- Pin `COLUMNS=200` for the help-text test so Rich does not truncate `--agent-instructions` to `--agent-in…` on narrow CI terminals.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
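The casing collapse in the second bullet can be illustrated with a short sketch. The helper name `distinct_pr_templates` is hypothetical and the real logic in the PR may differ; the only confirmed detail is that `Path.samefile()` is what distinguishes two real files from two spellings of one file.

```python
from pathlib import Path


def distinct_pr_templates(github_dir: Path) -> list[Path]:
    """Return PR-template candidates that are genuinely different files on disk."""
    candidates = [
        github_dir / "pull_request_template.md",
        github_dir / "PULL_REQUEST_TEMPLATE.md",
    ]
    distinct: list[Path] = []
    for path in candidates:
        if not path.exists():
            continue
        # samefile() compares device and inode, so two spellings of one file
        # on a case-insensitive filesystem collapse into a single entry.
        if any(path.samefile(seen) for seen in distinct):
            continue
        distinct.append(path)
    return distinct
```

With this shape, an ambiguity status would only be warranted when the returned list has more than one entry.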
Address review on #54 (head 05f5696):
- Parent-directory symlink escape: walk every existing component between the workspace root and the target file. ``.github -> /tmp/outside`` no longer routes ``pull_request_template.md`` writes outside the workspace; same protection covers ``.cursor`` and any deeper intermediate. New helper ``_first_symlink_in_chain`` is used by both managed-block and cursor target paths.
- Help-text test was flaking on GitHub CI (Rich truncates option names on narrow terminals even with ``COLUMNS=200``). Replace the rendered-string assertion with Click param introspection: confirm ``agent_instructions`` is a registered option on ``init`` and that its help text mentions ``advisory``. No terminal rendering involved.
- ``--agent-instructions=none`` parses to an empty target list (no instruction action runs), so the manifest-skip idempotency accommodation should not apply. Restore exit 2 when shipgate.yaml already exists in this case, matching plain ``init --write``.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
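A rough sketch of the component walk described in the first bullet follows. The commit names the helper ``_first_symlink_in_chain`` but does not show it, so the signature and body below are assumptions about its general shape, not the merged code.

```python
from pathlib import Path
from typing import Optional


def first_symlink_in_chain(root: Path, target: Path) -> Optional[Path]:
    """Return the first symlinked component between root and target, if any.

    `target` must be lexically inside `root`; neither path is resolved, since
    resolving would follow the very links this check is meant to catch.
    """
    relative = target.relative_to(root)  # raises ValueError if target is outside root
    current = root
    for part in relative.parts:
        current = current / part
        if current.is_symlink():
            return current
        if not current.exists():
            break  # nothing deeper exists yet, so nothing deeper can be a symlink
    return None
```

A caller would refuse the write (for example with the `skipped_symlink` status) whenever the result is not `None`.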
pengfei-threemoonslab added a commit that referenced this pull request on May 8, 2026
Picks up: contract CLI command (#52), v0.10.0 release tag (#53), init --agent-instructions (#54), HITL evidence provenance (#50), agent autofix-boundary docs (#55), packet schema v0.3.

Conflict resolution: kept v0.11 report-schema references on top of main's v0.10.0 release / packet-schema v0.3 / contract-command additions. AGENTS.md and SKILL.md adopt main's centralized "contract lives in agent-contract-current.md" pattern; the v0.11 provenance line lives there now. test_public_surface_contract.py adopts main's derive-from-model approach for the current schema constants and just adds v0.10 to the legacy-pattern list.

Also fixes a SARIF regression flagged in review: ``_location()`` chose the structured branch whenever ``source.path`` was set, so a finding with ``path="foo.py"`` and legacy ``location="foo.py:10"`` emitted no ``region``. Hybrid / plugin findings now fall back to ``_split_location(source.location or source.ref)`` when ``start_line`` is absent. Adds a regression test.

After merge: 805 passed (+3 skipped), ruff clean, ``agents-shipgate contract --json`` reports ``report_schema_version: "0.11"``.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
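To make the SARIF fallback concrete, here is a small sketch of the behavior the commit describes. The `Source` dataclass and both function bodies are hypothetical stand-ins; only the field names (`path`, `start_line`, `location`, `ref`) and the fallback order come from the commit message.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Source:  # hypothetical stand-in for the finding's source model
    path: Optional[str] = None
    start_line: Optional[int] = None
    location: Optional[str] = None
    ref: Optional[str] = None


def split_location(value: Optional[str]) -> tuple[Optional[str], Optional[int]]:
    """Split a legacy 'path:line' string; the line is None when absent or non-numeric."""
    if not value:
        return None, None
    path, sep, tail = value.rpartition(":")
    if sep and tail.isdigit():
        return path, int(tail)
    return value, None


def sarif_physical_location(source: Source) -> dict:
    """Build a SARIF physicalLocation, using the legacy string when start_line is absent."""
    legacy_path, legacy_line = split_location(source.location or source.ref)
    path = source.path or legacy_path
    line = source.start_line if source.start_line is not None else legacy_line
    physical: dict = {"artifactLocation": {"uri": path}}
    if line is not None:
        # Only emit a region when a line number is actually known.
        physical["region"] = {"startLine": line}
    return physical
```

The hybrid case from the commit (`path="foo.py"` with `location="foo.py:10"`) now yields a `region` with `startLine` 10 instead of dropping it.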
pengfei-threemoonslab added a commit that referenced this pull request on May 9, 2026
CI on PR #56 caught the stale llms-full.txt: the merge of origin/main brought in three commits (#54 init --agent-instructions, #55 autofix-boundary docs, #51 v0.11 schema with source provenance) that touched AGENTS.md and the report-schema literals, so the committed llms-full.txt no longer matched what `scripts/build-llms-full.py` produces from the post-merge sources. Re-ran the build script and committed the regenerated file.

Now:
- 859 tests pass + 3 skipped (was 748 on this branch; +111 from main's new tests for source provenance, agent-instructions renderers, managed blocks, etc.)
- ruff clean
- llms-full.txt is byte-identical to `render(REPO_ROOT)`, so test_llms_full_is_up_to_date passes

No code changes: purely the rebuilt artifact.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
pengfei-threemoonslab added a commit that referenced this pull request on May 9, 2026
* Add trigger catalog, zero-install detector, and adoption benchmark

Shipping the first two weeks of the agent-adoption strategy in one PR. The unifying goal is to make Agents Shipgate self-discoverable to coding agents (Claude Code, Codex, Cursor) that haven't already adopted it: give them machine-readable trigger rules, a zero-install path to verify relevance, and a public benchmark that measures whether adoption-improving edits are working.

Trigger surface:
- docs/triggers.json: machine-readable mirror of the AGENTS.md trigger table. Every rule carries an `agents_md_row` field that the contract test pins against the AGENTS.md prose, so the two cannot drift. Stable for 0.x.
- src/agents_shipgate/triggers.py: canonical loader and evaluator with predicate vocabulary documented in triggers.json itself. `python -m agents_shipgate.triggers shipgate.yaml prompts/refund.md` returns a run/skip verdict plus matched rules.
- prompts/decide-shipgate-relevance.md: relevance-decision prompt that walks an agent through fetching triggers.json and applying it to a PR diff before any other Shipgate prompt fires.
- AGENTS.md, llms.txt, .well-known/agents-shipgate.json, pyproject.toml, README: cross-link the new surface so every entry surface points at every other.

Long-form reference:
- llms-full.txt: concatenated AGENTS.md + recipes + contract + checks + concepts + autofix-policy in one document for AI search engines and coding agents that prefer one fetch.
- scripts/build-llms-full.py: deterministic generator; the contract test fails if a source file changes without regenerating.

Zero-install path:
- tools/shipgate-detect.py: stdlib-only Python detector that replicates the structural verdict of `agents-shipgate detect --json` without requiring a local install. Pinned to the canonical CLI by tests/test_zero_install_detector.py across all 8 sample fixtures (same is_agent_project, same fired frameworks, same suggested sources).
- docs/zero-install.md: three zero-install paths (single-file detector, uvx, GitHub Action) with a decision matrix.
- docs/quickstart.md now leads with the zero-install detector before the install section.

Benchmark scaffolding:
- benchmark/: frozen archetypes, four prompts (none mention Shipgate by name), five setup variants, tester-facing runbook, results CSV schema, and an upstream-PR tracker. The headline metric is the delta between `00-no-hints` and `10-agents-md` on the discovery rubric. Manual W2 baseline run is the next step.

Drift guards:
- tests/test_public_surface_contract.py extends the existing drift suite with checks for triggers.json/AGENTS.md row parity, llms-full.txt freshness via the build script's render(), and a parametrized "every prompt is mirrored to skills/" assertion.
- tests/test_zero_install_detector.py adds 24 parity tests pinning the zero-install script to the canonical CLI on every sample.

Verification: 733 pytest passes (709 W1 baseline + 24 new); ruff clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Fix trigger evaluator precedence and decorator-rule reachability

Addresses three review findings on the trigger catalog landed in the parent commit:

P1 (decorator rules unreachable from the prompt): prompts/decide-shipgate-relevance.md piped only `git diff --name-only` into the evaluator, so `diff_contains` rules (TRIGGER-FUNCTION-TOOL-DECORATOR, TRIGGER-FRAMEWORK-VERSION-BUMP, TRIGGER-SHIPGATE-CI-WORKFLOW Action match) silently never fired: agents following the prompt would skip a PR that only adds `@function_tool`. Add `--git-diff [REVSPEC]` to `python -m agents_shipgate.triggers`, which shells out to `git diff --name-only [REVSPEC]` AND `git diff [REVSPEC]` to populate paths and diff body in one call. Update the prompt's Option B to use it.
P2 (run_shipgate silently overrode skip_shipgate): A README-only diff that incidentally mentioned `@tool` (or quoted the Action URL) returned `run_shipgate: true` because the evaluator treated `has_run` as winning over `has_skip`, making the docs-only negative rule effectively dead. Reorder the precedence: stop_conditions → force_run → skip → run → dry_run. `skip_shipgate` now beats `run_shipgate`. To preserve the "manifest present means always run" semantic, promote TRIGGER-EXISTING-MANIFEST-PRESENT to a new action `force_run` that overrides skip: an opted-in repo's docs-only PR still scans because the cost is low and tool-adjacent prose can matter.

P3 (dry_run rules silently dead): When only TRIGGER-FRAMEWORK-VERSION-BUMP fired, the evaluator reported "No rules matched"; the prompt translated that to "do not propose Shipgate", making the dry_run rule non-actionable despite being in the catalog. Add a `dry_run_recommended` field to the evaluator output. When only dry_run rules match, `run_shipgate` stays false but the field is true and the rationale names the matched rules. The prompt now routes this state to "propose a non-mutating scan; do not propose init --write".

triggers.json gains an `actions` block describing each action's semantics and an `action_precedence` array documenting the high-to-low order. Both are reference material for an agent reading the catalog directly.

Tests:
- New: skip beats run on docs-only with @tool in prose
- New: force_run beats skip when manifest present
- New: dry_run sets dry_run_recommended; rule appears in matched_rules
- New: pin TRIGGER-EXISTING-MANIFEST-PRESENT.action == "force_run"
- Updated: _VALID_TRIGGER_ACTIONS includes "force_run"

Verification: 737 pytest passes (was 733; +4 new); ruff clean.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Cover tests in docs-only skip; fix bare --git-diff; update prompt verification

Three more review findings from PR #56:

P2a (tests-only diff with @tool slips through): AGENTS.md says "Pure read-only doc/test changes with no manifest impact" should skip, but TRIGGER-DOCS-ONLY-NEGATIVE only matched `**/*.md`. A tests-only diff that incidentally mentions `@function_tool` (in a fixture or assertion) returned `run_shipgate: true` because the broad decorator rule fired and no skip rule counterbalanced it. Extended the `every_file_matches` predicate to accept either a string (existing form, back-compat) or a list (any-of within the predicate). Updated the rule's pattern list to include `tests/**`, `test/**`, `**/tests/**`, `**/test/**`, `**/test_*.py`, `**/*_test.py`, `**/conftest.py`. Mixed docs+tests PRs now skip; code+tests mixes still trigger normally.

P2b (bare --git-diff misses staged and untracked): The prompt advertised bare `--git-diff` for "uncommitted changes" but the implementation ran plain `git diff`, which only shows unstaged changes. A staged `@function_tool` addition silently returned no matched rules. Bare flag now runs `git diff HEAD` for both paths and content (covering BOTH staged and unstaged tracked changes), then appends untracked file *paths* via `git ls-files --others --exclude-standard`. Untracked file *content* is not captured (reading arbitrary unstaged files into memory is risky); the prompt documents the limitation explicitly.

P3 (prompt verification contradicts the dry_run path): prompts/decide-shipgate-relevance.md's verification checklist named the output keys as `run_shipgate, matched_rules, rationale` (no `dry_run_recommended`) and asserted "no Shipgate command appears" whenever `run_shipgate: false`. That contradicts the dry_run path added in the previous commit, which explicitly proposes a non-mutating scan when `dry_run_recommended: true`.
Updated the verification checklist to:
- List all four canonical output keys including `dry_run_recommended`
- Allow exactly one Shipgate command (a non-mutating scan) when dry_run_recommended is true and run_shipgate is false
- Forbid Shipgate commands only when both are false

Added two NOT-to-do bullets: never propose `init --write` on a dry_run-only match; bare --git-diff doesn't surface untracked file content. Mirrored to the skill copy.

Tests added (10 cases):
- Parametrized: 6 tests-only path patterns with @function_tool in diff all return run_shipgate=false
- Code+test mix with @function_tool returns run_shipgate=true (negative case for the every_file_matches expansion)
- _eval_predicate accepts every_file_matches as both string and list
- _git_diff_context with no revspec captures staged-only changes
- _git_diff_context with no revspec surfaces untracked file paths (and confirms untracked content is NOT in diff_text)

Verification: 747 pytest passes (was 737; +10 new); ruff clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Clarify zero-install detector is a structural subset, not drop-in

P3 review: docs/zero-install.md and llms.txt implied the zero-install script has the same JSON shape as `agents-shipgate detect --json`. It doesn't: the canonical CLI emits `diagnostics[]` and `next_actions[]` arrays (the diagnostic engine), which are intentionally out of scope for the stdlib-only zero-install path. The script emits a structural subset of `DetectResult` plus `script_version`.

- llms.txt: "same JSON shape" → "same structural verdict … emits the canonical `DetectResult` fields plus `script_version`, but NOT the CLI's `diagnostics` or `next_actions` arrays."
- docs/zero-install.md: rephrased "Output mirrors `agents-shipgate detect --json` (plus a `script_version` field)" to "structural subset … not a drop-in replacement."
  Closing line now reads "structural verdict parity" and explicitly notes "field-by-field byte parity is not pinned and not promised."
- tools/shipgate-detect.py docstring: lists `diagnostics[]` and `next_actions[]` in the "Intentional simplifications" section alongside the existing items (no git fast path, descriptive evidence strings, ±0.5 score variance).

Test: pin the absence of `diagnostics` and `next_actions` keys in the script output so a future change that adds them is forced to update the wording surfaces in the same PR.

Verification: 748 pytest passes (was 747; +1 new); ruff clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Regenerate llms-full.txt after merging main (v0.11 schema bump)

CI on PR #56 caught the stale llms-full.txt: the merge of origin/main brought in three commits (#54 init --agent-instructions, #55 autofix-boundary docs, #51 v0.11 schema with source provenance) that touched AGENTS.md and the report-schema literals, so the committed llms-full.txt no longer matched what `scripts/build-llms-full.py` produces from the post-merge sources. Re-ran the build script and committed the regenerated file.

Now:
- 859 tests pass + 3 skipped (was 748 on this branch; +111 from main's new tests for source provenance, agent-instructions renderers, managed blocks, etc.)
- ruff clean
- llms-full.txt is byte-identical to `render(REPO_ROOT)`, so test_llms_full_is_up_to_date passes
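The action precedence from the "Fix trigger evaluator precedence" commit in the squash above (stop_conditions → force_run → skip → run → dry_run) can be sketched as follows. This is an illustration only: the action strings, the `decide` function, and the `winning_action` key are assumptions, and the real evaluator in `triggers.py` also returns `matched_rules` and `rationale`.

```python
from typing import Iterable

# High-to-low precedence as listed in the commit message. "stop" stands in for
# the stop_conditions tier; the exact action strings are assumptions.
ACTION_PRECEDENCE = ("stop", "force_run", "skip_shipgate", "run_shipgate", "dry_run")


def decide(matched_actions: Iterable[str]) -> dict:
    """Collapse the actions of all matched rules into one verdict."""
    matched = set(matched_actions)
    # The first action in precedence order that any rule matched wins.
    winner = next((a for a in ACTION_PRECEDENCE if a in matched), None)
    return {
        "winning_action": winner,  # hypothetical key, for illustration
        "run_shipgate": winner in ("force_run", "run_shipgate"),
        "dry_run_recommended": winner == "dry_run",
    }
```

Under this ordering a docs-only skip beats an incidental run match, a present manifest (force_run) beats the skip, and a lone dry_run match leaves `run_shipgate` false while setting `dry_run_recommended`.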
Summary
- Turns the snippet copy in `docs/target-repo-agent-snippets.md` into a first-class CLI output. `agents-shipgate init --agent-instructions=all` (or an `agents-md,claude-md,cursor,pr-template` subset; or `none`) renders the snippets to stdout; combined with `--write`, the CLI plants them in the target repo via managed `<!-- agents-shipgate:start v=1 -->` markers.
- Idempotent (safe to rerun, byte-equal on no-op); a skip emits a structured `next_action` JSON line on stderr under `AGENTS_SHIPGATE_AGENT_MODE=1`.
- Composes with `--write` and `--ci` (third action, exit code = `max(manifest_exit, agent_instructions_exit)`); JSON output gains a top-level `agent_instructions` key. Strict CI and baselines remain opt-in human decisions: a snapshot test asserts `ci_mode: strict` does not appear in any rendered target outside the shared CI-pointer paragraph.

Why

Adopters previously had to read `docs/target-repo-agent-snippets.md` and paste copy into their repos by hand. This makes the existing copy executable: one CLI flag plants the right guidance into AGENTS.md, CLAUDE.md, the Cursor rule, and the PR template, so any agent or reviewer working in the target repo discovers Shipgate without leaving the repo.

Notable design points

- `--agent-instructions` alone would fail Typer's argument parsing; `=all` / `=<csv>` / `=none` is unambiguous and explicit in scripts.
- The `v=N` token is the renderer-format version (not the package version) so the cursor `PRIOR_RENDER_SHA256` list stays small.
- PR template: recognizes `.github/pull_request_template.md` and `.github/PULL_REQUEST_TEMPLATE.md`; refuses ambiguous-both-without-marker; explicitly out-of-scope-skips when the directory form `.github/PULL_REQUEST_TEMPLATE/` exists (status enum reserves `skipped_directory_template` so future support is non-breaking).

Test plan

- `python -m pytest`: 602 passed, 3 skipped (case-sensitivity-only PR-template tests skip on macOS APFS; Linux CI exercises them).
- `python -m ruff check .`: clean.
- `samples/simple_langchain_agent`: `init --write --agent-instructions=all --json` creates all 4 files; re-run is byte-equal (`git status --porcelain` empty).
- `init --write --ci --agent-instructions=all` succeeds; JSON has `manifest_status`, `workflow`, `agent_instructions` keys.
- Rerunning over an existing config under `AGENTS_SHIPGATE_AGENT_MODE=1` exits 2 with `{"error": "config_already_exists", "next_action": ...}` on stderr; on-disk file untouched.
- Reviewer check: the copy in `src/agents_shipgate/cli/discovery/agent_instructions/renderers/` matches your intent (especially the `## CI` mini-section that adds the advisory pointer to AGENTS.md / CLAUDE.md / cursor; the doc didn't carry that paragraph inline).

🤖 Generated with Claude Code
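As a rough illustration of the flag semantics in the summary, the sketch below expands an `--agent-instructions` value into a target list and combines exit codes with `max(...)`. The target names and the `max` rule come from the description; the function names, the error type, and the handling of unknown or empty values are assumptions rather than the CLI's actual behavior.

```python
KNOWN_TARGETS = ("agents-md", "claude-md", "cursor", "pr-template")


def parse_agent_instructions(value: str) -> list[str]:
    """Expand an --agent-instructions value into an ordered target list."""
    value = value.strip().lower()
    if value == "none":
        return []  # no instruction action runs
    if value == "all":
        return list(KNOWN_TARGETS)
    requested = [item.strip() for item in value.split(",") if item.strip()]
    if not requested:
        raise ValueError("expected all, none, or a comma-separated target list")
    unknown = [item for item in requested if item not in KNOWN_TARGETS]
    if unknown:
        # Assumed behavior: reject unknown names rather than ignore them.
        raise ValueError(f"unknown target(s): {', '.join(unknown)}")
    # Deduplicate while keeping the canonical order.
    return [t for t in KNOWN_TARGETS if t in requested]


def combined_exit_code(manifest_exit: int, agent_instructions_exit: int) -> int:
    """The process exit code is the worse of the two actions."""
    return max(manifest_exit, agent_instructions_exit)
```

For example, `parse_agent_instructions("cursor,agents-md")` yields the two targets in canonical order, and a clean manifest write combined with a failed instruction write still exits non-zero.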