…ng syntax
Capa was issuing direct HTTP GETs to raw GitHub/GitLab URLs when fetching
rule files and agent-instruction snippets. For private repos behind SSO
this silently returned a `text/html` login page (HTTP 200), and capa
wrote that HTML straight into rule and AGENTS.md files. The code path
that already handled `git clone` for skills was never reused for these
sibling features.
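A guard against this failure mode could look roughly like the sketch below. It is a minimal illustration, not the PR's code: the name `looksLikeHtmlPage(body, contentType)` appears in the PR description, but the detection heuristics, the `assertTextResponse` helper, and the error wording here are assumptions.

```typescript
// Hypothetical sketch: the real looksLikeHtmlPage in src/shared/repo-file.ts may differ.
function looksLikeHtmlPage(body: string, contentType: string | null): boolean {
  // A text/html Content-Type is the strongest signal of a login page.
  if (contentType !== null && contentType.toLowerCase().includes("text/html")) {
    return true;
  }
  // Otherwise sniff the start of the body for an HTML document prologue.
  const head = body.trimStart().slice(0, 256).toLowerCase();
  return head.startsWith("<!doctype html") || head.startsWith("<html");
}

// Assumed helper (not named in the PR): fail loudly instead of returning
// a login page as if it were file content.
function assertTextResponse(
  label: string,
  url: string,
  body: string,
  contentType: string | null,
): string {
  if (looksLikeHtmlPage(body, contentType)) {
    throw new Error(
      `${label}: expected text from ${url} but received an HTML page (possibly an SSO login page)`,
    );
  }
  return body;
}
```

Because an SSO redirect ends in HTTP 200, checking the status code alone would not catch it; the check has to look at the response itself.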
Two fixes, one PR:
1. Route every github/gitlab typed reference through the existing
snapshot resolver (git clone with stored OAuth tokens), so private
repos work the same way for skills, rules, and agent snippets.
- New `src/shared/repo-file.ts`: `fetchRepoFile` (clone-and-read)
and `fetchTextFile` (auth-aware raw URL fetch with HTML-response
detection so we fail loudly instead of silently embedding a
login page).
- `installCommand` now passes a shared auth + snapshot context to
`installAgentsFile` and uses `fetchRepoFile` / `fetchTextFile`
for the rules loop.
- `installAgentsFile` resolves base + additional snippets through
the same helpers; raw URLs that look like github/gitlab raw paths
are auto-detected and re-routed through the clone path so users
who paste a raw URL still get OAuth-authenticated fetches.
2. Make the repo-string grammar unambiguous. Previously `@` was
overloaded: skills used the right-hand side as a basename to search
for, while rules/snippets used it as an exact file path. Same
syntax, opposite resolution semantics, no error when users guessed
wrong.
Now there are two explicit forms:
- `owner/repo@<name>` recursive search by basename
- `owner/repo::<path>` exact path inside the repo
Both accept `:version` (tag/branch) and `#sha` (commit) suffixes.
`parseRepoString` rejects `@` references with slashes and points
the user at the `::` form in the error message.
The `@` form is now valid for rules / agent snippets too
(recursive file lookup with disambiguation errors), and the `::`
form is now valid for skills (exact directory lookup). The two
features are symmetric.
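The two-form grammar can be illustrated with a simplified parser. This is a sketch under stated assumptions, not the actual `parseRepoString`: the type and function names are hypothetical, platform prefixes are not handled specially, suffixes are assumed to appear in the order `:version` then `#sha`, and the error messages are invented.

```typescript
type RepoMode = "search" | "exact";

// Hypothetical result shape; the real ParsedRepo may carry more fields.
interface ParsedRepoSketch {
  repo: string; // "owner/repo"
  mode: RepoMode;
  target: string; // basename (search) or path (exact)
  version: string | undefined;
  sha: string | undefined;
}

function parseRepoStringSketch(input: string): ParsedRepoSketch {
  // Look for "::" first so it is never mistaken for a ":version" suffix.
  let mode: RepoMode = "exact";
  let sepIndex = input.indexOf("::");
  let sepLength = 2;
  if (sepIndex === -1) {
    mode = "search";
    sepIndex = input.indexOf("@");
    sepLength = 1;
  }
  if (sepIndex === -1) {
    throw new Error(`Expected "@<name>" or "::<path>" in "${input}"`);
  }

  const repo = input.slice(0, sepIndex);
  if (!/^[^/\s]+\/\S+$/.test(repo)) {
    throw new Error(`Missing owner/repo in "${input}"`);
  }

  // Peel the optional "#sha" suffix, then the optional ":version" suffix.
  let rest = input.slice(sepIndex + sepLength);
  let sha: string | undefined;
  const hashIndex = rest.indexOf("#");
  if (hashIndex !== -1) {
    sha = rest.slice(hashIndex + 1);
    rest = rest.slice(0, hashIndex);
  }
  let version: string | undefined;
  const colonIndex = rest.indexOf(":");
  if (colonIndex !== -1) {
    version = rest.slice(colonIndex + 1);
    rest = rest.slice(0, colonIndex);
  }

  if (rest === "") {
    throw new Error(`Empty target in "${input}"`);
  }
  if (mode === "search" && rest.includes("/")) {
    // "@" means basename search, so a path is rejected with a pointer to "::".
    throw new Error(`"@" takes a basename, not a path. Use "${repo}::${rest}" instead.`);
  }
  return { repo, mode, target: rest, version, sha };
}
```

For example, `parseRepoStringSketch("owner/repo::rules/typescript.md:v2.0.0")` yields mode `exact` with target `rules/typescript.md` and version `v2.0.0`, while `owner/repo@rules/typescript.md` throws and names the `::` form.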
Other surface changes:
- `capa add` accepts `owner/repo::path/to/skill` and the gitlab
variant, with optional `:version` / `#sha`.
- `detectRepoCoordsFromRawUrl` emits the `::` form (raw URLs always
know an exact path, so search would be the wrong default).
- Repo-string docs in `capabilities-schema.md` get a side-by-side
table of the two grammars and explicit guidance on when to use
each. Rules section gains worked examples for both.
- `buildRawUrl` only accepts `mode === 'exact'` and ships a JSDoc
pointing at `fetchRepoFile` for the private-repo case, since the
raw URL path is the original source of the bug.
Tests: 539 pass, 0 fail (29 new). New `repo-string.test.ts` covers
both grammars and every error path; `repo-file.test.ts` exercises
`@` search with unique-match, no-match (with candidate hints), and
ambiguous-match (with disambiguation hint) cases against a local git
fixture; `agents-file.test.ts` round-trip-tests that translated raw
URLs always parse back as exact paths.
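The three `@` search outcomes those tests cover (unique match, no match, ambiguous match) can be sketched as a pure function over a list of repo-relative paths. The function name, the five-candidate cap, and most of the error wording are assumptions; only the closing tip mirrors the message quoted in the PR description.

```typescript
// Hypothetical sketch of basename resolution over an already-listed snapshot.
// Paths are assumed to be repo-relative and to use forward slashes.
function resolveByBasenameSketch(repoFiles: string[], name: string, repo: string): string {
  const matches = repoFiles.filter((file) => file.split("/").pop() === name);
  const [only] = matches;
  if (matches.length === 1 && only !== undefined) {
    return only; // unique match
  }
  if (matches.length === 0) {
    // No match: list a few candidates so the user can spot a typo.
    const hints = repoFiles.slice(0, 5).join(", ");
    throw new Error(`No file named "${name}" found in ${repo}. Candidates: ${hints}`);
  }
  // Ambiguous: list every match and point at the exact-path form.
  throw new Error(
    `"${name}" matches multiple files in ${repo}: ${matches.join(", ")}. ` +
      `Tip: Use "${repo}::<exact-path>" to disambiguate.`,
  );
}
```

With a fixture containing `a/dup.md` and `b/dup.md`, searching for `dup.md` throws the ambiguity error, while a basename that occurs once returns its full path.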
GitHub's "Raw" button now generates URLs in the form
https://raw.githubusercontent.com/<owner>/<repo>/refs/heads/<branch>/<path>
(and the equivalent /refs/tags/<tag>/ form). Both forms work when
fetched directly, but `detectRepoCoordsFromRawUrl` greedily took the
first segment after the repo as the ref, so it produced bogus repo
strings like:
owner/repo::heads/main/examples/foo.md:refs
which then failed the downstream `git clone` step with "Repository not
accessible" because there's no such ref as `refs`.
Fix: extract a `splitGithubRefAndPath` helper that recognizes
`refs/heads/<branch>/...` and `refs/tags/<tag>/...` prefixes and pulls
the actual ref out of the third segment. Apply it to both the
`raw.githubusercontent.com` and `github.com /raw/` branches. GitLab raw
URLs are unaffected (they use `/-/raw/<ref>/<path>` directly).
Tests: added 5 new cases: bare branch (HEAD-equivalent), non-default
branch, tag, github.com /raw/refs/heads/ form, and a regression test
matching the exact URL shape from the bug report. Round-trip test was
extended to include all three new shapes.
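The ref-splitting fix can be sketched as follows. This is an illustration under assumptions rather than the actual helper: the function names carry a `Sketch` suffix to mark them as hypothetical, only the `raw.githubusercontent.com` host is handled, and a single-segment ref is assumed (branch names containing `/` are not disambiguated).

```typescript
interface RefAndPath {
  ref: string;
  path: string;
}

// Takes the URL segments that follow "<owner>/<repo>/".
function splitGithubRefAndPathSketch(segments: string[]): RefAndPath {
  const [first, second, third, ...rest] = segments;
  // "refs/heads/<branch>/..." and "refs/tags/<tag>/...": the ref is the third segment.
  if (first === "refs" && (second === "heads" || second === "tags") && third !== undefined) {
    return { ref: third, path: rest.join("/") };
  }
  if (first === undefined) {
    throw new Error("Raw URL has no ref segment");
  }
  // Legacy shape: "<ref>/<path>".
  return { ref: first, path: segments.slice(1).join("/") };
}

const RAW_PREFIX = "https://raw.githubusercontent.com/";

// Emits the "owner/repo::<path>:<ref>" form, or null when the URL does not match.
function rawUrlToRepoStringSketch(url: string): string | null {
  if (!url.startsWith(RAW_PREFIX)) return null;
  const segments = url
    .slice(RAW_PREFIX.length)
    .split("/")
    .filter((segment) => segment.length > 0);
  // Need owner, repo, ref, and at least one path segment.
  if (segments.length < 4) return null;
  const [owner, repo, ...afterRepo] = segments;
  const { ref, path } = splitGithubRefAndPathSketch(afterRepo);
  if (path === "") return null;
  return `${owner}/${repo}::${path}:${ref}`;
}
```

Under this sketch, the URL shape from the bug report (`.../owner/repo/refs/heads/main/examples/foo.md`) and the older `.../owner/repo/main/examples/foo.md` shape both map to `owner/repo::examples/foo.md:main`.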
…nfig
Two related rule-cleanup bugs:
1. `capa clean` did not remove installed rule files. The cleanup call
was gated on `capabilities.rules.length > 0`, so once the user
commented out (or removed) all their rules and ran clean, the
previously-installed `.cursor/rules/*.mdc` (or equivalent) files
were left orphaned.
2. `capa install` did not remove a rule that was commented out since
the previous install. The install loop only writes the *current*
rules; there was no diff-and-prune step against what existed
before, so removing one rule from the config left its file behind
on disk indefinitely.
Both share the same root cause: rule cleanup was driven by the *current*
state of `capabilities.rules` rather than by what capa had previously
installed. Worse, rule files for directory-based providers (Cursor,
Copilot, Windsurf) were written without being registered in the
managed-files DB at all, so even when cleanup logic wanted to find
them, there was no persistent record of which files belonged to capa.
Fix:
- `installRules` now accepts an `onFileWritten` callback that fires
once per rule file written by a directory-based provider. `install.ts`
uses it to register every rule file in the managed-files DB, mirroring
how skills are tracked.
- New `pruneRules(projectPath, providers, currentRules, previouslyManagedFiles)`
in `rules-installer.ts` brings on-disk rules state in sync with the
capabilities file:
* For directory-based providers, iterates the previously-managed file
list and deletes any file inside the provider's rules dir whose
basename doesn't correspond to a current rule for that provider.
User-authored files in the same dir are never touched because we
only consider files capa explicitly registered.
* For instruction-folded providers, scans the instruction file for
`<!-- capa:start:rule:<id> -->` blocks and removes ones not in the
current rules set. Marker blocks are self-tracking, so no DB lookup
is needed.
* Per-rule `providers:` filtering is honored: a rule restricted to
one provider counts as "absent" from the others' perspective.
* Returns the list of removed file paths so the caller can drop them
from the managed-files DB.
- `install.ts` always runs `pruneRules` at step 3.6, even when the
current rules array is empty. Commenting out a rule and re-running
install now removes its file/marker block.
- `clean.ts` no longer gates the `cleanRules` call on
`capabilities.rules.length > 0`. With rule files now registered in
the managed-files DB, the existing managed-files loop already takes
care of deleting orphan rule files; the unconditional `cleanRules`
call additionally strips `<!-- capa:start:rule:* -->` blocks from
instruction-folded providers' AGENTS.md / CLAUDE.md.
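For instruction-folded providers, the marker-block pruning could be sketched like this. The start marker format comes from the commit message; the matching `<!-- capa:end:rule:<id> -->` end marker, the function name, and the return shape are assumptions made for illustration.

```typescript
interface PruneResultSketch {
  content: string;
  removedIds: string[];
}

// Hypothetical sketch: strips marker blocks whose rule id is no longer current.
// ASSUMPTION: each block ends with "<!-- capa:end:rule:<id> -->".
function pruneRuleBlocksSketch(content: string, currentRuleIds: Set<string>): PruneResultSketch {
  const removedIds: string[] = [];
  // \1 ties the end marker to the id captured from the start marker.
  const blockPattern = /<!-- capa:start:rule:([^\s>]+) -->[\s\S]*?<!-- capa:end:rule:\1 -->\n?/g;
  const next = content.replace(blockPattern, (block: string, id: string) => {
    if (currentRuleIds.has(id)) {
      return block; // still configured: keep the block untouched
    }
    removedIds.push(id);
    return ""; // dropped from the config: remove the whole block
  });
  return { content: next, removedIds };
}
```

Passing an empty set removes every block, which corresponds to the case where the rules section has been emptied; text outside the markers is never modified.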
Tests: 553 pass (9 new):
- onFileWritten fires exactly once per directory-based rule, and
never for instruction-folded providers
- pruneRules removes a single dropped rule
- pruneRules removes ALL rule files when rules section is emptied
- pruneRules never deletes user-authored files in the rules dir
- pruneRules removes the marker block for a removed rule but keeps
current ones
- pruneRules removes ALL marker blocks when rules section is emptied
- pruneRules respects per-rule provider filtering
- pruneRules ignores managed paths outside the provider rules dir
(regression guard against false deletions of skill directories)
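The selection logic those directory-based test cases exercise can be sketched as a pure function. It is a simplified illustration: the function name is hypothetical, paths are assumed to be project-relative with forward slashes, and rule files are assumed to sit directly inside the provider's rules dir.

```typescript
// Hypothetical sketch: choose which previously managed files to delete.
function selectStaleRuleFilesSketch(
  previouslyManagedFiles: string[],
  rulesDir: string,
  currentRuleBasenames: Set<string>,
): string[] {
  const prefix = rulesDir.endsWith("/") ? rulesDir : `${rulesDir}/`;
  return previouslyManagedFiles.filter((file) => {
    // Managed paths outside the rules dir (e.g. skill directories) are never candidates.
    if (!file.startsWith(prefix)) return false;
    const basename = file.slice(prefix.length);
    // Only direct children of the rules dir are treated as rule files.
    if (basename.includes("/")) return false;
    return !currentRuleBasenames.has(basename);
  });
}
```

User-authored files stay safe by construction: they never appear in the managed list, so they can never be returned for deletion.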
…ules
`RulesList` already painted source-type badges (gitlab=orange,
github=green, etc.), but `SkillsList` rendered them with the neutral
`bg-bg-secondary` style, so the same `gitlab` source looked orange in
one section and gray in another. Confusing at a glance and inconsistent
with how the rest of the UI treats source provenance.
Extracted the color map into a shared `sourceTypeColors.ts` helper and
wired both lists through `sourceTypeBadgeClasses(type)`. Adding a new
component that surfaces a skill/rule/snippet source type from now on
means importing one helper instead of duplicating the literal classes.
Color map (kept the existing rule colors, added two for skill-only
types):
- inline -> blue
- remote -> purple
- github -> green
- gitlab -> orange
- local -> slate (skills only: content lives on disk)
- installed -> amber (skills only: user installed it elsewhere)
Unknown / future types fall back to the neutral classes so they still
render legibly. Verified the new slate / amber utilities make it
through Tailwind v4's source-scan into the bundled CSS.
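A shared helper of this kind might look like the sketch below. Only the color assignments and the neutral `bg-bg-secondary` class come from the commit message; the exact Tailwind class strings and the `Sketch`-suffixed names are assumptions.

```typescript
// ASSUMPTION: the real neutral style may include more classes than this one.
const NEUTRAL_BADGE_CLASSES = "bg-bg-secondary";

// ASSUMPTION: the literal class strings are illustrative; only the
// type-to-color pairing is taken from the commit message.
const SOURCE_TYPE_COLORS_SKETCH = new Map<string, string>([
  ["inline", "bg-blue-500/10 text-blue-500"],
  ["remote", "bg-purple-500/10 text-purple-500"],
  ["github", "bg-green-500/10 text-green-500"],
  ["gitlab", "bg-orange-500/10 text-orange-500"],
  ["local", "bg-slate-500/10 text-slate-500"], // skills only
  ["installed", "bg-amber-500/10 text-amber-500"], // skills only
]);

// Unknown or future types fall back to the neutral classes.
function sourceTypeBadgeClassesSketch(type: string): string {
  return SOURCE_TYPE_COLORS_SKETCH.get(type) ?? NEUTRAL_BADGE_CLASSES;
}
```

Keeping the class names as complete string literals, rather than building them from fragments, is what lets Tailwind's source scan find them.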
Pull request overview
This PR hardens and unifies how capa resolves GitHub/GitLab-backed rules and agent snippets by routing them through the existing clone-and-cache (OAuth-capable) path, and introduces an unambiguous repo-string grammar (@ basename search vs :: exact path) across skills/rules/snippets. It also adds rule-artifact pruning support and consolidates source-type badge styling in the web UI.
Changes:
- Add `fetchRepoFile()` (clone + snapshot read) and `fetchTextFile()` (auth-aware raw fetch that rejects HTML login pages) to prevent silent SSO/HTML corruption.
- Reshape repo-string parsing to support `owner/repo@<basename>` (search) and `owner/repo::<path>` (exact) consistently, updating CLI/docs/tests accordingly.
- Track and prune rule artifacts (managed rule files + instruction marker blocks) when rules are removed from capabilities.
Reviewed changes
Copilot reviewed 19 out of 19 changed files in this pull request and generated 7 comments.
| File | Description |
|---|---|
| web-ui/src/features/projects/components/sourceTypeColors.ts | Centralizes badge color mapping for skill/rule source types. |
| web-ui/src/features/projects/components/SkillsList.tsx | Uses shared badge color helper for skill type labels. |
| web-ui/src/features/projects/components/RulesList.tsx | Uses shared badge color helper; removes inline color map. |
| src/types/capabilities.ts | Updates JSDoc to document the two repo-string grammars. |
| src/shared/repo-string.ts | Adds mode + target, supports :: exact paths and validates @ slashes. |
| src/shared/repo-file.ts | New shared helpers for snapshot-based file reads + safe raw fetch behavior. |
| src/shared/providers/__tests__/registry.test.ts | Adds coverage for rule managed-file registration and pruning behavior. |
| src/shared/__tests__/repo-string.test.ts | Adds tests for parsing both grammars and buildRawUrl constraints. |
| src/shared/__tests__/repo-file.test.ts | Adds tests for repo snapshot file lookup (exact/search) + HTML rejection. |
| src/cli/utils/rules-installer.ts | Adds onFileWritten hook and new pruneRules() implementation. |
| src/cli/utils/agents-file.ts | Routes GitHub/GitLab snippet/base fetching through snapshots; adds raw-URL detector. |
| src/cli/utils/__tests__/agents-file.test.ts | Tests raw-URL detection and :: emission/round-trip parsing. |
| src/cli/commands/install.ts | Uses new fetch helpers for rules/snippets; installs + prunes rule artifacts; updates skill repo parsing. |
| src/cli/commands/clean.ts | Always cleans rule markers (even when rules list is empty). |
| src/cli/commands/add.ts | Extends capa add parsing + help text to support :: exact-path grammar. |
| src/cli/commands/__tests__/add.test.ts | Adds tests for :: parsing and updated help output expectations. |
| skills/capabilities-manager/references/workflows-and-examples.md | Migrates a rule example to :: form and explains separator semantics. |
| skills/capabilities-manager/references/commands.md | Updates capa add docs to show both grammars and pinning. |
| skills/capabilities-manager/references/capabilities-schema.md | Adds repo-string format section and updates rules/snippets guidance. |
- Reject path-traversal in :: exact-path resolution
Both fetchRepoFile and the skill installer now route the user-supplied
target through a shared assertSafeRepoPath() guard that rejects
../ segments, leading slashes/backslashes, and drive-letter prefixes
before joining with snapshotDir, closing the read-arbitrary-file
vector Copilot flagged on `owner/repo::../../etc/passwd`-style
strings.
- Reject empty :version / #sha suffixes in parseRepoString
Inputs like `owner/repo::path.md:` or `...path.md#` now throw with a
targeted message instead of silently producing version=''/sha='' which
later corrupted snapshot resolution and raw URLs (`//path`).
- Document and percent-decode multi-segment refs in splitGithubRefAndPath
GitHub raw URLs are genuinely ambiguous when a branch contains `/`
(the ref-vs-path boundary is unknowable without an API call). Added a
thorough JSDoc note on the limitation, percent-decode the ref segment
so `feature%2Ffoo` round-trips correctly, and pinned the
literal-multi-segment behavior with a regression test so any future
change is intentional.
- Cosmetic fixes from the same review:
* HTML-rejection error: "agents.basefrom" -> "agents.base from"
(missing space between label and "from")
* findFilesByBasename docstring no longer claims to skip dotfiles
(it intentionally traverses .cursor / .github / .agents)
* resolveRepoSnippet log now prints the original repo string so ::
form references aren't rendered as @ form during install
Tests: +8 cases covering the path-traversal guard (POSIX absolute,
parent segments, backslash-prefix, drive-letter via the helper),
empty version/sha rejection, and the documented multi-segment-ref
behavior.
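The path-traversal guard can be sketched as below. The name `assertSafeRepoPath` and the rejected shapes come from the commit message; the implementation details and error wording are assumptions, so the function carries a `Sketch` suffix.

```typescript
// Hypothetical sketch of a guard applied before joining target with snapshotDir.
function assertSafeRepoPathSketch(target: string): void {
  // Leading slash or backslash: absolute on POSIX, or root-relative on Windows.
  if (target.startsWith("/") || target.startsWith("\\")) {
    throw new Error(`Unsafe repo path "${target}": must be relative to the repo root`);
  }
  // Drive-letter prefix such as "C:" or "c:\".
  if (/^[a-zA-Z]:/.test(target)) {
    throw new Error(`Unsafe repo path "${target}": drive-letter prefixes are not allowed`);
  }
  // Any ".." segment, with either separator, could escape the snapshot dir.
  if (target.split(/[\\/]/).includes("..")) {
    throw new Error(`Unsafe repo path "${target}": ".." segments are not allowed`);
  }
}
```

Rejecting `..` segments outright, instead of normalizing the path and checking the result, is the stricter choice: `a/../b` is refused even though it would resolve inside the repo.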
Summary
Two intertwined fixes:
1. Private GitHub/GitLab repos now work for rules and agent snippets.
Capa was issuing direct HTTP GETs to raw GitHub/GitLab URLs when fetching rule files and agent-instruction snippets. For private repos behind SSO this silently returned a `text/html` login page (HTTP 200), and capa then wrote that HTML straight into the user's rule files and `AGENTS.md`, corrupting them with what looked like real markdown to the file watcher but turned out to be a login page on inspection. The clone-and-cache code path that already worked correctly for skills was simply never reused for these sibling features. Now it is.
2. Repo-string grammar is no longer ambiguous.
Previously `@` had two opposite meanings depending on context: skills used the right-hand side as a basename to search for, while rules and snippets used it as an exact file path. Same syntax, opposite resolution semantics, no error when users guessed wrong. Going forward there are two explicit forms, both valid for skills, rules, and snippets:
- `owner/repo@<name>`: recursive search by basename
- `owner/repo::<path>`: exact path inside the repo
Both still accept `:version` (tag/branch) and `#sha` (commit SHA) suffixes for pinning.
What changed
New module: `src/shared/repo-file.ts`
- `fetchRepoFile(platform, repoString, getRepoSnapshot, authFetch, opts)`: clone the containing repo (with OAuth) and read the file off the snapshot. Branches on `mode`:
  - `exact` → direct path lookup, friendly error if missing
  - `search` → recursive walk for files whose basename matches; helpful errors with candidate filenames on miss and full match list on ambiguity (`Tip: Use "owner/repo::<exact-path>" to disambiguate.`)
- `fetchTextFile(url, opts)`: `fetch` wrapper that adds OAuth headers when an `AuthenticatedFetch` is provided, and rejects HTML responses (`looksLikeHtmlPage(body, contentType)`) so the SSO-login-page bug can never happen silently again.
Reshaped: `src/shared/repo-string.ts`
- `ParsedRepo` gained a `mode: 'search' | 'exact'` discriminator and a `target` field (with a non-enumerable `filepath` accessor for back-compat).
- `parseRepoString` recognizes `::` first, then `@`. Rejects `@` references whose target contains `/`, suggesting the `::` form in the error message. Empty targets and missing `owner/repo` produce specific errors.
- `buildRawUrl` requires `mode === 'exact'` (raw URLs need a known path) and ships a JSDoc explicitly steering callers to `fetchRepoFile` for the private-repo case.
Updated callers
- `installCommand`: wires `repoFetchAuth` (`AuthenticatedFetch`) and `repoFetchCtx` through to `installAgentsFile` and uses `fetchRepoFile` / `fetchTextFile` for the rules loop.
- `installAgentsFile`: base + additional snippets resolved via `fetchRepoFile`. A new `detectRepoCoordsFromRawUrl` helper auto-detects `https://raw.githubusercontent.com/...` and `https://gitlab.com/.../-/raw/...` URLs and re-routes them through the clone path so users who paste a raw URL still get OAuth-authenticated fetches. Detected URLs always emit the `::` form.
- `install.ts`: replaced the ad-hoc `split(/[:#]/)` + `split('@')` parser (which silently broke on `::` because of the double colon) with `parseRepoString`. Added an exact-path branch that looks up `<snapshotDir>/<target>/SKILL.md` directly. The recursive-search branch now produces an error message that points users at the `::` form when the basename collides or is ambiguous.
- `capa add`: accepts both `owner/repo::path/to/skill` and `gitlab:group/.../repo::path/to/skill`, with optional `:version` / `#sha`. Help text rewritten with a two-grammar table and a "when to use which" section.
Docs
- `capabilities-schema.md`: new "Repo string format (`@` vs `::`)" section with a side-by-side table and explicit usage guidance. Rules section gains worked examples for both grammars.
- `commands.md`: `capa add` doc lists both grammars with examples.
- `workflows-and-examples.md`: fixed an existing rule example that used `owner/repo@rules/typescript.md:v2.0.0` (now invalid because `@` rejects slashes) and converted it to the `::` form.
- `src/types/capabilities.ts`: JSDoc updated to describe both grammars.
Migration
Existing capabilities files that referenced rules / agent snippets via `owner/repo@some/path/file.md` will now fail loudly at install time with an error message that points the user at the equivalent `owner/repo::some/path/file.md` form. No silent behavior change: the error is the migration prompt.
Test plan
- `bun test`: 539 pass / 0 fail (29 new tests added)
- `bunx tsc --noEmit`: clean
- `src/shared/__tests__/repo-string.test.ts` (18 tests): both grammars, all pinning suffixes, subgroup repos, slash-rejection, missing-target / missing-owner errors, `buildRawUrl` integration, back-compat `filepath` accessor
- `src/shared/__tests__/repo-file.test.ts`: fixture grew an `a/dup.md` + `b/dup.md` collision pair; new tests cover unique-match search, no-match (with candidate hints), ambiguous-match (with both paths listed), and exact-path missing-file errors
- `src/cli/utils/__tests__/agents-file.test.ts`: every detector test expects `::`, plus a round-trip sanity test that confirms the emitted strings parse back as `mode === 'exact'`
- `src/cli/commands/__tests__/add.test.ts`: new GitHub `::` and GitLab `::` describe blocks (5 tests) including pinning; updated help-text assertion