27 changes: 27 additions & 0 deletions .devdev/pr-review.md
@@ -0,0 +1,27 @@
# PR review

What to flag:

- **Correctness gaps.** The change does the wrong thing, or the
right thing only for the happy path. Concrete repro > vibes.
- **Public API churn.** New pub items, breaking signatures, new
required deps. Worth a sentence even when the change is fine.
- **Unjustified scope creep.** "Drive-by refactors" inside an
otherwise tight PR. Ask whether they belong in their own commit.
- **Test debt.** New behaviour without coverage, or a fix without
a regression test.

What to skip:

- Style nits the formatter would catch. `cargo fmt` exists.
- Renaming preferences. The name in the diff is fine.
- Speculative "what if someday" objections. Cross that bridge later.
- Restating what the diff already says.

Tone:

- One thread per concern. Don't pile observations into a single
comment.
- Quote the line you mean. Anchor the comment to the code.
- If approving, say so plainly. No ceremony.
- Sign off with the takeaway, not with apologies.
14 changes: 14 additions & 0 deletions .devdev/rust-style.md
@@ -0,0 +1,14 @@
# Rust style

- Edition 2024. Use `let ... else`, `if let && ...`, `?` over
`match err`.
- Errors: `thiserror` for libraries, `anyhow` for binaries. No
ad-hoc `String` errors at module boundaries.
- Tests live next to the code they cover — `#[cfg(test)] mod tests`
for unit tests, `tests/` dir for integration tests.
- No `unwrap()` in non-test code unless the invariant is one line
away.
- Small modules over giant ones. If `mod.rs` exceeds ~400 lines,
it wants splitting.
- Public items get a one-line doc comment. Internal items only get
comments when the *why* isn't obvious from the *what*.
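
A minimal sketch of what the first two style rules could look like in practice, using only the standard library. The `SlugError` type and the `parse_slug` / `repo_name` functions are hypothetical illustrations, not code from this PR; a library crate following the rule above would likely derive the error with `thiserror` rather than implement `Display` by hand.

```rust
use std::fmt;

/// Hypothetical typed error for an `owner/repo` parser (illustration only).
#[derive(Debug, PartialEq, Eq)]
pub enum SlugError {
    MissingSlash,
    EmptyPart,
}

impl fmt::Display for SlugError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            SlugError::MissingSlash => write!(f, "expected `owner/repo`"),
            SlugError::EmptyPart => write!(f, "owner and repo must be non-empty"),
        }
    }
}

impl std::error::Error for SlugError {}

/// Splits `owner/repo` at the first slash.
pub fn parse_slug(input: &str) -> Result<(&str, &str), SlugError> {
    // `let ... else` keeps the happy path unindented.
    let Some((owner, repo)) = input.split_once('/') else {
        return Err(SlugError::MissingSlash);
    };
    if owner.is_empty() || repo.is_empty() {
        return Err(SlugError::EmptyPart);
    }
    Ok((owner, repo))
}

/// Propagates the typed error with `?` instead of matching on it.
pub fn repo_name(input: &str) -> Result<String, SlugError> {
    let (_, repo) = parse_slug(input)?;
    Ok(repo.to_owned())
}
```

The error crosses the function boundary as a typed enum rather than an ad-hoc `String`, so callers can match on the variant.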
13 changes: 13 additions & 0 deletions .devdev/vibe.md
@@ -0,0 +1,13 @@
# Vibe

Empiricism, brevity, wit. In that order.

- **Empirical first.** Don't claim something works until tests pass.
Don't claim a fix without reproducing the bug. "It compiles" is
not "it works."
- **Short over clever.** A two-line fix beats a refactor. If a
three-paragraph comment is necessary, the code probably isn't.
- **Wit, not snark.** A good review sounds like a colleague at a
whiteboard, not a linter. Find the real point. Make it once.

If you can't decide between "say it" and "shut up," shut up.
2 changes: 2 additions & 0 deletions .gitignore
@@ -9,6 +9,8 @@
# Scratch / tempfile output
/tmp/
/target/tmp/
run_id.txt
/.devdev-runtime/

# IDE
.idea/
22 changes: 22 additions & 0 deletions CHANGELOG.md
@@ -8,13 +8,35 @@ release is cut.
## [Unreleased]

### Added
- **PR shepherding pipeline**: `devdev repo watch <owner>/<repo>` polls
GitHub, hashes PR state, consults an append-only NDJSON idempotency
ledger, and emits `PrOpened`/`PrUpdated`/`PrClosed` events on an
internal `EventBus`. Per-PR `MonitorPrTask`s subscribe and re-prompt
the agent on each update. Idempotent watch / unwatch via
`repo/watch` + `repo/unwatch` IPC. New scenario S07 covers the
user-surface plumbing.
- **`devdev_ask` MCP tool**: universal approval seam exposed to ACP
agents. Takes `kind={post_review,post_comment,request_token,question}`
and routes through `ApprovalGate`. On approval for the
external-action kinds, the response includes a host-derived
short-lived `GH_TOKEN` so the agent can run `gh` itself — no typed
`post_review` adapter path.
- **Vibe Check**: `devdev init` runs a scribe session that writes
`.devdev/*.md` preference files in the user's voice. `devdev
preferences list` discovers preferences across repo, parents, and
`~/.devdev/` with repo-wins precedence; `devdev preferences edit
<name>` opens `$EDITOR`.
- `devdev-workspace`: standalone crate README covering the library
entry points, minimal example, and platform matrix.
- `ROADMAP.md`: Today / Next / Aspirational breakdown.
- `SECURITY.md`, `CONTRIBUTING.md`: policy + contributor workflow.
- `rust-toolchain.toml`: pinned compiler.
- MIT `LICENSE`.
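
The `devdev repo watch` entry above describes a poll, hash, check-ledger, emit cycle. A rough sketch of how one cycle might fit together follows, assuming a hypothetical `PrSnapshot` shape and an in-memory `Ledger` standing in for the NDJSON file; only the event names `PrOpened` / `PrUpdated` / `PrClosed` come from the changelog. The real crate's types and hashing scheme are not shown in this diff.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical snapshot of the PR fields that matter for change detection.
#[derive(Debug, Clone, Hash, PartialEq, Eq)]
pub struct PrSnapshot {
    pub number: u64,
    pub head_sha: String,
    pub open: bool,
}

/// Event names taken from the changelog entry; payloads are assumptions.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum PrEvent {
    PrOpened(u64),
    PrUpdated(u64),
    PrClosed(u64),
}

/// In-memory stand-in for the append-only ledger: entries are only pushed.
#[derive(Default)]
pub struct Ledger {
    entries: Vec<(u64, u64)>, // (PR number, state hash)
}

impl Ledger {
    /// Most recently recorded hash for a PR, if it has been seen before.
    fn last_hash(&self, number: u64) -> Option<u64> {
        self.entries
            .iter()
            .rev()
            .find(|(n, _)| *n == number)
            .map(|(_, h)| *h)
    }

    fn append(&mut self, number: u64, hash: u64) {
        self.entries.push((number, hash));
    }

    pub fn len(&self) -> usize {
        self.entries.len()
    }
}

/// Hashes a snapshot. `DefaultHasher` output is not guaranteed stable across
/// Rust releases, so a durable on-disk ledger would want a fixed algorithm.
fn state_hash(pr: &PrSnapshot) -> u64 {
    let mut hasher = DefaultHasher::new();
    pr.hash(&mut hasher);
    hasher.finish()
}

/// One poll cycle: emit an event only for states the ledger has not recorded.
pub fn poll_once(ledger: &mut Ledger, prs: &[PrSnapshot]) -> Vec<PrEvent> {
    let mut events = Vec::new();
    for pr in prs {
        let hash = state_hash(pr);
        let previous = ledger.last_hash(pr.number);
        if previous == Some(hash) {
            continue; // this exact state was already handled
        }
        let event = match (previous, pr.open) {
            (None, true) => Some(PrEvent::PrOpened(pr.number)),
            (Some(_), true) => Some(PrEvent::PrUpdated(pr.number)),
            (Some(_), false) => Some(PrEvent::PrClosed(pr.number)),
            (None, false) => None, // never seen while open: record it quietly
        };
        ledger.append(pr.number, hash);
        events.extend(event);
    }
    events
}
```

Because the ledger is consulted before any event is emitted, polling the same state twice produces nothing the second time, which is the idempotency property the entry describes.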

### Removed
- `placeholder_review_fn` agent-callback seam — superseded by the
event-driven `MonitorPrTask` + `devdev_ask` flow described above.

### Changed
- Root `README.md` rewritten for the two-audience split
(workspace-curious vs DevDev-hosting). Explicit non-claim on
19 changes: 14 additions & 5 deletions README.md
@@ -30,9 +30,12 @@ in [ROADMAP.md](ROADMAP.md).
(`cargo`, `git`, `rg`, language servers) can operate in. Start at
[`crates/devdev-workspace/README.md`](crates/devdev-workspace/README.md).
- **DevDev-hosting.** You want to run the full agent product locally
(PR monitoring, preferences-as-Markdown, approval gates). Today
you're early — several loops are still behind placeholders, tracked
in [ROADMAP.md](ROADMAP.md).
(PR shepherding, preferences-as-Markdown, approval gates). The
end-to-end loop — `devdev init` → `devdev repo watch` → agent
reviews PRs as they appear — works against the mock GitHub
adapter today; live `gh` posting is gated behind `devdev_ask`
approvals. See [ROADMAP.md](ROADMAP.md) for what's shipped vs.
in flight.

## Quickstart: the workspace library

@@ -71,13 +74,19 @@ From source:

```
cargo install --git https://github.com/goldenwitch/devdev devdev-cli
- devdev up # starts the daemon
- devdev down # stops it
+ devdev up # starts the daemon
+ devdev init # interview yourself; writes .devdev/*.md
+ devdev repo watch owner/name # poll GitHub for PR events
+ devdev preferences list # show discovered .devdev/*.md
+ devdev down # stops the daemon
```

DevDev expects a logged-in [GitHub Copilot CLI](https://github.com/github/copilot-cli)
(`copilot --acp` must work) and, for GitHub adapters, either a
`gh auth login` session or a `GH_TOKEN` / `GITHUB_TOKEN` env var.
When the agent wants to post a review or comment it calls the
`devdev_ask` MCP tool; the daemon prompts you for approval and, on
“yes”, hands the agent a short-lived `gh` token to act with.

## Platform matrix

30 changes: 25 additions & 5 deletions ROADMAP.md
@@ -37,6 +37,28 @@ Works end-to-end and is exercised by tests on every push.
- Scenario harness: user-surface scenarios drive only the `devdev`
binary + IPC + checkpoints + documented env vars.

**Repo watch → event pipeline (cap 26 / cap 27)**

- `devdev repo watch <owner>/<repo>` polls GitHub, hashes PR state,
consults an append-only NDJSON idempotency ledger, and emits
`PrOpened` / `PrUpdated` / `PrClosed` events on an internal
`EventBus`. Per-PR `MonitorPrTask`s subscribe and re-prompt the
agent on update.
- `devdev_ask` MCP tool: the universal approval seam. Agent calls it
with `kind={post_review,post_comment,request_token,question}`;
daemon routes through `ApprovalGate` and — on approval, for the
external-action kinds — surfaces a host-derived short-lived
`GH_TOKEN` so the agent can run `gh` itself. No typed adapter path.
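
The `devdev_ask` bullet above can be read as a small decision table: parse the `kind`, ask the gate, and attach a token only for external actions. The sketch below is one possible shape, not the daemon's code. `AskKind`, `AskResponse`, and `handle_ask` are hypothetical names, and treating every kind except `question` as an "external-action kind" is an assumption, since the diff does not list them.

```rust
/// The four `kind` values named in the roadmap entry.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum AskKind {
    PostReview,
    PostComment,
    RequestToken,
    Question,
}

impl AskKind {
    /// Parses the wire value; unknown kinds are rejected rather than guessed.
    pub fn parse(raw: &str) -> Option<Self> {
        match raw {
            "post_review" => Some(AskKind::PostReview),
            "post_comment" => Some(AskKind::PostComment),
            "request_token" => Some(AskKind::RequestToken),
            "question" => Some(AskKind::Question),
            _ => None,
        }
    }

    /// ASSUMPTION: everything except `question` counts as an external action.
    pub fn is_external_action(self) -> bool {
        !matches!(self, AskKind::Question)
    }
}

/// Hypothetical response shape returned to the agent.
#[derive(Debug, PartialEq, Eq)]
pub enum AskResponse {
    Denied,
    Approved { gh_token: Option<String> },
}

/// Routes one request. `approved` stands in for the `ApprovalGate` verdict and
/// `mint_token` for however the host derives its short-lived `GH_TOKEN`.
pub fn handle_ask(
    kind: AskKind,
    approved: bool,
    mint_token: impl FnOnce() -> String,
) -> AskResponse {
    if !approved {
        return AskResponse::Denied;
    }
    // The token is minted lazily, and only when the kind needs one.
    let gh_token = kind.is_external_action().then(mint_token);
    AskResponse::Approved { gh_token }
}
```

Taking `mint_token` as a closure means a denied request or a plain question never triggers token creation at all.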

**Vibe Check (cap 25)**

- `devdev init` runs a scribe session that writes `.devdev/*.md`
preference files in the user's voice (one topic per file, append
`## Revision <date>` on revisits).
- `devdev preferences list` discovers preferences across repo /
parents / `~/.devdev/` with repo-wins precedence; `devdev
preferences edit` opens `$EDITOR`.
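
One way to picture "repo-wins precedence" is a nearest-first directory walk where the first file found for a given name shadows the rest. The sketch below assumes that reading and Unix-style paths. `candidate_dirs` and `resolve` are hypothetical helpers, and directory listings are passed in as data so the logic can be followed without touching the filesystem.

```rust
use std::collections::BTreeMap;
use std::path::{Path, PathBuf};

/// Candidate `.devdev` directories, nearest first: the repo root, each of its
/// parents, then the home directory (skipped if already on the parent chain).
pub fn candidate_dirs(repo_root: &Path, home: &Path) -> Vec<PathBuf> {
    let mut dirs: Vec<PathBuf> = repo_root
        .ancestors()
        .map(|dir| dir.join(".devdev"))
        .collect();
    let home_dir = home.join(".devdev");
    if !dirs.contains(&home_dir) {
        dirs.push(home_dir);
    }
    dirs
}

/// Resolves preference names to files. `found` lists, per directory and in
/// precedence order, the preference names present there (without `.md`).
/// The first directory that provides a name wins.
pub fn resolve(found: &[(PathBuf, Vec<String>)]) -> BTreeMap<String, PathBuf> {
    let mut resolved = BTreeMap::new();
    for (dir, names) in found {
        for name in names {
            resolved
                .entry(name.clone())
                .or_insert_with(|| dir.join(format!("{name}.md")));
        }
    }
    resolved
}
```

With this ordering a repo-level `vibe.md` shadows one in `~/.devdev/`, while a preference that exists only in the home directory still shows up.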

**Scenario catalog status**

| ID | Status |
@@ -51,16 +73,14 @@ Works end-to-end and is exercised by tests on every push.

What we're actively working on to close the DevDev-hosting loop.

- **Wire `placeholder_review_fn`.** The agent-callback seam in
`crates/devdev-cli/src/daemon_cli.rs` is still a placeholder. Real
target: `MonitorPrTask` driving the same seam with real PR state.
- **Scout routing.** Pick the right model/agent per task class instead
of one-size-fits-all.
- **Idempotency ledger.** Durable record of work already done so an
agent restart doesn't re-do the same thing.
- **Full ACP session backend (S03/S04).** Enough plumbing that the
agent's tool calls and mid-session events are observable from the
scenario surface.
- **End-to-end PR shepherding scenario (S07).** Drives `devdev init`
→ `devdev repo watch` → mock GH adapter → asserts the agent gets
re-prompted with preference context on each PR update.

### Explicitly not on this list

11 changes: 9 additions & 2 deletions crates/devdev-acp/src/types.rs
@@ -444,7 +444,11 @@ pub struct ToolCall {
pub tool_call_id: String,
pub title: String,
pub kind: ToolCallKind,
- pub status: ToolCallStatus,
+ /// Optional: Copilot CLI omits this on the initial `tool_call`
+ /// notification (status is implicit `pending`). Subsequent
+ /// `tool_call_update` notifications carry it explicitly.
+ #[serde(default, skip_serializing_if = "Option::is_none")]
+ pub status: Option<ToolCallStatus>,
#[serde(default, skip_serializing_if = "Option::is_none")]
pub raw_input: Option<serde_json::Value>,
}
@@ -453,7 +457,10 @@ pub struct ToolCall {
#[serde(rename_all = "camelCase")]
pub struct ToolCallUpdate {
pub tool_call_id: String,
- pub status: ToolCallStatus,
+ /// Optional for symmetry with `ToolCall`. A status-less update
+ /// is a metadata-only change (e.g. output appended).
+ #[serde(default, skip_serializing_if = "Option::is_none")]
+ pub status: Option<ToolCallStatus>,
#[serde(default, skip_serializing_if = "Option::is_none")]
pub output: Option<String>,
}
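
With `status` now optional on both types, a consumer has to decide what a missing value means. The doc comments above suggest two rules: a missing status on the initial `tool_call` is an implicit `pending`, and a missing status on an update leaves the previous status alone. A minimal sketch of that bookkeeping follows, using a local stand-in enum and a hypothetical `Tracker`; neither is the crate's own type, and the real `ToolCallStatus` may have more variants.

```rust
use std::collections::HashMap;

/// Local stand-in for the crate's `ToolCallStatus` (variants assumed).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ToolCallStatus {
    Pending,
    Completed,
    Failed,
}

/// Hypothetical consumer that tracks the last known status per tool call.
#[derive(Default)]
pub struct Tracker {
    statuses: HashMap<String, ToolCallStatus>,
}

impl Tracker {
    /// Initial `tool_call`: a missing status is treated as `Pending`.
    pub fn on_tool_call(&mut self, id: &str, status: Option<ToolCallStatus>) {
        self.statuses
            .insert(id.to_owned(), status.unwrap_or(ToolCallStatus::Pending));
    }

    /// `tool_call_update`: a missing status is a metadata-only change, so the
    /// previously recorded status is kept as-is.
    pub fn on_tool_call_update(&mut self, id: &str, status: Option<ToolCallStatus>) {
        if let Some(new_status) = status {
            self.statuses.insert(id.to_owned(), new_status);
        }
    }

    pub fn status(&self, id: &str) -> Option<ToolCallStatus> {
        self.statuses.get(id).copied()
    }
}
```

Keeping the default at the consumer rather than in the type preserves the distinction between "agent said pending" and "agent said nothing" on the wire.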
4 changes: 2 additions & 2 deletions crates/devdev-acp/tests/acceptance.rs
@@ -72,7 +72,7 @@ fn session_update_tool_call_roundtrip() {
tool_call_id: "tc-1".into(),
title: "Read file".into(),
kind: ToolCallKind::Read,
- status: ToolCallStatus::Completed,
+ status: Some(ToolCallStatus::Completed),
raw_input: Some(serde_json::json!({"path": "/foo.rs"})),
});
let json = serde_json::to_string(&variant).unwrap();
@@ -85,7 +85,7 @@ fn session_update_tool_call_roundtrip() {
fn session_update_tool_call_update_roundtrip() {
let variant = SessionUpdate::ToolCallUpdate(ToolCallUpdate {
tool_call_id: "tc-1".into(),
- status: ToolCallStatus::Failed,
+ status: Some(ToolCallStatus::Failed),
output: Some("error: not found".into()),
});
let json = serde_json::to_string(&variant).unwrap();
6 changes: 6 additions & 0 deletions crates/devdev-cli/src/acp_backend.rs
@@ -76,6 +76,12 @@ impl AcpSessionBackend {
let argv: Vec<&str> = args.iter().map(String::as_str).collect();
let client_config = AcpClientConfig {
env_overrides: crate::realpath_shim::prepare_nodejs_options(),
// Real agents (Copilot CLI included) can think for
// a long time between session/update notifications,
// especially while running multi-step gh/git plans.
// Keep the idle window generous to avoid killing a
// working agent mid-turn.
idle_timeout: std::time::Duration::from_secs(300),
..AcpClientConfig::default()
};
let client = AcpClient::connect_process(
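
The comment in this hunk describes an idle window that is measured between `session/update` notifications. How `AcpClient` enforces `idle_timeout` is not visible in this diff; the sketch below only illustrates the general idea under that assumption, with a hypothetical `IdleWatch` that is reset on every notification and takes the clock as a parameter so the behaviour is easy to check.

```rust
use std::time::{Duration, Instant};

/// Hypothetical idle watchdog: the agent counts as idle once `timeout` has
/// passed with no activity recorded.
pub struct IdleWatch {
    timeout: Duration,
    last_activity: Instant,
}

impl IdleWatch {
    pub fn new(timeout: Duration, now: Instant) -> Self {
        Self { timeout, last_activity: now }
    }

    /// Call on every notification from the agent to restart the window.
    pub fn touch(&mut self, now: Instant) {
        self.last_activity = now;
    }

    /// True once the quiet period has reached the configured timeout.
    pub fn is_idle(&self, now: Instant) -> bool {
        now.saturating_duration_since(self.last_activity) >= self.timeout
    }
}
```

With a 300-second window, an agent that emits an update at 250 seconds is not considered idle again until 550 seconds, so long multi-step plans survive as long as they keep reporting progress.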