Streamline the qemu product flow and export sparse review bundles by OneNoted · Pull Request #3 · OneNoted/inspectors

OneNoted · 2026-04-15T14:22:41Z

Summary

stop leaking qemu runtime and prep-image storage while making the default qemu product flow easier to drive
add storage-efficient qemu review recording with review_recording metadata and POST /api/sessions/:id/review/export
update the control plane, web and desktop operator surfaces, SDKs, docs, and proof scripts for the sparse review-bundle workflow

Testing

bun run build
bun run lint
bun run test
python -m py_compile python/sdk/client.py scripts/run-qemu-live-view-demo.py scripts/run-taskers-qemu-dogfood.py
cargo test -p guest-runtime --lib --test qemu_bridge

Session deletion kept qemu artifact directories on disk, qemu image prep could leak docker-backed storage, and guest-runtime lifecycle logic had drifted between the library and binary entrypoint. This change centralizes the runtime implementation in the library, reclaims session-owned runtime directories on teardown, adds safe marker-based startup janitoring plus an explicit reclaim endpoint, and makes qemu prep keep its temporary storage inside the tracked workdir so it can be cleaned deterministically. Constraint: Preserve qemu as the product path while keeping xvfb and display sessions honest Constraint: Avoid new dependencies while making cleanup safe enough for app-local storage Rejected: Docker volume prune as the primary fix | too broad and unsafe for unrelated local projects Rejected: Storage hotfix without lifecycle consolidation | likely to regress because lib.rs and main.rs had drifted Confidence: high Scope-risk: moderate Reversibility: clean Directive: Keep runtime/cache/exports ownership explicit and do not reintroduce session-owned files outside runtime storage Tested: cargo test -p guest-runtime --quiet; bun test apps/control-plane/src/server.desktop-app.test.ts Not-tested: Full end-to-end qemu product boot against a real docker/qemu environment

The happy path still existed in the backend, but the default surface started from provider/debug knobs instead of the session -> task -> live view -> cleanup loop an agent should infer first. This change promotes the qemu-first start-and-watch workflow, adds explicit delete and stale-storage reclaim actions, wires optional desktop activation for installed-app oversight, updates SDK/docs around the canonical contract, and keeps advanced/debug controls available without making them primary. Constraint: Preserve honest xvfb fallback semantics while keeping qemu product as the canonical live path Constraint: Keep the packaged desktop resources in sync with the source web UI and control-plane server Rejected: Full UI redesign before storage and contract ergonomics were in place | too broad for the approved plan Rejected: Desktop activation as a required step in the agent contract | should remain optional additive ergonomics Confidence: high Scope-risk: moderate Reversibility: clean Directive: Keep the default path start-and-watch first; new debug controls belong under advanced surfaces, not the main flow Tested: bun run typecheck; bun test apps/control-plane/src/server.desktop-app.test.ts apps/web-ui/public/provider-default.test.js crates/desktop-app/resources/control-plane/ui/provider-default.test.js packages/ts-sdk/src/index.test.ts; bun run --filter @acu/control-plane build; bun run sync:desktop-resources; cargo test -p guest-runtime --quiet; bun run test; bun run lint Not-tested: Manual desktop activation against a real installed desktop bundle outside the test harness

The first reclaim pass handled current runtime storage and the new cache layout, but it missed prep workdirs stranded under the old runtime-scoped `_qemu_images/_build` path. This follow-up teaches reclaim to scan both layouts, reports the legacy build root explicitly, and documents the remaining limitation around detached anonymous Docker volumes that inspectors cannot attribute safely after their containers are gone. Constraint: Reclaim must stay inspectors-scoped and avoid deleting unrelated Docker resources Rejected: Global docker volume prune from the reclaim flow | too destructive for unrelated local workloads Confidence: high Scope-risk: narrow Reversibility: clean Directive: Keep reclaim logic compatible with both current and legacy inspectors storage layouts until old runtime caches are no longer in the wild Tested: cargo test -p guest-runtime --quiet; bun run test; bun run lint Not-tested: Real-world detached anonymous Docker volume recovery beyond inspectors-attributable state

Taskers is the best local app for exercising inspectors qemu product mode because it is a real user-facing GUI app with machine-verifiable companion tooling. The existing proof already launched Taskers and created a workspace, but it was pinned to an older bundle path and did not prove visible desktop changes strongly enough. This update auto-selects the newest local bundle when not overridden, records baseline/launch/before/after screenshots with hashes, verifies visible desktop changes, exercises desktop mouse input, and documents Taskers as the recommended qemu dogfood target. Constraint: Keep the proof local and reuse the existing Taskers bundle under ../taskers without adding external dependencies Constraint: Preserve machine-verifiable proof via taskersctl while strengthening visible desktop evidence Rejected: Switching the proof target to a generic GNOME utility | weaker real-world coverage and no companion machine proof Rejected: Keeping the old fixed Taskers bundle path | too brittle as local bundles evolve Confidence: high Scope-risk: narrow Reversibility: clean Directive: Keep the qemu dogfood target focused on apps that are both visibly interactive and machine-verifiable; Taskers remains the default until a better local target exists Tested: python3 -m py_compile scripts/run-taskers-qemu-dogfood.py python/sdk/client.py; ACU_BASE_URL=http://127.0.0.1:3100 python3 scripts/run-taskers-qemu-dogfood.py; ACU_BASE_URL=http://127.0.0.1:3100 python3 scripts/run-qemu-live-view-demo.py; bun run test; bun run lint Not-tested: Visual inspection of the saved screenshots by a human outside the captured artifact hashes and machine proof

Add review_recording session metadata, a sparse guest-runtime review ledger, control-plane export plumbing, and operator UI status so humans can retain useful qemu evidence without default video capture. Constraint: Review capture must stay truthful to the qemu product action plane Constraint: Runtime storage remains ephemeral unless explicitly exported Rejected: Continuous video capture | too storage-heavy for the requested goal Rejected: Raw noVNC protocol replay | higher complexity with worse review ergonomics Confidence: medium Scope-risk: moderate Reversibility: clean Directive: Keep guest-runtime as the sole review timeline writer; control-plane should append via requests only Tested: cargo test -p desktop-core -p guest-runtime --lib Tested: cargo test -p guest-runtime --test qemu_bridge Tested: cd apps/control-plane && bun run build && node --test dist/**/*.test.js Tested: cd apps/web-ui && node --test public/**/*.test.js Not-tested: End-to-end export against a live qemu product guest

Document the new review_recording session summary and review export route, and update the qemu proof scripts so they keep exported review bundles as part of later-review evidence. Constraint: Proof output should confirm exported bundles survive session teardown Rejected: Document default video capture | contradicts the storage-first design Confidence: high Scope-risk: narrow Reversibility: clean Directive: Keep v1 docs explicit that review export is sparse timeline evidence, not video replay Tested: python -m py_compile python/sdk/client.py scripts/run-qemu-live-view-demo.py scripts/run-taskers-qemu-dogfood.py Not-tested: Live execution of the qemu proof scripts against a real guest

Refresh the packaged control-plane bundle after the new qemu review export and operator-surface changes so the desktop app ships the same behavior as the source server/UI. Constraint: Packaged desktop resources are copied from built control-plane assets Rejected: Hand-editing bundled output | too easy to drift from the source build Confidence: high Scope-risk: narrow Reversibility: clean Directive: Regenerate desktop resources via the sync script after control-plane or web-ui changes Tested: bun run sync:desktop-resources Tested: node --test crates/desktop-app/resources/control-plane/ui/live-view.test.js Tested: node --check crates/desktop-app/resources/control-plane/ui/app.js Tested: node --check crates/desktop-app/resources/control-plane/ui/live-view.js Not-tested: End-to-end packaged desktop app launch after the bundle refresh

Rewrite exported timeline screenshot refs into the durable export bundle, serialize per-session review writes behind a session lock, and tighten the docs/proof scripts to reflect the storage-first export flow. Constraint: Exported bundles must remain reviewable after runtime teardown Constraint: Canonical review sequence numbers must stay stable under concurrent writes Rejected: Leaving runtime screenshot paths in exported timelines | exported bundles broke after session deletion Rejected: Relying on manifest counters without write serialization | could corrupt ordering under concurrent events Confidence: high Scope-risk: moderate Reversibility: clean Directive: Any future review event or export path must go through the session review lock before touching the manifest or timeline Tested: cargo test -p guest-runtime --lib --test qemu_bridge Tested: bun run build Tested: bun run lint Tested: bun run test Tested: python -m py_compile python/sdk/client.py scripts/run-qemu-live-view-demo.py scripts/run-taskers-qemu-dogfood.py Not-tested: Live qemu product export against a real VM after the concurrency fix

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 69a253297b

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Only failed review events or explicit error statuses should trigger temporary postmortem retention. Successful bridge_state_changed events must keep the bundle ephemeral so normal qemu teardowns still reclaim runtime storage immediately. Constraint: Successful qemu bridge transitions still emit review events but must not pin runtime storage Rejected: Keep bridge_state_changed as a postmortem trigger | leaks normal session runtime dirs until TTL expiry Confidence: high Scope-risk: narrow Reversibility: clean Directive: Treat postmortem retention as an error-path signal; do not widen it to healthy lifecycle milestones without a bounded retention design review Tested: cargo test -p guest-runtime --lib Tested: cargo test -p guest-runtime --test qemu_bridge Not-tested: Full workspace test/lint rerun after this narrow Rust-only follow-up

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: df14b55834

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Restrict startup cleanup to exited image-prep containers, persist liveness markers for prepare-build workdirs, and only report reclaim success after a remove actually succeeds so operators are not misled. Constraint: Running ensure-image jobs may overlap with guest-runtime startup and reclaim requests Rejected: Cleaning every acu-image-prep-* container on startup | kills active image builds Rejected: Reporting every reclaim candidate as removed | can claim success on failed deletes Confidence: high Scope-risk: narrow Reversibility: clean Directive: Any future prep-work cleanup must prove both liveness safety and truthful operator reporting Tested: cargo test -p guest-runtime --test qemu_bridge Tested: cargo test -p guest-runtime --lib Tested: python -m py_compile scripts/qemu_guest_assets.py Not-tested: Full workspace build/lint/test rerun after this Rust/Python-only follow-up

OneNoted added 8 commits April 15, 2026 10:45

chatgpt-codex-connector bot reviewed Apr 15, 2026

View reviewed changes

Comment thread crates/guest-runtime/src/lib.rs Outdated

chatgpt-codex-connector bot reviewed Apr 15, 2026

View reviewed changes

Comment thread crates/guest-runtime/src/lib.rs Outdated

Comment thread scripts/qemu_guest_assets.py

Comment thread crates/guest-runtime/src/lib.rs Outdated

OneNoted merged commit d4aaa1f into main Apr 15, 2026
6 checks passed

OneNoted deleted the qemu-product-review-bundles branch April 15, 2026 15:18

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Streamline the qemu product flow and export sparse review bundles#3

Streamline the qemu product flow and export sparse review bundles#3
OneNoted merged 10 commits intomainfrom
qemu-product-review-bundles

OneNoted commented Apr 15, 2026

Uh oh!

chatgpt-codex-connector bot left a comment

Uh oh!

Uh oh!

chatgpt-codex-connector bot left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

OneNoted commented Apr 15, 2026

Summary

Testing

Uh oh!

chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

Uh oh!

chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant