
feat(agentic-os): add unified ControlHub, Playbook system, and CDP browser control#419

Merged: GCWing merged 9 commits into GCWing:main from bobleer:feat/agentic-os-unified-control on Apr 16, 2026

@bobleer bobleer commented Apr 16, 2026

Summary

Transform BitFun from an AI IDE into an Agentic OS by introducing a unified control architecture:

  • ControlHub tool — single entry point for all control operations, routing by domain (desktop, browser, app, terminal, system) to existing backends (ComputerUse, SelfControl, TerminalControl) and new capabilities. Existing tools are retained for backward compatibility.
  • Playbook tool — YAML-based predefined operation playbooks for common tasks (browser screenshot, data extraction, form fill, desktop automation, model setup). Supports template variable substitution with automatic type preservation, conditional step filtering, and output variable annotations.
  • CDP browser control — connect to and control the user's default browser (Chrome, Edge, etc.) via Chrome DevTools Protocol, preserving login sessions, cookies, and extensions. Includes CdpClient (WebSocket transport), BrowserLauncher (cross-platform detection + CDP-enabled launch), BrowserActions (navigate, click, fill, snapshot, screenshot, evaluate JS, etc.), and Tauri desktop API commands.
  • Frontend UI — SessionConfig adds a browser control section with CDP status, connect button, and create-launcher button (macOS). All toast messages use i18n (en-US + zh-CN).
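
As a rough illustration of the domain routing described for ControlHub above, the sketch below maps a domain string to a backend. The `Domain` and `Backend` enums, the function names, and in particular which domain goes to which backend are assumptions made for illustration. The PR only states that ControlHub routes by domain to the existing ComputerUse, SelfControl, and TerminalControl backends plus new capabilities.

```rust
// Hypothetical sketch: names and the domain-to-backend mapping are assumptions,
// not the code in control_hub_tool.rs.

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Domain {
    Desktop,
    Browser,
    App,
    Terminal,
    System,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Backend {
    ComputerUse,     // existing tool
    SelfControl,     // existing tool
    TerminalControl, // existing tool
    CdpBrowser,      // assumed name for the new CDP browser backend
    SystemOps,       // assumed name for new system-level actions
}

/// Parse the `domain` field of a ControlHub call (case-insensitive).
pub fn parse_domain(raw: &str) -> Result<Domain, String> {
    match raw.trim().to_ascii_lowercase().as_str() {
        "desktop" => Ok(Domain::Desktop),
        "browser" => Ok(Domain::Browser),
        "app" => Ok(Domain::App),
        "terminal" => Ok(Domain::Terminal),
        "system" => Ok(Domain::System),
        other => Err(format!("unknown domain: {}", other)),
    }
}

/// Pick the backend that would handle a domain (assumed mapping).
pub fn route(domain: Domain) -> Backend {
    match domain {
        Domain::Desktop => Backend::ComputerUse,
        Domain::App => Backend::SelfControl,
        Domain::Terminal => Backend::TerminalControl,
        Domain::Browser => Backend::CdpBrowser,
        Domain::System => Backend::SystemOps,
    }
}

/// Validate a call and describe where it would be sent.
/// A real tool would invoke the backend here instead of returning a label.
pub fn dispatch(domain: &str, action: &str) -> Result<String, String> {
    let backend = route(parse_domain(domain)?);
    if action.trim().is_empty() {
        return Err("action must not be empty".to_string());
    }
    Ok(format!("{:?}::{}", backend, action.trim()))
}
```

With this shape, a call such as `dispatch("browser", "connect")` resolves to the CDP backend, while an unrecognized domain is rejected before any backend is touched. Keeping the routing table in one `match` is one way to let the existing tools stay registered independently.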

Key Design Decisions

  1. Backward compatible: ComputerUse, SelfControl, and TerminalControl tools remain registered; ControlHub is an additive unified layer.
  2. Atomic actions: all control operations are fine-grained atomic actions that agents can compose freely or follow Playbook guides.
  3. Type-safe templates: Playbook {{var}} substitution preserves native JSON types (numbers, booleans) instead of always producing strings.
  4. Condition engine: Playbook steps support "X is Y", "X is not Y", "X is provided" conditions for conditional execution.
  5. Cross-platform: system.open_app works on macOS (open -a), Windows (cmd /C start), and Linux (xdg-open).
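
The type-preserving substitution in decision 3 can be sketched as follows. This is a minimal illustration and not the code in playbook_tool.rs: the `Value` enum stands in for a JSON value type, and the rules that a template consisting of exactly one `{{var}}` returns the parameter's native value, that mixed templates interpolate into a string, and that a missing variable becomes `Null` are all assumptions.

```rust
use std::collections::HashMap;

// Hypothetical stand-in for a JSON value; the real engine presumably uses a JSON library.
#[derive(Debug, Clone, PartialEq)]
pub enum Value {
    Null,
    Bool(bool),
    Number(f64),
    Text(String),
}

/// Substitute `{{var}}` placeholders in `template`.
/// A template that is exactly one placeholder keeps the parameter's native type;
/// anything else is interpolated into a string.
pub fn substitute(template: &str, params: &HashMap<String, Value>) -> Value {
    if let Some(name) = whole_placeholder(template.trim()) {
        // Assumption: a missing variable resolves to Null rather than an error.
        return params.get(name).cloned().unwrap_or(Value::Null);
    }
    let mut out = String::new();
    let mut rest = template;
    while let Some(start) = rest.find("{{") {
        out.push_str(&rest[..start]);
        let after = &rest[start + 2..];
        match after.find("}}") {
            Some(end) => {
                // Unknown variables render as an empty string in mixed templates.
                if let Some(v) = params.get(after[..end].trim()) {
                    out.push_str(&render(v));
                }
                rest = &after[end + 2..];
            }
            None => {
                // Unterminated placeholder: keep the remaining text literally.
                out.push_str(&rest[start..]);
                rest = "";
            }
        }
    }
    out.push_str(rest);
    Value::Text(out)
}

/// Return the variable name if `s` is a single `{{name}}` and nothing else.
fn whole_placeholder(s: &str) -> Option<&str> {
    let inner = s.strip_prefix("{{")?.strip_suffix("}}")?;
    if inner.contains("{{") || inner.contains("}}") {
        return None;
    }
    Some(inner.trim())
}

/// String form of a value for use inside mixed templates.
fn render(v: &Value) -> String {
    match v {
        Value::Null => String::new(),
        Value::Bool(b) => b.to_string(),
        Value::Number(n) => n.to_string(),
        Value::Text(s) => s.clone(),
    }
}
```

Under these assumptions, `"{{width}}"` with `width = 1280` yields a number, whereas `"{{width}}px"` yields the string `"1280px"`. Treating the whole-placeholder case separately is what keeps numbers and booleans from being stringified.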

Files Changed

| Area | Files | Description |
| --- | --- | --- |
| Browser control | browser_control/ (4 files) | CDP client, browser launcher, high-level actions |
| ControlHub | control_hub_tool.rs | Unified routing tool (~725 lines) |
| Playbook | playbook_tool.rs + builtin_playbooks/ (5 YAML) | Playbook engine + built-in templates |
| Tool registry | registry.rs, mod.rs | Register new tools |
| Desktop API | browser_control_api.rs, lib.rs | Tauri commands for browser control |
| Frontend | SessionConfig.tsx | Browser control UI section |
| i18n | session-config.json (en-US, zh-CN) | Localization strings |

Test Plan

  • cargo check passes with no errors or warnings
  • npx tsc --noEmit passes for web-ui
  • Desktop app: verify browser control section appears in Session Config
  • Desktop app: click "Connect browser" with a running Chrome/Edge — verify CDP connection
  • Agent: test ControlHub { domain: "browser", action: "connect" } → navigate → snapshot flow
  • Agent: test Playbook { action: "run", name: "browser_screenshot", params: { url: "https://example.com" } }
  • Agent: test ControlHub { domain: "system", action: "open_app", app_name: "Calculator" }
  • Verify existing ComputerUse / SelfControl tools still work independently
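
The `open_app` item in the test plan exercises the per-platform launch commands listed under Key Design Decisions (`open -a`, `cmd /C start`, `xdg-open`). A minimal sketch of how such a command could be selected is shown below. The function names are hypothetical, and quoting of application names containing spaces on Windows is deliberately not handled.

```rust
use std::process::Command;

/// Build the argv that would launch `app_name` on the given OS.
/// `os` uses the values of `std::env::consts::OS` ("macos", "windows", "linux").
/// Hypothetical helper: the PR names the commands but not this function.
pub fn open_app_argv(os: &str, app_name: &str) -> Option<Vec<String>> {
    let argv: Vec<&str> = match os {
        "macos" => vec!["open", "-a", app_name],
        "windows" => vec!["cmd", "/C", "start", app_name],
        "linux" => vec!["xdg-open", app_name],
        _ => return None, // unsupported platform
    };
    Some(argv.into_iter().map(String::from).collect())
}

/// Prepare (but do not spawn) the launch command for the current platform.
pub fn open_app_command(app_name: &str) -> Option<Command> {
    let argv = open_app_argv(std::env::consts::OS, app_name)?;
    let mut cmd = Command::new(&argv[0]);
    cmd.args(&argv[1..]);
    Some(cmd)
}
```

Separating the argv construction from the `Command` makes the platform table checkable on any machine, since nothing is spawned until the caller runs the returned command.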

bowen628 added 9 commits April 14, 2026 13:26
…owser control

Introduce the Agentic OS unified control architecture with three major components:

1. **ControlHub tool** — single entry point for all control operations, routing
   by domain (desktop, browser, app, terminal, system) to the appropriate
   backend. Reuses existing ComputerUse, SelfControl, and TerminalControl tools
   while adding new browser CDP and system-level capabilities. Cross-platform
   open_app support (macOS/Windows/Linux).

2. **Playbook tool** — predefined YAML-based operation playbooks for common
   tasks (browser screenshot, data extraction, form fill, desktop automation,
   model setup). Features template variable substitution with type preservation
   (numbers/booleans stay native), conditional step filtering (`X is Y`,
   `X is not Y`, `X is provided`), and output variable annotations.

3. **CDP browser control** — connect to and control the user's default browser
   (Chrome, Edge, etc.) via Chrome DevTools Protocol, preserving login sessions,
   cookies, and extensions. Includes:
   - CdpClient: WebSocket-based CDP command transport
   - BrowserLauncher: cross-platform browser detection and CDP-enabled launch
   - BrowserActions: high-level atomic actions (navigate, click, fill, snapshot,
     screenshot, evaluate JS, etc.)
   - Desktop API: Tauri commands for status/launch/create-launcher

4. **Frontend UI** — SessionConfig adds browser control section with CDP status
   display, connect button, and create-launcher button (macOS). All toast
   messages use i18n (en-US + zh-CN).

Existing tools (ComputerUse, SelfControl, TerminalControl) are retained for
backward compatibility while ControlHub provides the unified interface.
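
The conditional step filtering named in this commit message (`X is Y`, `X is not Y`, `X is provided`) can be sketched as below. This is an illustration under assumptions, not the engine in playbook_tool.rs: parameters are plain strings here, a missing parameter satisfies `is not`, an empty string counts as not provided, and the `Step` struct and function names are hypothetical.

```rust
use std::collections::HashMap;

/// Evaluate one condition string against the playbook parameters.
/// The more specific forms are checked first so that "X is provided" and
/// "X is not Y" are not mistaken for the plain "X is Y" form.
pub fn condition_holds(condition: &str, params: &HashMap<String, String>) -> Result<bool, String> {
    let cond = condition.trim();
    if let Some(name) = cond.strip_suffix(" is provided") {
        // Assumption: an empty string counts as "not provided".
        return Ok(params.get(name.trim()).map_or(false, |v| !v.is_empty()));
    }
    if let Some((name, expected)) = cond.split_once(" is not ") {
        // Assumption: a missing parameter is "not" any value.
        return Ok(params.get(name.trim()).map(String::as_str) != Some(expected.trim()));
    }
    if let Some((name, expected)) = cond.split_once(" is ") {
        return Ok(params.get(name.trim()).map(String::as_str) == Some(expected.trim()));
    }
    Err(format!("unsupported condition: {}", cond))
}

/// Hypothetical playbook step: a name plus an optional condition.
pub struct Step {
    pub name: &'static str,
    pub condition: Option<&'static str>,
}

/// Keep the steps whose condition holds; unconditional steps always run.
pub fn runnable_steps(steps: &[Step], params: &HashMap<String, String>) -> Result<Vec<&'static str>, String> {
    let mut out = Vec::new();
    for step in steps {
        let keep = match step.condition {
            Some(c) => condition_holds(c, params)?,
            None => true,
        };
        if keep {
            out.push(step.name);
        }
    }
    Ok(out)
}
```

With `mode = "full"` and no `selector` parameter, a step guarded by `selector is provided` is skipped, while one guarded by `mode is full` runs. Returning an error for unrecognized condition text, rather than silently skipping the step, is a design choice of this sketch.
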
- Split BrowserKind::Arc from Unknown(name) pattern match to avoid
  E0408/E0381 on Windows and Linux cfg blocks
- Add `let _ = script` in non-macOS applescript branch to suppress
  unused variable warning
- Add cross-platform `shell` script_type for run_script action
- Allow unused log::debug import (used only in macOS-gated code)
@GCWing GCWing merged commit 76579e8 into GCWing:main Apr 16, 2026
4 checks passed