
refactor: middleware architecture for v3 - composable packages#1255

Open
Huijiro wants to merge 119 commits into main from v3

Conversation

Member

Huijiro commented Mar 23, 2026

Summary

This PR introduces a new modular architecture for Agentuity, breaking the "batteries included" @agentuity/runtime into composable packages. The goal is to let users choose their framework while Agentuity provides services as middleware.


What Changed

New Packages

| Package | Purpose |
| --- | --- |
| @agentuity/otel | OpenTelemetry initialization, Logger |
| @agentuity/local | Local development services (Bun SQLite) with runtime detection |
| @agentuity/services | Cloud services + event providers + local fallback |
| @agentuity/hono | agentuity() middleware for Hono |

Architecture

Before (v2):
createApp() → everything bundled together

After (v3):
agentuity() middleware
    ├── @agentuity/otel (tracing, logging)
    └── @agentuity/services
            ├── Cloud services (when SDK key present)
            ├── @agentuity/local (fallback, Bun only)
            └── Event providers (session, evalrun)
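
The selection order implied by the tree can be sketched as follows. This is a minimal illustration, not the PR's code: `selectServiceBackend`, the `RuntimeInfo` shape, and the `sdkKey` field are hypothetical names, and only the rule itself (cloud when an SDK key is present, otherwise the Bun-only local fallback) comes from the description.

```typescript
// Hypothetical sketch: pick a service backend the way the tree above describes.
type Backend = "cloud" | "local" | "none";

interface RuntimeInfo {
  sdkKey?: string; // assumed: however the SDK key reaches the process
  isBun: boolean;  // assumed: result of runtime detection
}

function selectServiceBackend(info: RuntimeInfo): Backend {
  if (info.sdkKey && info.sdkKey.length > 0) return "cloud"; // SDK key present
  if (info.isBun) return "local";                            // Bun-only fallback
  return "none";                                             // no local services off Bun
}

// One common way to detect Bun; the PR may detect the runtime differently.
const isBun = "Bun" in globalThis;
console.log(selectServiceBackend({ isBun }));
```

Keeping the decision in a pure function makes the cloud/local split testable without touching real services.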

Usage

// Before (v2) - deprecated
import { createApp } from '@agentuity/runtime';
export default await createApp({ router, agents });

// After (v3)
import { Hono } from 'hono';
import { agentuity } from '@agentuity/hono';

const app = new Hono();
app.use('*', agentuity());

app.get('/data', async (c) => {
  const { kv, logger } = c.var;
  const data = await kv.get('key');
  return c.json(data);
});

export default app;

Completed Migrations

  • OTel extraction - @agentuity/otel with registerOtel(), Logger
  • Logger interface - Single source in @agentuity/core (removed duplicate)
  • Local services - @agentuity/local with Bun SQLite + runtime detection
  • Event providers - SessionEventProvider, EvalRunEventProvider moved to services
  • Services initialization - Cloud vs local auto-detection
  • Hono middleware - agentuity() composes OTel + services
  • Deprecation warning - createApp() warns users

Strategic Questions (Open for Discussion)

These are fundamental architectural decisions that need community/team input:

1. Agent Orchestration: Build vs Integrate?

Current state: @agentuity/runtime has its own createAgent() with:

  • Agent registry
  • Agent-to-agent invocation (ctx.invoke())
  • Agent middleware

Options:

| Approach | Pros | Cons |
| --- | --- | --- |
| Build @agentuity/agents | Full control, tight integration with services | Maintenance burden, reinventing the wheel |
| Integrate with Vercel AI SDK | Industry standard, great DX, actively maintained | Vendor lock-in concerns, may not fit all use cases |
| Integrate with TanStack AI | Framework-agnostic, growing ecosystem | Newer, less mature |
| Hybrid - provider plugins | Best of both worlds, user choice | More complexity, multiple codepaths |

Questions:

  • What's the value proposition of our own agent abstraction?
  • Can we provide Agentuity services as plugins to existing frameworks?
  • Do we need agent-to-agent invocation, or is that an anti-pattern?

2. Conversation/Chat: Build vs Integrate?

Current state: packages/runtime/src/session.ts (~2000 lines) with:

  • Thread, Session management
  • ThreadProvider, SessionProvider interfaces
  • Cloud WebSocket persistence
  • Local SQLite persistence

Options:

| Approach | Pros | Cons |
| --- | --- | --- |
| Build @agentuity/chat | Tailored for our use case | Significant maintenance, may diverge from standards |
| Provide primitives only | Flexible, users choose their stack | Less out-of-the-box value |
| Integrate with Vercel AI SDK | Built-in chat/conversation patterns | Locks users into that ecosystem |
| Integrate with TanStack AI | Framework-agnostic conversation state | Newer, patterns still emerging |

Key consideration: Thread/Session has Hono-specific code (cookies, context). Extracting requires making Hono a peer dependency.

Questions:

  • Is conversation state a core product feature or a commodity?
  • Should we focus on storage primitives and let frameworks handle the abstraction?
  • How much value does our Thread/Session abstraction provide over alternatives?

3. Agent Evaluation System

Current state: @agentuity/evals package with:

  • Evaluation framework for agents
  • Built-in evaluators (quality, safety, performance)
  • Custom evaluator support
  • Integration with EvalRunEventProvider (now in @agentuity/services)
  • Results storage and reporting

Options:

| Approach | Pros | Cons |
| --- | --- | --- |
| Keep as-is, @agentuity/evals | Already extracted, working | May need updates for new architecture |
| Integrate with existing eval frameworks | Leverage ecosystem tools | May not fit our agent model |
| Make framework-agnostic | Works with any agent implementation | More abstraction, less tight integration |
| Deprecate, recommend external tools | Less maintenance | Loses unique value prop |

Questions:

  • Is agent evaluation a core differentiator for Agentuity?
  • Should evals be tied to our agent abstraction or work with any agent?
  • How does this integrate with third-party agent frameworks (Vercel AI SDK, TanStack)?
  • Should we support third-party eval frameworks (e.g., Ragas, DeepEval)?

4. Framework Adapters

Not started:

  • @agentuity/next - Next.js middleware
  • @agentuity/express - Express middleware
  • @agentuity/fastify - Fastify middleware

Questions:

  • Which frameworks should we prioritize?
  • Should we wait for the agent/chat/evals decisions first?
  • Can we provide service injection without framework-specific packages?

5. Local Services for Non-Bun Runtimes

Current: @agentuity/local only supports Bun (SQLite)

Potential runtimes:

  • Node.js (better-sqlite3, libsql)
  • Cloudflare Workers (D1, KV)
  • Deno (native SQLite)

Questions:

  • Should we add Node.js support now?
  • Use Drizzle ORM for cross-platform?
  • Keep it Bun-only, expect cloud services for other runtimes?

Breaking Changes

This is a v3 breaking change with no backward compatibility guarantees:

  1. createApp() is deprecated (will be removed)
  2. Direct imports from @agentuity/runtime internals will break
  3. Service access pattern changes from globals to c.var in Hono
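
To make the third change concrete, here is a framework-free sketch of per-request injection in the style of `c.var`. Everything in it (`Ctx`, `injectServices`, `memoryKV`, `handle`, `readKey`) is hypothetical and stands in for what Hono and @agentuity/hono provide; it only illustrates why handlers read services from the context rather than from globals.

```typescript
// Hypothetical, framework-free sketch of per-request service injection.
interface KV { get(key: string): Promise<string | undefined>; }
interface Services { kv: KV; }
interface Ctx { var: Partial<Services>; }
type Next = () => Promise<void>;
type Middleware = (c: Ctx, next: Next) => Promise<void>;

// In-memory stand-in for a real key-value service.
function memoryKV(seed: Record<string, string>): KV {
  const store = new Map(Object.entries(seed));
  return { get: async (key) => store.get(key) };
}

// Middleware that attaches services to the request context instead of a global.
function injectServices(services: Services): Middleware {
  return async (c, next) => {
    c.var = { ...c.var, ...services };
    await next();
  };
}

// Runs one middleware around one handler, with a fresh context per call.
async function handle(mw: Middleware, handler: (c: Ctx) => Promise<string>): Promise<string> {
  const c: Ctx = { var: {} };
  let result = "";
  await mw(c, async () => { result = await handler(c); });
  return result;
}

// Handler that depends only on what the context carries.
async function readKey(c: Ctx): Promise<string> {
  if (!c.var.kv) throw new Error("kv service was not injected");
  return (await c.var.kv.get("key")) ?? "missing";
}

handle(injectServices({ kv: memoryKV({ key: "value" }) }), readKey).then((v) => console.log(v));
```

Because the handler never touches module-level state, the same code can run against cloud or local services depending on what the middleware injects.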

Decision Timeline

| Topic | Blocking? | Target Decision |
| --- | --- | --- |
| Agent orchestration approach | Yes - blocks @agentuity/agents | Pre-merge discussion |
| Chat/conversation approach | Partially - blocks extraction | Post-merge, pre-next-release |
| Evaluation system direction | Yes - relates to agent decision | Pre-merge discussion |
| Framework adapters | No | Post-merge, demand-driven |
| Non-Bun local services | No | Post-merge, demand-driven |

Looking for feedback on:

  1. Agent orchestration: build our own vs integrate with AI SDKs?
  2. Chat/conversation: extract or defer?
  3. Evaluation system: keep, integrate, or deprecate?
  4. Priority order for follow-up work

Summary by CodeRabbit

  • Chores

    • Consolidated CI into a single local testing job, removed multiple CI jobs and legacy test workflows.
    • Removed numerous demo apps, docs, sample configs, and shell-based deployment test scripts.
    • Deleted Prettier config/ignore files.
    • Updated framework demo test to build before running and increased timeouts.
  • New Features

    • Added Bun-powered integration/E2E test servers and a new HTTP-focused test suite.

Huijiro and others added 30 commits March 10, 2026 14:39
…ead AST code

- Workbench schema endpoint now generates TypeScript interface syntax
  from JSON Schema (via runtime toJSONSchema) instead of requiring
  Zod source strings from the AST-extracted metadata. Falls back to
  metadata strings only when no runtime schema is available.

- Add jsonSchemaToTypeScript() utility in workbench.ts that converts
  JSON Schema → clean TypeScript type notation for display:
  { name: string; age: number; tags?: string[] }

- Delete findCreateAppEndPosition() from ast.ts — dead code, exported
  but never imported anywhere.

- 20 new tests covering all JSON Schema → TypeScript conversions:
  primitives, objects, optionals, descriptions, arrays, unions,
  intersections, enums, literals, nullables, records, nested objects.
…hemaToTypeScript

Address CodeRabbit review feedback:
- Escape quotes, backslashes, and newlines in const/enum string values
- Quote property keys that aren't valid JS identifiers (hyphens, spaces, leading digits)
- 3 new test cases covering both fixes
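
For readers unfamiliar with this kind of conversion, a reduced sketch follows. It is not the jsonSchemaToTypeScript() in workbench.ts: the `JsonSchema` subset, `toTypeScript`, and the helper names are assumptions, and it covers only primitives, arrays, objects, enums, and anyOf unions, plus the key quoting and string escaping mentioned in the review fix.

```typescript
// Hypothetical sketch; not the implementation in workbench.ts.
interface JsonSchema {
  type?: "string" | "number" | "integer" | "boolean" | "null" | "array" | "object";
  properties?: Record<string, JsonSchema>;
  required?: string[];
  items?: JsonSchema;
  enum?: Array<string | number | boolean | null>;
  anyOf?: JsonSchema[];
}

const IDENTIFIER = /^[A-Za-z_$][A-Za-z0-9_$]*$/;

// JSON.stringify escapes quotes, backslashes, and newlines in string literals.
const literal = (v: string | number | boolean | null): string => JSON.stringify(v);

function toTypeScript(schema: JsonSchema): string {
  if (schema.enum) return schema.enum.map(literal).join(" | ");
  if (schema.anyOf) return schema.anyOf.map(toTypeScript).join(" | ");
  switch (schema.type) {
    case "string": return "string";
    case "number":
    case "integer": return "number";
    case "boolean": return "boolean";
    case "null": return "null";
    case "array": {
      const item = schema.items ? toTypeScript(schema.items) : "unknown";
      // Parenthesize unions so `(a | b)[]` is not read as `a | b[]`.
      return item.includes(" | ") ? `(${item})[]` : `${item}[]`;
    }
    case "object": {
      const required = new Set(schema.required ?? []);
      const fields = Object.entries(schema.properties ?? {}).map(([key, value]) => {
        // Quote keys that are not valid identifiers (hyphens, spaces, leading digits).
        const name = IDENTIFIER.test(key) ? key : literal(key);
        return `${name}${required.has(key) ? "" : "?"}: ${toTypeScript(value)}`;
      });
      return fields.length > 0 ? `{ ${fields.join("; ")} }` : "{}";
    }
    default: return "unknown";
  }
}

const example: JsonSchema = {
  type: "object",
  properties: {
    name: { type: "string" },
    age: { type: "number" },
    tags: { type: "array", items: { type: "string" } },
  },
  required: ["name", "age"],
};
console.log(toTypeScript(example)); // { name: string; age: number; tags?: string[] }
```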
Remove 402 lines of dead code from the AST pipeline:

- Delete analyzeWorkbench() + parseConfigObject() — only imported by
  the dead workbench.ts file, never used in production
- Delete checkFunctionUsage() — exported but never imported anywhere
- Delete checkRouteConflicts() — exported but never imported anywhere
- Delete WorkbenchAnalysis interface — only used by dead code
- Remove WorkbenchConfig import from ast.ts (no longer needed)
- Delete packages/cli/src/cmd/build/workbench.ts entirely — the whole
  file was dead code (getWorkbench, generateWorkbenchMainTsx, etc.
  are superseded by vite/workbench-generator.ts)
- Remove analyzeWorkbench tests from ast.test.ts (testing dead code)

ast.ts: 3,526 → 3,124 lines (402 lines removed, cumulative with
previous findCreateAppEndPosition deletion)
The lifecycle generator now uses TypeScript's own type checker to
extract the setup() return type instead of walking AST literals and
guessing types from values. This handles:

- Inline setup in createApp({ setup: () => ... })
- Exported setup functions (function decl or const arrow)
- Shorthand property: createApp({ setup })
- Variable references: setup: () => someVar
- Async functions (Promise unwrapping)
- Any pattern TypeScript itself can resolve

Also extract getDevmodeDeploymentId into ids.ts (pure hash, not AST).

ast.ts consumers remaining: only parseRoute (route-discovery.ts)
…iscovery + app-router-detector

createRouter() no longer wraps Hono methods — it's now just `new Hono()`
with Agentuity's Env type. This preserves Hono's full Schema type inference
chain, enabling `typeof router` to encode all route types.

The routeId lookup (for OTel spans) and returnResponse auto-conversion that
createRouter previously did will move to entry-file middleware in a follow-up.

agent-discovery.ts: rewritten to import() agent files at build time instead
of AST-parsing with acorn-loose. The agent instance already knows its own
metadata, schemas, and evals. Schemas are now extracted as JSON Schema
strings via toJSONSchema() instead of Zod source strings via astring.

app-router-detector.ts: rewritten to use TypeScript's compiler API instead
of acorn-loose. Detects createApp({ router }) patterns for explicit routing.

Both rewrites eliminate acorn-loose/astring usage from their respective files.
Only ast.ts itself still imports acorn-loose (for parseRoute, used by
route-discovery.ts).

Tests: 18 agent-discovery tests, 8 app-router-detector tests, 8 lifecycle
tests, dev-registry-generation tests all pass. Runtime: 665 tests pass.
- Delete ast.ts (3,120 lines) — entire acorn-loose + astring AST pipeline
- Delete route-migration.ts (793 lines) — file-based routing migration
- Delete api-mount-path.ts (87 lines) — file-based path computation
- Remove acorn-loose + astring from package.json
- Remove file-based routing fallback from entry-generator.ts
- Remove migration prompts from dev/index.ts and vite-bundler.ts
- Remove src/api/ directory watcher from file-watcher.ts
- Remove migrateRoutes CLI option
- Delete 15 test files testing deleted AST/file-based routing code
- Rewrite route-discovery + dev-registry tests for new architecture

Net: -13,073 lines deleted, +199 lines added
- Import toForwardSlash from normalize-path.ts instead of duplicating
- Replace existsSync with Bun.file().exists() in lifecycle-generator,
  app-router-detector, and agent-discovery
- Import toJSONSchema from @agentuity/schema public entry point (resolved
  from user's node_modules) instead of reaching into src/ internals
- Remove createAgent substring gate — check exported value shape instead,
  supporting re-exported agents
- Default createRouter S generic to BlankSchema ({}) to match Hono 4.7.13
- Migrate integration-suite, e2e-web, svelte-web, auth-package-app,
  webrtc-test, nextjs-app, tanstack-start, vite-rsc-app to explicit
  createApp({ router }) pattern
- Create combined router.ts files for apps with multiple route files
- Expose agent.evals on AgentRunner (was missing, breaking eval discovery)
- Deduplicate agents by name (re-exported agents from index.ts)
- Update route-metadata-nested tests for explicit routing
…estart loop

- Runtime: createApp() returns fetch/port/hostname for bun --hot to
  hot-swap the server's request handler without process restart
- Runtime: skip Bun.serve() in dev mode (bun --hot manages server
  via default export)
- Runtime: add idempotent OTel re-registration guard for hot reloads
- Runtime: pass cors/compression config directly to middleware instead
  of lazy global lookup via getAppConfig()
- Runtime: remove getAppState/getAppConfig/setAppConfig globals
  (config passed directly, app state was always {})
- Runtime: add typed _globals.ts for Symbol.for() state and globals.d.ts
  for string-keyed globalThis properties, eliminating unsafe casts
- Runtime: use Symbol.for() pattern in _process-protection.ts
- Runtime: guard one-time log messages (server started, local services)
  to prevent reprinting on hot reloads
- Runtime: downgrade internal port messages to debug level
- CLI: use bun --hot --no-clear-screen for backend subprocess
- CLI: remove file-watcher.ts usage, restart loop, stopBunServer,
  cleanupForRestart — bun --hot handles all backend HMR
- CLI: run 'Preparing dev server' once at startup instead of on
  every file change (~490 lines removed from dev/index.ts)
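
The Symbol.for() guards listed above can be illustrated with a small sketch. The helper name `once` and the key string are hypothetical; the point is only that a symbol from the global registry survives module re-evaluation under hot reload, so initialization (such as OTel registration or a one-time log line) runs a single time per process.

```typescript
// Hypothetical sketch of an idempotent, hot-reload-safe initializer.
const registry = globalThis as unknown as Record<symbol, unknown>;

function once<T>(name: string, init: () => T): T {
  const key = Symbol.for(name); // same symbol across module re-evaluations
  if (!(key in registry)) registry[key] = init();
  return registry[key] as T;
}

let initCount = 0;
const first = once("example.otel.registered", () => {
  initCount += 1;
  return { startedAt: Date.now() };
});
// A second call, as would happen after a hot reload, reuses the stored value.
const second = once("example.otel.registered", () => {
  initCount += 1;
  return { startedAt: 0 };
});
console.log(first === second, initCount); // true 1
```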
In production mode, startServer() already calls Bun.serve() on the
configured port. Bun v1.2+ also auto-serves when the default export
has fetch + port properties (added in c98ce19 for --hot support),
causing a second bind attempt and EADDRINUSE.

Strip fetch/port/hostname from the returned AppResult in production
so only the explicit Bun.serve() is active. Dev mode keeps them for
bun --hot auto-serve.
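
A minimal sketch of that stripping step, assuming a simplified `AppResult` shape (the real type in @agentuity/runtime is not shown here, and `toDefaultExport` is a hypothetical name):

```typescript
// Hypothetical shapes; the real AppResult in @agentuity/runtime may differ.
type FetchHandler = (req: unknown) => unknown;

interface AppResult {
  label: string;
  fetch?: FetchHandler;
  port?: number;
  hostname?: string;
}

function toDefaultExport(result: AppResult, mode: "development" | "production"): AppResult {
  if (mode === "development") return result; // bun --hot serves from these fields
  // Production already called Bun.serve() explicitly, so drop the auto-serve fields
  // to avoid a second bind attempt on the same port.
  const stripped: AppResult = { ...result };
  delete stripped.fetch;
  delete stripped.port;
  delete stripped.hostname;
  return stripped;
}

const appResult: AppResult = { label: "app", fetch: () => "ok", port: 3500, hostname: "127.0.0.1" };
console.log(Object.keys(toDefaultExport(appResult, "production")));
```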
Resolve conflicts:
- modify/delete: keep v2's deletions of generated files (app.ts, routes.ts),
  ast.ts, and route-migration.ts — superseded by v2's import-based architecture
- agent-discovery.ts: keep v2's import-based version, port duplicate eval name
  detection from main (cab51e2)
- dev/index.ts: keep v2's bun --hot version — main's file-watcher restart loop
  fixes (5b7f9b8) don't apply since v2 removed the restart loop

Auto-merged from main:
- Gateway URL fallback update (agentuity.ai → catalyst.agentuity.cloud)
- Windows path fix for AI SDK patches (buildPatchFilter)
- Task status aliases, sandbox events, OIDC commands, monitoring
- Coder TUI updates, API reference docs, various CLI fixes
Bun --hot creates the server from the default export's properties.
Without the websocket handler, WebSocket upgrades fail with:
'To enable websocket support, set the "websocket" object in Bun.serve({})'

Add websocket from hono/bun to the AppResult (and strip it in
production alongside fetch/port/hostname).
json5 was not declared as a dependency in cli/package.json, causing
a type error. Use the existing parseJSONC utility (from utils/jsonc)
which handles tsconfig.json comments and trailing commas.
## @agentuity/migrate package (new)

A CLI tool to migrate v1 projects to v2:
- `npx @agentuity/migrate` — guided migration with codemods
- Deletes `src/generated/` directory
- Removes `bootstrapRuntimeEnv()` call from app.ts
- Transforms routes from createRouter() mutable style to new Hono<Env>() chained
- Generates src/api/index.ts and src/agent/index.ts barrels
- Adds migration comments for setup/shutdown lifecycle
- Guides on agentuity.config.ts deprecation
- Detects frontend using removed APIs (createClient, useAPI, RPCRouteRegistry)
- Runs bun install and typecheck post-migration

## agentuity.config.ts deprecation

- New app-config-extractor.ts extracts analytics/workbench from createApp()
- config-loader.ts emits deprecation warning when loading agentuity.config.ts
- getWorkbenchConfig() now prefers runtime config from createApp()
- dev/index.ts and vite-builder.ts use loadRuntimeConfig()

Config consolidation in v2:
- Runtime config (analytics, workbench, cors, etc.) → createApp() only
- Vite config (plugins, define, render, bundle) → vite.config.ts
- agentuity.config.ts → deprecated, delete entirely

## Documentation

- Updated migration-guide.mdx with v1→v2 tab
- Includes automated migration instructions and manual steps
- Covers all breaking changes and troubleshooting
This commit consolidates several v2 improvements:

### bun-dev-server error diagnostics
- Add app.ts validation to detect v1 pattern (destructuring without export default)
- Capture Bun stderr/stdout and show in error messages
- Add port cleanup with ensurePortAvailable() to kill orphan processes
- Warn before starting if app.ts has common issues
- Export validation functions for testing

### Process manager for dev mode
- New ProcessManager class to track all spawned processes/servers
- Ordered cleanup (LIFO for processes)
- Force kill fallback after timeout
- Integrated into dev/index.ts for cleanup on failure/shutdown
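
The cleanup behavior described above (LIFO order, force kill after a timeout) might look roughly like this. The class and interface names are hypothetical and the sketch tracks abstract stop/kill callbacks rather than real child processes.

```typescript
// Hypothetical sketch; the PR's ProcessManager is not shown in this thread.
interface Managed {
  label: string;
  stop(): Promise<void>; // graceful shutdown
  kill(): void;          // forced shutdown
}

class ProcessManagerSketch {
  private readonly stack: Managed[] = [];

  track(item: Managed): void {
    this.stack.push(item);
  }

  // Stops items in reverse registration order, force-killing any that exceed timeoutMs.
  async cleanup(timeoutMs: number): Promise<string[]> {
    const log: string[] = [];
    while (this.stack.length > 0) {
      const item = this.stack.pop() as Managed;
      let timer: ReturnType<typeof setTimeout> | undefined;
      const timedOut = new Promise<"timeout">((resolve) => {
        timer = setTimeout(() => resolve("timeout"), timeoutMs);
      });
      const outcome = await Promise.race([item.stop().then(() => "stopped" as const), timedOut]);
      if (timer !== undefined) clearTimeout(timer);
      if (outcome === "timeout") item.kill();
      log.push(`${item.label}:${outcome === "timeout" ? "killed" : "stopped"}`);
    }
    return log;
  }
}

const pm = new ProcessManagerSketch();
pm.track({ label: "vite", stop: async () => {}, kill: () => {} });
pm.track({ label: "bun", stop: () => new Promise<void>(() => {}), kill: () => {} }); // never stops gracefully
pm.cleanup(20).then((log) => console.log(log)); // "bun" is handled first (LIFO)
```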

### Remove agentuity.config.ts support
- Deleted loadAgentuityConfig from config-loader.ts
- getWorkbenchConfig now only takes (dev, runtimeConfig) - no config file fallback
- Users must use vite.config.ts for Vite config
- Users must use createApp() for runtime config (workbench, analytics)

### Remove auto-adding React plugin
- Vite no longer auto-adds @vitejs/plugin-react
- Users must configure frontend framework in vite.config.ts

### Deprecate @agentuity/react
- Added deprecation notice to README.md and package.json
- @agentuity/auth no longer depends on @agentuity/react
- AuthProvider now accepts callback props instead of relying on AgentuityProvider

### Migrate tool updates
- Detect missing vite.config.ts when frontend exists
- Detect deprecated @agentuity/react API usage
- Detect agentuity.config.ts and suggest migration

Tests: Updated workbench tests, removed define-config test (obsolete), added process-manager tests
Tests verify:
- publicDir is set correctly in dev mode config
- Public files are served at root paths in dev
- Public files maintain directory structure
- Various file types are handled correctly
- Edge cases (empty folder, hidden files, subdirectories)
- Integration with vite-builder functions
Add tests for dev server orchestration covering:

- dev-lock.test.ts: Lockfile management, orphan process cleanup,
  edge cases for corrupted/missing lockfiles

- ws-proxy.test.ts: Front-door TCP proxy routing decisions,
  error handling, URL parsing, query strings

- dev-server-integration.test.ts: Full lifecycle testing,
  crash recovery, hot reload validation, error resilience

All 60 tests pass covering:
- Startup/shutdown with port cleanup
- Hot reload behavior (Bun --hot, Vite HMR)
- Crash recovery (SIGTERM/SIGKILL escalation)
- WS proxy routing (HTTP→Vite, WS upgrade→Bun)
- Error resilience (TypeScript errors, v1 patterns)
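
As an illustration of the routing rule named above (HTTP to Vite, WebSocket upgrade to Bun), here is a hedged sketch of the decision alone. `routeRequest` and the lowercase-header convention are assumptions; the real proxy may apply additional rules.

```typescript
// Hypothetical sketch of the front-door routing decision; targets are labels, not addresses.
type Target = "vite" | "bun";

// Assumes header names have already been lowercased by the caller.
function routeRequest(headers: Record<string, string | undefined>): Target {
  const upgrade = (headers["upgrade"] ?? "").toLowerCase();
  const connection = (headers["connection"] ?? "").toLowerCase();
  // A WebSocket handshake sets Upgrade: websocket and lists "upgrade" in Connection.
  const isWebSocket =
    upgrade === "websocket" && connection.split(",").some((token) => token.trim() === "upgrade");
  return isWebSocket ? "bun" : "vite";
}

console.log(routeRequest({ upgrade: "websocket", connection: "keep-alive, Upgrade" }));
console.log(routeRequest({ accept: "text/html" }));
```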
Merge main (1.0.54) into v2 branch.

Resolution strategy:
- Deleted files (v2): Kept v2's removal of src/generated/*, ast.ts, route-migration.ts
- Dev server files: Kept v2's no-bundle architecture with bun --hot
- Package versions: Took main's higher versions
- New features from main: Accepted (oauth, sandbox jobs, service packages)
- API docs: Took main's updated documentation

Key changes merged from main:
- New standalone service packages (@agentuity/db, @agentuity/email, etc.)
- OAuth service support
- Sandbox job commands
- Updated CLI commands for all cloud services
- API reference documentation updates
Since React is no longer auto-added by the CLI, each project with a
frontend needs its own vite.config.ts with the appropriate plugins.

Added vite.config.ts for:
- apps/docs (React + Tailwind + MDX + TanStack Router)
- apps/testing/e2e-web (React)
- apps/testing/cloud-deployment (React)
- apps/testing/integration-suite (React)
- apps/testing/auth-package-app (React)
- apps/testing/oauth (React)
- apps/testing/webrtc-test (React)
- apps/testing/svelte-web (Svelte - pending investigation for CLI build)

Updated vite-builder.ts to properly merge user vite.config.ts:
- User plugins now come FIRST (important for framework plugins like Svelte)
- User config values are preserved unless overridden by Agentuity-specific needs
- Removed mergeConfig in favor of explicit spread to avoid array merge issues
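
The explicit-spread merge can be sketched as below. The `ConfigLike` and `PluginLike` shapes and the `mergeUserConfig` name are hypothetical simplifications of a Vite config; the sketch only shows the two properties called out above: user plugins first, and required values overriding user values on conflict.

```typescript
// Hypothetical shapes; real Vite configs have many more fields.
interface PluginLike { name: string; }
interface ConfigLike {
  root?: string;
  plugins?: PluginLike[];
  define?: Record<string, string>;
}

function mergeUserConfig(user: ConfigLike, required: ConfigLike): ConfigLike {
  return {
    ...user,     // user values are the baseline
    ...required, // tool-specific values win on conflict
    // Arrays and maps are combined explicitly so ordering stays predictable.
    plugins: [...(user.plugins ?? []), ...(required.plugins ?? [])],
    define: { ...(user.define ?? {}), ...(required.define ?? {}) },
  };
}

const merged = mergeUserConfig(
  { root: "web", plugins: [{ name: "svelte" }], define: { A: "1" } },
  { root: ".", plugins: [{ name: "agentuity" }], define: { B: "2" } },
);
console.log(merged.plugins?.map((p) => p.name)); // [ 'svelte', 'agentuity' ]
```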

Note: Svelte builds work when Vite (v8.0.1) is run directly, but fail when built
through the CLI. This requires further investigation into how the
Svelte plugin interacts with the CLI's build process.
…lity

## Problem
Svelte 5 builds failed when invoked through CLI's programmatic viteBuild()
call, but worked correctly with `bunx vite build`. The error showed the
Svelte compile plugin receiving already-compiled JavaScript instead of
Svelte source code.

## Root Cause
Bun's module loading system has issues with Vite's plugin pipeline when
importing Vite and calling build() programmatically. Certain plugins like
@sveltejs/vite-plugin-svelte receive already-compiled code, possibly due to
module state caching or transformation order issues.

## Solution
For client builds, spawn `bun x vite build` as a subprocess instead of
importing Vite and calling build() programmatically. This gives Vite complete
control over its module loading and plugin execution, avoiding Bun's
module system entirely.

Workbench builds continue using programmatic viteBuild() since those use
our own React plugin without external framework plugins.

## Additional Changes
- Updated vite.config.ts for all test projects to include root and input
  path (required when spawning vite as subprocess)
- Updated svelte-web agentuity.config.ts to v2 format (removed plugins)
- Removed temporary svelte.config.js that was added during debugging

## Testing
All test projects now build successfully:
- apps/testing/e2e-web (React)
- apps/testing/svelte-web (Svelte 5)
- apps/testing/cloud-deployment (React)
- apps/testing/integration-suite (React)
- apps/testing/auth-package-app (React)
- apps/testing/oauth (React)
- apps/testing/webrtc-test (React)
The migrate package was missing from the root tsconfig.json references,
causing it to not be built during CI builds.
The evals package depends on @agentuity/runtime and @agentuity/schema
but was missing the TypeScript project references, causing build failures.
Since the CLI now spawns vite as a subprocess for client builds,
projects need a vite.config.ts file with the proper input path.

Added:
- templates/_base/vite.config.ts with React plugin and input path
- vite and @vitejs/plugin-react to devDependencies in package.json
When vite.config.ts sets root='.' and input='src/web/index.html',
vite outputs the HTML at client/src/web/index.html instead of
client/index.html. The runtime now checks both locations.

This fixes cloud deployment tests that were failing because
the analytics beacon injection couldn't find the HTML file.
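
A small sketch of the two-location lookup, assuming a caller-supplied `exists` predicate so it stays independent of any file-system API. `findClientHtml` and the lookup order are assumptions; only the two candidate paths come from the commit message.

```typescript
// Hypothetical sketch: look for the built HTML in both output layouts.
function findClientHtml(
  exists: (path: string) => boolean,
  input = "src/web/index.html",
): string | undefined {
  const candidates = ["client/index.html", `client/${input}`];
  return candidates.find(exists);
}

// Simulate a build that kept the input's directory structure.
const built = new Set(["client/src/web/index.html"]);
console.log(findClientHtml((p) => built.has(p))); // client/src/web/index.html
```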
Huijiro and others added 10 commits April 13, 2026 14:57
Posts (or updates) a comment on the PR with:
- Published version and npm tag
- Install commands for CLI and core
- Full list of published packages with versions

Uses an HTML marker comment to find and update existing comments
instead of posting new ones on each push.
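
The marker-based update can be sketched over an in-memory list. The marker text, the `PrComment` shape, and `upsertReport` are hypothetical; a real workflow would list and edit comments through the GitHub API instead.

```typescript
// Hypothetical sketch over an in-memory list of PR comments.
interface PrComment { id: number; body: string; }

const MARKER = "<!-- publish-report -->"; // assumed marker text

function upsertReport(comments: PrComment[], report: string): PrComment[] {
  const body = `${MARKER}\n${report}`;
  const existing = comments.find((c) => c.body.includes(MARKER));
  if (existing) {
    // Update in place so repeated pushes do not add new comments.
    return comments.map((c) => (c.id === existing.id ? { ...c, body } : c));
  }
  const nextId = comments.reduce((max, c) => Math.max(max, c.id), 0) + 1;
  return [...comments, { id: nextId, body }];
}

let thread: PrComment[] = [{ id: 1, body: "LGTM" }];
thread = upsertReport(thread, "Published 3.0.0-alpha.1");
thread = upsertReport(thread, "Published 3.0.0-alpha.2");
console.log(thread.length); // 2
```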
…tion

npm trusted publishers with sigstore provenance require repository.url
to match the GitHub repo. All packages were missing this field, causing
E422 'Error verifying sigstore provenance bundle' during publish.
…L env support

@ai-sdk/openai v1 doesn't read OPENAI_BASE_URL from process.env —
it only accepts baseURL as a constructor option. v3 uses
loadOptionalSetting with OPENAI_BASE_URL, so agentuity dev's gateway
injection works automatically without code changes in user apps.

ai v6 is required for compatibility with @ai-sdk/openai v3 (model
spec version v2).
catalyst.agentuity.cloud doesn't exist — must be catalyst-{region}.
Default to catalyst-usc.agentuity.cloud. Also check AGENTUITY_CATALYST_URL
in the fallback chain.
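
A hedged sketch of that fallback: only `AGENTUITY_CATALYST_URL` and the default host `catalyst-usc.agentuity.cloud` come from the commit message. The function name, the `region` parameter, the `https://` scheme, and the position of the override in the chain are assumptions.

```typescript
// Hypothetical sketch of the fallback chain for the gateway URL.
function resolveCatalystUrl(
  env: Record<string, string | undefined>,
  region = "usc", // assumed default region, matching catalyst-usc.agentuity.cloud
): string {
  const override = env["AGENTUITY_CATALYST_URL"];
  if (override && override.length > 0) return override;
  return `https://catalyst-${region}.agentuity.cloud`; // scheme is an assumption
}

console.log(resolveCatalystUrl({}));
```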
Next.js requires @types/node to start with TypeScript.
The app had @types/bun but not @types/node.
Revert to main-only publishing. Branch publishing needs npm trusted
publisher config for PR contexts which isn't set up yet.
Add a new migration mode to @agentuity/migrate that converts v2 projects
(createApp/createAgent from @agentuity/runtime) to v3 framework-agnostic
Hono applications.

New files:
- detect-v3.ts: Detection engine for v2 patterns (createApp, createAgent,
  ctx.*/c.var.* service access, package versions, SPA, config files).
  Classifies agents as simple (handler+schema) or complex.
- migrate-v3.ts: Orchestrator — git check, detect, report, confirm,
  transform, install, typecheck, summary.
- transforms/v3/entry-point.ts: Generates src/index.ts with Hono app +
  agentuity() middleware (replaces app.ts + createApp).
- transforms/v3/agents.ts: Simple agents become plain exported async
  functions with schema preserved. Complex agents get migration comments.
- transforms/v3/services.ts: Generates shared src/services.ts with
  singleton service clients based on detected usage.
- transforms/v3/routes.ts: Rewrites c.var.*/ctx.* service access to
  direct imports from the services module.
- transforms/v3/package-json.ts: Removes @agentuity/runtime, adds hono +
  @agentuity/hono + individual service packages, bumps to ^3.0.0.

Modified:
- bin/migrate.ts: Auto-detects migration mode from @agentuity/runtime
  version (^1.x → v1→v2, ^2.x → v2→v3). Adds --v1-to-v2/--v2-to-v3 flags.
- index.ts: Exports v3 public API alongside existing v1→v2.
- report.ts: Adds v3-specific report rendering functions.
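
The auto-detection in bin/migrate.ts can be pictured with the sketch below. `detectMode` is a hypothetical name, and letting an explicit flag take precedence over the detected version is an assumption; the mapping of ^1.x and ^2.x to modes is from the commit message.

```typescript
// Hypothetical sketch of choosing a migration mode.
type MigrationMode = "v1-to-v2" | "v2-to-v3" | "unknown";
type ModeFlag = "--v1-to-v2" | "--v2-to-v3";

function detectMode(runtimeRange: string | undefined, flag?: ModeFlag): MigrationMode {
  // Assumed: an explicit flag overrides auto-detection.
  if (flag === "--v1-to-v2") return "v1-to-v2";
  if (flag === "--v2-to-v3") return "v2-to-v3";
  // Take the first run of digits as the major version, skipping ^, ~, or similar prefixes.
  const major = /^\D*(\d+)/.exec(runtimeRange ?? "")?.[1];
  if (major === "1") return "v1-to-v2";
  if (major === "2") return "v2-to-v3";
  return "unknown";
}

console.log(detectMode("^1.0.54"), detectMode("^2.x"), detectMode(undefined));
```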
Huijiro added 18 commits April 14, 2026 14:44
- Bump all packages to 3.0.0-alpha.1
- Add --tag flag to publish script for overriding npm dist-tag
- Add alpha dist-tag auto-detection from version string
- Make npm OTP optional for automation tokens
…irectories

- Add template overlay directories for all 7 frameworks:
  nextjs, nuxt, remix, sveltekit, astro, hono, vite-react
- Each template includes a translate API route + branded landing page
  using framework-native conventions and idiomatic data fetching:
  - Next.js: SWR (useSWRMutation)
  - Nuxt: built-in useFetch
  - React Router: server action + fetch
  - SvelteKit: form actions (+page.server.ts)
  - Astro: vanilla fetch (client script)
  - Hono: vanilla fetch (client script)
  - Vite+React: TanStack Query (useMutation)
- Fix Next.js app/ directory conflict (src/app/ instead of root app/)
- Add --tag flag and alpha dist-tag detection to publish script
- Make npm OTP optional for automation tokens
- Delete old inline frameworks-ai-examples.ts and frameworks-landing-pages.ts
- scaffold.ts now uses cpSync overlay instead of string-based file generation
Template files are raw source meant to be copied into user projects,
not compiled as part of the CLI package build.
- Keep v3's deleted vite dev server files (replaced by buildpack pipeline)
- Keep v3's dev/index.ts (new buildpack-based dev architecture)
- Take main's coder types .refine() validation
- Regenerate bun.lock
Previously -alpha. versions mapped to @next, but we publish under
the @alpha tag. This caused bun create agentuity@^3.0.0-alpha.0
to install @agentuity/cli@next (which pointed to 3.0.0-alpha.0)
instead of @agentuity/cli@alpha (3.0.0-alpha.2 with templates).
Instead of hardcoding alpha→next, simply forward the prerelease
tag directly. So -alpha.0 → @alpha, -beta.1 → @beta, -rc.2 → @rc.
Stable versions still use exact version numbers.
When the CLI itself is a prerelease (e.g. 3.0.0-alpha.2), scaffolded
projects now get @agentuity/cli@alpha instead of ^3.0.0, which can't
resolve from npm latest. Stable CLI versions still use ^N.0.0 ranges.
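
The tag-forwarding rule for scaffolded projects can be sketched as two small helpers. Both names are hypothetical, and the version parsing assumes plain `major.minor.patch-identifier.n` strings.

```typescript
// Hypothetical helpers illustrating the prerelease tag forwarding described above.
function prereleaseTag(version: string): string | undefined {
  // "3.0.0-alpha.2" yields "alpha"; stable versions have no prerelease part.
  const match = /^\d+\.\d+\.\d+-([A-Za-z]+)/.exec(version);
  return match ? match[1] : undefined;
}

function cliSpecifier(cliVersion: string): string {
  const tag = prereleaseTag(cliVersion);
  if (tag) return `@agentuity/cli@${tag}`; // prerelease CLI: use its dist-tag
  const major = cliVersion.split(".")[0];
  return `@agentuity/cli@^${major}.0.0`;   // stable CLI: caret range on the major
}

console.log(cliSpecifier("3.0.0-alpha.2")); // @agentuity/cli@alpha
console.log(cliSpecifier("3.0.0"));         // @agentuity/cli@^3.0.0
```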
- All packages published under @alpha npm dist-tag
- Fix: applyOverlay uses force:true to overwrite default files
- Fix: scaffolded projects use prerelease dist-tag for @agentuity deps
- Fix: create-agentuity derives CLI dist-tag from prerelease identifier
…packages

Remove the BetterAuth-based auth package, React hooks package, and
framework-agnostic frontend utilities package. These are v2-era packages
that don't fit v3's bring-your-own-framework architecture.

Removed packages:
- @agentuity/auth — BetterAuth config, server middleware, React
  components, Drizzle schema, CLI project auth commands, create flow
  auth setup
- @agentuity/react — AgentuityProvider, useAuth, useWebRTCCall,
  useAnalytics, server/client entrypoints
- @agentuity/frontend — WebSocket/EventStream/WebRTC managers, URL
  builders, reconnect utilities
- @agentuity/core services/auth — shared auth type definitions

Also removed from:
- CI publish workflows
- Root tsconfig references and test:packages script
- CLI package.json dependencies
- Core package.json subpath exports and services barrel
- Testing app and docs app dependencies

Note: OpenCode agent prompts and Claude Code skills still reference
these packages in documentation strings and will be updated in a
separate pass.
Dead files removed:
- core/src/webrtc.ts — WebRTC types with zero consumers after
  frontend/react removal
- cli/src/cmd/build/app-config-extractor.ts — v2 createApp() config
  extractor, never imported
- cli/src/utils/workbench-notify.ts — never imported
- cli/src/utils/bun-version-checker.ts — never imported
- cli/src/utils/dependency-checker.ts — never imported
- cli/src/utils/detectSubagent.ts — never imported
- cli/src/utils/stream-capture.ts — never imported
- cli/src/utils/string.ts — dead re-export of core utils
- cli/src/utils/version-mismatch.ts — never imported
- 5 test files importing from deleted source files

Dead exports/types removed:
- core/index.ts — webrtc type re-export, stale frontend comment
- cli/types.ts — BuildPhase, BuildContext, WorkbenchConfig,
  AnalyticsConfig, AgentuityConfig (v2 config types)
- cli/index.ts — re-exports of above
- drizzle/index.ts — drizzleAdapter re-export from better-auth
- build-report.ts — workbench-build phase
- deploy.ts — workbench-src/workbench/ directory filters

Unused dependencies removed (verified via depcheck):
- cli: @datasert/cronjs-parser, @vitejs/plugin-react, acorn-loose,
  adm-zip, astring, git-url-parse, tar-fs, vite, bun-plugin-tailwind,
  tailwindcss, @types/adm-zip, @types/tar-fs
- coder: @agentuity/server
- hono: @opentelemetry/api
- migrate: @agentuity/core
- drizzle: better-auth
Remove workbench-related code from core, CLI, and VSCode extension:

Core:
- workbench-config.ts — encode/decode/get workbench config utilities
- workbench.ts — re-export module
- ./workbench subpath export from package.json
- subpath-exports.test.ts — tested workbench subpath resolution
- workbench-config.test.ts — tested encode/decode roundtrip

VSCode:
- features/workbench/index.ts — workbench.open command registration
- agentuity.workbench.open command and menu entries from package.json
- getWorkbenchUrl() from urls.ts
- Open in Workbench button from chat participant
- Open in Workbench code lens from agentCodeLensProvider
- openInWorkbench command handler + dead helpers from codeLens/index.ts

CLI:
- Workbench references from onboarding prompt
- Workbench route filter from catchall-routing test
- workbenchPath from backend-proxy test

Note: workbench references in migrate/ kept intentionally (detects
v2 workbench config for migration). workbench references in docs app
and opencode/claude-code skills are part of the broader v3 docs
refresh (memory #102).
Removes the evals package, CLI commands, core services, docs, skills,
and examples — consistent with the workbench and auth removals in v3.

Deleted:
- packages/evals/ — entire package (preset evals, types, tests)
- packages/core/src/services/eval/ — cloud API client types
- packages/cli/src/cmd/cloud/eval/ — agentuity cloud eval commands
- packages/cli/src/cmd/cloud/eval-run/ — agentuity cloud eval-run commands
- apps/docs evals demo (agent, API route, run script, component, routes)
- examples/evals/ — example project
- Docs pages: evaluations.mdx (agents, sdk-reference, api)

Cleaned references from:
- tsconfig.json, release-next.yaml, bun.lock
- Docs: nav-data, code-examples, test-outputs, demo-config
- Skills: agentuity-backend, agentuity-cloud, agentuity-agents
- Opencode agents: expert.ts, expert-backend.ts
- CLI: onboarding prompt, session get command, cloud index
- AGENTS.md (root + docs), apps/docs/package.json

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
.github/workflows/release-next.yaml (1)

103-125: ⚠️ Potential issue | 🟡 Minor

Verify the PACKAGES list matches packages intended for npm publishing.

The tsconfig.json references 28 packages, but the PACKAGES array only lists 21. The following packages appear to be missing:

  • packages/adapter
  • packages/analytics
  • packages/telemetry
  • packages/hono
  • packages/local
  • packages/stream
  • packages/vscode

If these packages are intentionally not published to npm (e.g., internal-only, or published via a separate workflow), this is fine. Otherwise, they will be missing from the next release.
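
One way to catch this kind of drift is a small comparison between the two lists. The following is a minimal sketch, not part of the repository: `findUnpublished` and `normalize` are hypothetical helpers, and the sample paths are illustrative rather than the full set from tsconfig.json.

```typescript
// Hypothetical helper: report tsconfig project references that are absent
// from the release workflow's PACKAGES list.
function findUnpublished(references: string[], published: string[]): string[] {
	const publishedSet = new Set(published.map(normalize));
	return references.map(normalize).filter((ref) => !publishedSet.has(ref));
}

// Strip a leading "./" and trailing slashes so that "./packages/hono/"
// and "packages/hono" compare equal.
function normalize(path: string): string {
	return path.replace(/^\.\//, '').replace(/\/+$/, '');
}

// Illustrative inputs only; the real lists would be read from
// tsconfig.json and the workflow file.
const missing = findUnpublished(
	['./packages/core', './packages/hono', './packages/local'],
	['packages/core']
);
console.log(missing.join(', ')); // packages/hono, packages/local
```

A check like this could run in CI and fail when the result is non-empty, unless the package is on an explicit skip list.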

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/release-next.yaml around lines 103 - 125, The PACKAGES
array in the release workflow (PACKAGES) is missing several packages referenced
by tsconfig.json (tsconfig lists 28 but PACKAGES contains 21), so either add the
missing packages—packages/adapter, packages/analytics, packages/telemetry,
packages/hono, packages/local, packages/stream, packages/vscode—to the PACKAGES
array in the workflow or explicitly document/guard their intentional exclusion
(e.g., skip-publish flags or a separate publish workflow) so they aren’t
accidentally omitted from the next npm release; update the PACKAGES variable and
any related publish logic to include these package names or add checks/comments
explaining their exclusion.
🧹 Nitpick comments (1)
.github/workflows/release-next.yaml (1)

79-80: Redundant but harmless condition.

The workflow already only triggers on push to main (lines 3-6), so this condition will always be true when the workflow runs. However, it does provide defense-in-depth if the trigger configuration changes in the future.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/release-next.yaml around lines 79 - 80, Remove the
redundant workflow-level condition "if: github.event_name == 'push' &&
github.ref == 'refs/heads/main'" since the workflow trigger already restricts
execution to pushes on main; delete that line (the exact conditional string) to
simplify the workflow, or if you prefer defense-in-depth, keep it but add a
brief comment explaining it's intentionally duplicated for extra safety.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.github/workflows/release-next.yaml:
- Line 14: The workflow currently sets cancel-in-progress: true which can abort
in-flight releases and leave npm packages half-published; change the
cancel-in-progress setting to false (or remove the cancel-in-progress line
entirely) so releases are allowed to finish before new runs start, ensuring the
release job completes atomically.

In `@apps/create-agentuity/bin.js`:
- Around line 17-24: The JSDoc examples incorrectly state that no-version usage
maps to `@agentuity/cli`@latest; update the comment to show that for stable
(non-prerelease) pkg.version the function pins to the exact version string
(e.g., "bun create agentuity → `@agentuity/cli`@<pkg.version>"). Edit the comment
block around the examples to replace the "→ `@agentuity/cli`@latest" line with a
concrete example that uses pkg.version (reference pkg.version in the comment) so
docs match the actual behavior of the version-resolution logic.

In `@apps/docs/src/web/content/cookbook/patterns/llm-as-a-judge.mdx`:
- Around line 286-293: Guard reads of ctx.state.get('output') and
ctx.state.get('input') before using them: check that ctx.state.get('input') and
ctx.state.get('output') are present (and have the expected shapes) and handle
the missing case (return early, throw a clear error, or provide defaults) before
calling generateObject; reference the existing symbols in the snippet
(ctx.state.get('input'), ctx.state.get('output'), HelpfulnessJudgment,
generateObject) and ensure the code uses optional typing or null checks to avoid
accessing .question or .answer when those keys are absent.
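
The guard described in the last item could look roughly like the sketch below. It assumes only that the state object exposes a Map-like `get()`; `StateLike`, `hasStringField`, and `readJudgeState` are hypothetical names, and the `{ question }` / `{ answer }` shapes are taken from the casts in the reviewed snippet.

```typescript
// Hypothetical shapes, mirroring the casts in the reviewed snippet.
type JudgeInput = { question: string };
type JudgeOutput = { answer: string };

// Minimal stand-in for ctx.state: anything with a Map-like get().
interface StateLike {
	get(key: string): unknown;
}

// Type guard: true only when value is an object whose `key` is a string.
function hasStringField<K extends string>(value: unknown, key: K): value is Record<K, string> {
	return (
		typeof value === 'object' &&
		value !== null &&
		typeof (value as Record<string, unknown>)[key] === 'string'
	);
}

// Returns the validated pair, or undefined when either key is missing or malformed.
function readJudgeState(state: StateLike): { input: JudgeInput; output: JudgeOutput } | undefined {
	const input = state.get('input');
	const output = state.get('output');
	if (!hasStringField(input, 'question') || !hasStringField(output, 'answer')) {
		return undefined;
	}
	return { input, output };
}

// 'output' was never set, so the guard reports the pair as unavailable.
const state = new Map<string, unknown>([['input', { question: 'What is KV?' }]]);
console.log(readJudgeState(state) === undefined); // true
```

A handler would call the guard first and return early (or log and skip the judgment) on `undefined`, so `generateObject` is only reached with both values present.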

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 2a81babd-3357-4890-ba0b-6aada53d2d5e

📥 Commits

Reviewing files that changed from the base of the PR and between 73bfd51 and 6a12ae4.

📒 Files selected for processing (56)
  • .claude-plugin/marketplace.json
  • .github/workflows/package-smoke-test.yaml
  • .github/workflows/release-next.yaml
  • AGENTS.md
  • apps/create-agentuity/bin.js
  • apps/create-agentuity/bin.test.js
  • apps/create-agentuity/package.json
  • apps/docs/AGENTS.md
  • apps/docs/README.md
  • apps/docs/package.json
  • apps/docs/scripts/generate-api-reference.ts
  • apps/docs/scripts/generate-nav-data.ts
  • apps/docs/src/agent/evals/agent.ts
  • apps/docs/src/agent/evals/eval.ts
  • apps/docs/src/api/evals/route.ts
  • apps/docs/src/api/index.ts
  • apps/docs/src/api/sandbox/scripts.ts
  • apps/docs/src/run/evals.ts
  • apps/docs/src/web/code-examples.ts
  • apps/docs/src/web/components/EvalsDemo.tsx
  • apps/docs/src/web/components/docs/nav-data.ts
  • apps/docs/src/web/content/agents/ai-sdk-integration.mdx
  • apps/docs/src/web/content/agents/calling-other-agents.mdx
  • apps/docs/src/web/content/agents/evaluations.mdx
  • apps/docs/src/web/content/agents/events-lifecycle.mdx
  • apps/docs/src/web/content/agents/index.mdx
  • apps/docs/src/web/content/agents/state-management.mdx
  • apps/docs/src/web/content/agents/when-to-use.mdx
  • apps/docs/src/web/content/cookbook/integrations/claude-agent.mdx
  • apps/docs/src/web/content/cookbook/integrations/mastra.mdx
  • apps/docs/src/web/content/cookbook/patterns/llm-as-a-judge.mdx
  • apps/docs/src/web/content/cookbook/tutorials/rag-agent.mdx
  • apps/docs/src/web/content/cookbook/tutorials/understanding-agents.mdx
  • apps/docs/src/web/content/get-started/project-structure.mdx
  • apps/docs/src/web/content/get-started/what-is-agentuity.mdx
  • apps/docs/src/web/content/reference/api/evaluations.mdx
  • apps/docs/src/web/content/reference/api/index.mdx
  • apps/docs/src/web/content/reference/api/projects.mdx
  • apps/docs/src/web/content/reference/cli/claude-code-plugin.mdx
  • apps/docs/src/web/content/reference/cli/debugging.mdx
  • apps/docs/src/web/content/reference/cli/getting-started.mdx
  • apps/docs/src/web/content/reference/sdk-reference/agents.mdx
  • apps/docs/src/web/content/reference/sdk-reference/evaluations.mdx
  • apps/docs/src/web/content/reference/sdk-reference/events.mdx
  • apps/docs/src/web/content/reference/sdk-reference/index.mdx
  • apps/docs/src/web/content/services/storage/vector.mdx
  • apps/docs/src/web/demo-config.tsx
  • apps/docs/src/web/entry-server.tsx
  • apps/docs/src/web/frontend.tsx
  • apps/docs/src/web/routeTree.gen.ts
  • apps/docs/src/web/routes/_docs/agents/evaluations.tsx
  • apps/docs/src/web/routes/_docs/reference/api/evaluations.tsx
  • apps/docs/src/web/routes/_docs/reference/sdk-reference/evaluations.tsx
  • apps/docs/src/web/routes/explorer/evals.tsx
  • apps/docs/src/web/test-outputs.ts
  • apps/testing/nextjs-app/package.json
💤 Files with no reviewable changes (37)
  • apps/docs/src/web/content/cookbook/tutorials/rag-agent.mdx
  • apps/docs/src/web/content/agents/calling-other-agents.mdx
  • AGENTS.md
  • apps/docs/AGENTS.md
  • apps/docs/src/web/content/cookbook/integrations/claude-agent.mdx
  • apps/docs/scripts/generate-api-reference.ts
  • apps/docs/src/web/content/services/storage/vector.mdx
  • apps/docs/src/web/content/cookbook/tutorials/understanding-agents.mdx
  • apps/docs/src/web/content/cookbook/integrations/mastra.mdx
  • apps/docs/src/web/content/agents/events-lifecycle.mdx
  • apps/docs/src/web/content/get-started/what-is-agentuity.mdx
  • apps/docs/src/web/content/reference/api/index.mdx
  • apps/docs/src/web/content/reference/cli/debugging.mdx
  • apps/docs/src/web/content/reference/cli/getting-started.mdx
  • apps/docs/src/web/content/reference/sdk-reference/index.mdx
  • apps/docs/scripts/generate-nav-data.ts
  • apps/docs/src/web/content/reference/sdk-reference/agents.mdx
  • apps/docs/src/web/content/agents/ai-sdk-integration.mdx
  • apps/docs/src/web/routes/explorer/evals.tsx
  • apps/docs/src/web/test-outputs.ts
  • apps/docs/src/api/index.ts
  • apps/docs/src/web/routes/_docs/agents/evaluations.tsx
  • apps/docs/src/api/sandbox/scripts.ts
  • apps/docs/src/web/code-examples.ts
  • apps/docs/src/web/components/docs/nav-data.ts
  • apps/docs/src/web/content/reference/sdk-reference/evaluations.mdx
  • apps/docs/src/web/content/agents/evaluations.mdx
  • apps/docs/src/web/routes/_docs/reference/sdk-reference/evaluations.tsx
  • apps/docs/src/api/evals/route.ts
  • apps/docs/src/web/demo-config.tsx
  • apps/docs/src/agent/evals/eval.ts
  • apps/docs/src/run/evals.ts
  • apps/docs/src/web/content/reference/api/evaluations.mdx
  • apps/docs/src/agent/evals/agent.ts
  • apps/docs/src/web/routes/_docs/reference/api/evaluations.tsx
  • apps/docs/src/web/components/EvalsDemo.tsx
  • apps/docs/README.md
✅ Files skipped from review due to trivial changes (9)
  • apps/docs/src/web/content/get-started/project-structure.mdx
  • apps/docs/src/web/content/agents/state-management.mdx
  • apps/create-agentuity/package.json
  • apps/docs/src/web/content/reference/cli/claude-code-plugin.mdx
  • apps/docs/src/web/frontend.tsx
  • apps/docs/src/web/content/agents/when-to-use.mdx
  • .claude-plugin/marketplace.json
  • apps/docs/src/web/content/agents/index.mdx
  • apps/docs/src/web/content/reference/sdk-reference/events.mdx
📜 Review details
🧰 Additional context used
📓 Path-based instructions (6)
**/*.{ts,tsx,js,jsx}

📄 CodeRabbit inference engine (AGENTS.md)

Use Biome as the code formatter with configuration: tabs (width 3), single quotes, semicolons, lineWidth 100, trailingCommas es5

Use StructuredError from @agentuity/core for error handling

Files:

  • apps/docs/src/web/entry-server.tsx
  • apps/create-agentuity/bin.test.js
  • apps/create-agentuity/bin.js
**/*.{ts,tsx}

📄 CodeRabbit inference engine (AGENTS.md)

Use TypeScript in strict mode with ESNext target and bundler moduleResolution

Files:

  • apps/docs/src/web/entry-server.tsx
apps/docs/src/web/**/*.{ts,tsx}

📄 CodeRabbit inference engine (apps/docs/AGENTS.md)

React frontend code should be organized in src/web/ directory with proper component structure (App.tsx, frontend.tsx, hooks/, components/)

Files:

  • apps/docs/src/web/entry-server.tsx
apps/docs/src/web/**/*.tsx

📄 CodeRabbit inference engine (apps/docs/AGENTS.md)

Use Tailwind CSS for styling React components in the frontend

Use React 19 features and modern React patterns in frontend components

Files:

  • apps/docs/src/web/entry-server.tsx
apps/docs/**/package.json

📄 CodeRabbit inference engine (apps/docs/AGENTS.md)

Use workspace dependencies with workspace:* for local packages (@agentuity/runtime, @agentuity/react, @agentuity/schema, @agentuity/workbench, @agentuity/cli)

Files:

  • apps/docs/package.json
apps/docs/**/{package.json,agentuity.config.ts}

📄 CodeRabbit inference engine (apps/docs/AGENTS.md)

Reference the local CLI directly using bun ../../packages/cli/bin/cli.ts in scripts rather than npm packages

Files:

  • apps/docs/package.json
🧠 Learnings (3)
📓 Common learnings
Learnt from: CR
Repo: agentuity/sdk

Timestamp: 2026-04-17T23:39:42.969Z
Learning: Build the project using `bun run build` from root or individual packages
Learnt from: CR
Repo: agentuity/sdk

Timestamp: 2026-04-17T23:39:42.969Z
Learning: Run typecheck using `bun run typecheck`
Learnt from: CR
Repo: agentuity/sdk

Timestamp: 2026-04-17T23:39:42.969Z
Learning: Run linting using `bun run lint`
Learnt from: CR
Repo: agentuity/sdk

Timestamp: 2026-04-17T23:39:42.969Z
Learning: Format code using `bun run format`
Learnt from: CR
Repo: agentuity/sdk

Timestamp: 2026-04-17T23:39:42.969Z
Learning: Run tests using `bun run test`
Learnt from: CR
Repo: agentuity/sdk

Timestamp: 2026-04-17T23:39:42.969Z
Learning: Ensure all errors and warnings are zero before tests pass
Learnt from: CR
Repo: agentuity/sdk

Timestamp: 2026-04-17T23:39:42.969Z
Learning: Run format, lint, typecheck, build, and test verification before committing
Learnt from: CR
Repo: agentuity/sdk

Timestamp: 2026-04-17T23:39:42.969Z
Learning: Never commit directly to the main branch
Learnt from: CR
Repo: agentuity/sdk

Timestamp: 2026-04-17T23:39:42.969Z
Learning: Do not create documentation files unless explicitly asked
Learnt from: CR
Repo: agentuity/sdk

Timestamp: 2026-04-17T23:39:42.969Z
Learning: Ask for clarification before making major code changes if unsure
Learnt from: CR
Repo: agentuity/sdk

Timestamp: 2026-04-17T23:40:01.699Z
Learning: Build the SDK Explorer app using `bun run build` command
Learnt from: CR
Repo: agentuity/sdk

Timestamp: 2026-04-17T23:40:01.699Z
Learning: Start the development server using `bun run dev` command
Learnt from: CR
Repo: agentuity/sdk

Timestamp: 2026-04-17T23:40:01.699Z
Learning: Run TypeScript type checking using `bun run typecheck` command
Learnt from: CR
Repo: agentuity/sdk

Timestamp: 2026-04-17T23:40:01.699Z
Learning: Deploy the SDK Explorer app using `bun run deploy` command to deploy to Agentuity cloud
Learnt from: CR
Repo: agentuity/sdk

Timestamp: 2026-04-17T23:40:01.699Z
Learning: Generate and maintain script metadata using `bun run generate:scripts` command to track available demo scripts
Learnt from: CR
Repo: agentuity/sdk

Timestamp: 2026-04-17T23:40:01.699Z
Learning: Store scripts in `src/run/*.ts` and regenerate script metadata with `bun run generate:scripts` after adding new demo scripts
Learnt from: CR
Repo: agentuity/sdk

Timestamp: 2026-04-17T23:40:01.699Z
Learning: Integrate multiple AI SDK providers (OpenAI, Anthropic, Google, Groq) in agent implementations
Learnt from: CR
Repo: agentuity/sdk

Timestamp: 2026-04-17T23:40:01.699Z
Learning: Support multiple API route types: REST, streaming, SSE (Server-Sent Events), and WebSocket in the API routes
📚 Learning: 2026-01-09T16:26:51.893Z
Learnt from: jhaynie
Repo: agentuity/sdk PR: 523
File: templates/_base/src/web/frontend.tsx:13-35
Timestamp: 2026-01-09T16:26:51.893Z
Learning: In web frontend code, prefer using the built-in Error class for runtime errors. Do not throw or re-export StructuredError from agentuity/core in web app code. Replace instances of StructuredError with new Error or custom error types that extend Error; ensure error handling logic remains intact and that error messages are descriptive. This guideline applies to all web UI TypeScript/TSX files that run in the browser and import StructuredError from agentuity/core.

Applied to files:

  • apps/docs/src/web/entry-server.tsx
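
As an illustration of this learning, a browser-side error type can extend the built-in `Error` directly. This is a minimal sketch; `FetchFailedError` and its `status` field are hypothetical and do not come from the repository.

```typescript
// Hypothetical custom error for web UI code: extends the built-in Error
// instead of importing StructuredError from @agentuity/core.
class FetchFailedError extends Error {
	readonly status: number;

	constructor(message: string, status: number) {
		super(message);
		this.name = 'FetchFailedError';
		this.status = status;
	}
}

try {
	throw new FetchFailedError('Request failed', 503);
} catch (err) {
	// instanceof narrowing keeps the extra field type-safe.
	if (err instanceof FetchFailedError) {
		console.log(`${err.name}: ${err.message} (${err.status})`); // FetchFailedError: Request failed (503)
	}
}
```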
📚 Learning: 2026-02-21T02:05:57.982Z
Learnt from: jhaynie
Repo: agentuity/sdk PR: 1010
File: packages/drizzle/test/proxy.test.ts:594-603
Timestamp: 2026-02-21T02:05:57.982Z
Learning: Do not rely on StructuredError from agentuity/core in test files or simple error handling paths. In tests and straightforward error handling, use plain Error objects to represent failures, reserving StructuredError for more complex error scenarios in application logic.

Applied to files:

  • apps/create-agentuity/bin.test.js
🪛 actionlint (1.7.12)
.github/workflows/package-smoke-test.yaml

[error] 70-70: label "blacksmith-2vcpu-ubuntu-2404" is unknown. available labels are "windows-latest", "windows-latest-8-cores", "windows-2025", "windows-2025-vs2026", "windows-2022", "windows-11-arm", "ubuntu-slim", "ubuntu-latest", "ubuntu-latest-4-cores", "ubuntu-latest-8-cores", "ubuntu-latest-16-cores", "ubuntu-24.04", "ubuntu-24.04-arm", "ubuntu-22.04", "ubuntu-22.04-arm", "macos-latest", "macos-latest-xlarge", "macos-latest-large", "macos-26-intel", "macos-26-xlarge", "macos-26-large", "macos-26", "macos-15-intel", "macos-15-xlarge", "macos-15-large", "macos-15", "macos-14-xlarge", "macos-14-large", "macos-14", "self-hosted", "x64", "arm", "arm64", "linux", "macos", "windows". if it is a custom label for self-hosted runner, set list of labels in actionlint.yaml config file

(runner-label)


[error] 190-190: label "blacksmith-2vcpu-ubuntu-2404" is unknown. available labels are "windows-latest", "windows-latest-8-cores", "windows-2025", "windows-2025-vs2026", "windows-2022", "windows-11-arm", "ubuntu-slim", "ubuntu-latest", "ubuntu-latest-4-cores", "ubuntu-latest-8-cores", "ubuntu-latest-16-cores", "ubuntu-24.04", "ubuntu-24.04-arm", "ubuntu-22.04", "ubuntu-22.04-arm", "macos-latest", "macos-latest-xlarge", "macos-latest-large", "macos-26-intel", "macos-26-xlarge", "macos-26-large", "macos-26", "macos-15-intel", "macos-15-xlarge", "macos-15-large", "macos-15", "macos-14-xlarge", "macos-14-large", "macos-14", "self-hosted", "x64", "arm", "arm64", "linux", "macos", "windows". if it is a custom label for self-hosted runner, set list of labels in actionlint.yaml config file

(runner-label)

🔇 Additional comments (11)
apps/docs/src/web/content/reference/api/projects.mdx (1)

499-555: Looks good — response schema cleanup is consistent with the v3 API surface.

Removing eval-related fields from the List Agents response docs is coherent with the rest of this migration and keeps the reference aligned with current payloads.

apps/docs/package.json (1)

3-3: Version bump looks appropriate for this migration.

Line 3 is consistent with the v3 alpha rollout and the breaking architectural changes described in this PR.

apps/docs/src/web/entry-server.tsx (1)

27-27: SSR provider composition is now consistent with client render.

Line 27 aligns the server tree with apps/docs/src/web/frontend.tsx (ThemeProvider + app/router, no AgentuityProvider), which is the right direction for stable hydration.

.github/workflows/package-smoke-test.yaml (2)

69-95: LGTM! Well-structured consolidation of testing apps.

The job consolidates multiple test apps into a single workflow job, which reduces CI overhead. The bun test invocations are correct—context snippets confirm each app (standalone-backend, oauth, e2e-web, integration-suite) has proper test/*.test.ts files using bun:test.

The static analysis hint about blacksmith-2vcpu-ubuntu-2404 being an unknown runner label is a false positive—Blacksmith is a self-hosted CI runner provider, and custom labels aren't recognized by actionlint's default configuration.


189-219: LGTM! Framework demo test job properly restructured.

Good changes:

  1. Explicit bun run build step added (lines 206-207), which is necessary since --skip-build was removed from the script invocation
  2. Increased timeout to 30 minutes is appropriate for comprehensive framework testing
  3. Credentials setup follows the same pattern as other jobs

The static analysis hint about the runner label is the same false positive as noted above.

.github/workflows/release-next.yaml (2)

71-72: LGTM! Good practice to set NODE_ENV: test for unit tests.

This ensures test-specific configurations and behaviors are activated during test execution.


96-96: LGTM! Build command simplification is correct.

The bun run build command (which runs bunx tsc --build) uses TypeScript project references to build all packages in dependency order, achieving the same result as the previous bun run build:packages.

apps/create-agentuity/bin.test.js (1)

6-30: Nice coverage update for prerelease tag mapping.

The new assertions for alpha, rc, and canary match the updated specifier behavior and close earlier gaps well.

apps/create-agentuity/bin.js (1)

30-33: Prerelease tag extraction logic is clear and aligned with tests.

Returning the lowercase prerelease identifier here is consistent with the updated test matrix (alpha, beta, rc, canary, next).
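
For orientation, the mapping discussed in this thread (prerelease versions resolve to a dist-tag named after their identifier, stable versions pin to the exact version) can be sketched as follows. This is a hypothetical reimplementation for illustration only; the actual logic lives in apps/create-agentuity/bin.js and may differ, and `getPrereleaseTag` and `resolveCliSpecifier` are assumed names.

```typescript
// Hypothetical sketch; not the code from apps/create-agentuity/bin.js.
function getPrereleaseTag(version: string): string | undefined {
	// Capture the first alphabetic identifier after the hyphen,
	// e.g. "3.0.0-alpha.0" -> "alpha". Stable versions do not match.
	const match = /^\d+\.\d+\.\d+-([A-Za-z]+)/.exec(version);
	return match?.[1]?.toLowerCase();
}

function resolveCliSpecifier(version: string): string {
	// Prerelease: use the dist-tag. Stable: pin to the exact version.
	const tag = getPrereleaseTag(version);
	return `@agentuity/cli@${tag ?? version}`;
}

console.log(resolveCliSpecifier('3.0.0-alpha.0')); // @agentuity/cli@alpha
console.log(resolveCliSpecifier('2.0.2')); // @agentuity/cli@2.0.2
```

Under this sketch a stable version never resolves to `latest`, which is the behavior the JSDoc examples are expected to describe.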

apps/docs/src/web/content/cookbook/patterns/llm-as-a-judge.mdx (2)

21-33: Nice shift to event-driven timing guidance.

This update clearly separates inline vs background judgments and points readers to the right lifecycle hook without eval-specific coupling.


257-360: The background-monitoring section is directionally consistent with v3 messaging.

Using completed + ctx.waitUntil() and linking to Events & Lifecycle keeps this cookbook aligned with the modular/event-first architecture.

 concurrency:
   group: release-next-${{ github.ref_name }}
-  cancel-in-progress: false
+  cancel-in-progress: true

⚠️ Potential issue | 🟡 Minor

Note: cancel-in-progress: true may cancel in-flight releases.

With this setting, if a new commit is pushed to main while a release is in progress, the running release will be cancelled. This could leave npm in a partially-published state where some packages have the new version and others don't.

Consider whether this is acceptable for your release workflow. An alternative is to keep cancel-in-progress: false (or remove it, as false is the default) to ensure releases complete fully before the next one starts.

Comment on lines +17 to +24
* bun create agentuity@^3.0.0-alpha.0 → @agentuity/cli@alpha
* bun create agentuity@^2.0.0-beta.1 → @agentuity/cli@beta
* bun create agentuity@^2.0.0-rc.2 → @agentuity/cli@rc
* bun create agentuity → @agentuity/cli@latest
* bun create agentuity@2.0.2 → @agentuity/cli@2.0.2 (exact)
*
* Prerelease versions use their dist-tag instead:
* - Beta versions (-beta.) → @beta
* - Other prereleases (-alpha., -rc., etc.) → @next
* For stable versions (no prerelease), we use the exact version number
* so that `bun create agentuity@2.0.2` pins to that specific CLI version.

⚠️ Potential issue | 🟡 Minor

Fix JSDoc example mismatch for stable behavior.

Line 20 says no-version usage maps to @agentuity/cli@latest, but for stable pkg.version this function returns the exact version string. The example should reflect exact version pinning to avoid confusion.

Proposed doc fix
- *   bun create agentuity                  → `@agentuity/cli`@latest
+ *   bun create agentuity                  → `@agentuity/cli`@<resolved-version> (exact)
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
- * bun create agentuity@^3.0.0-alpha.0 → @agentuity/cli@alpha
- * bun create agentuity@^2.0.0-beta.1 → @agentuity/cli@beta
- * bun create agentuity@^2.0.0-rc.2 → @agentuity/cli@rc
- * bun create agentuity → @agentuity/cli@latest
- * bun create agentuity@2.0.2 → @agentuity/cli@2.0.2 (exact)
- *
- * Prerelease versions use their dist-tag instead:
- * - Beta versions (-beta.) → @beta
- * - Other prereleases (-alpha., -rc., etc.) → @next
- * For stable versions (no prerelease), we use the exact version number
- * so that `bun create agentuity@2.0.2` pins to that specific CLI version.
+ * bun create agentuity@^3.0.0-alpha.0 → `@agentuity/cli`@alpha
+ * bun create agentuity@^2.0.0-beta.1 → `@agentuity/cli`@beta
+ * bun create agentuity@^2.0.0-rc.2 → `@agentuity/cli`@rc
+ * bun create agentuity → `@agentuity/cli`@<resolved-version> (exact)
+ * bun create agentuity@2.0.2 → `@agentuity/cli`@2.0.2 (exact)
+ *
+ * For stable versions (no prerelease), we use the exact version number
+ * so that `bun create agentuity@2.0.2` pins to that specific CLI version.

Comment on lines +286 to 293
const output = ctx.state.get('output') as { answer: string };
const input = ctx.state.get('input') as { question: string };

const { object } = await generateObject({
model: groq('openai/gpt-oss-120b'),
schema: HelpfulnessJudgment,
- prompt: `Evaluate this response.
-
- Question: ${input.question}
- Response: ${output.answer}
-
- SCORE:
- - helpfulness: How useful is this response? (0 = useless, 1 = extremely helpful)
-
- CHECKS:
- - answersQuestion: Does it directly answer what was asked?
- - actionable: Can the user act on this information?`,
+ prompt: `Evaluate this response.\nQuestion: ${input.question}\nResponse: ${output.answer}\nScore helpfulness 0-1.`,
});

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Verify whether completed-event context reliably seeds `ctx.state` with `input` and `output`.
# Expected: either explicit state.set(...) of these keys in runtime internals,
# or type-level/event docs guaranteeing availability.

rg -n -C3 --type=ts "addEventListener\\('completed'"

rg -n -C3 --type=ts "state\\.get\\(['\"](input|output)['\"]\\)"

rg -n -C3 --type=ts "state\\.set\\(['\"](input|output)['\"]\\)"

rg -n -C3 --type=ts "interface .*EventContext|type .*EventContext|class .*Context"

Repository: agentuity/sdk

Length of output: 2089


🏁 Script executed:

# First, let's read the llm-as-a-judge.mdx file to see full context around lines 286-293
wc -l apps/docs/src/web/content/cookbook/patterns/llm-as-a-judge.mdx

Repository: agentuity/sdk

Length of output: 123


🏁 Script executed:

# Check the file context (read around the target lines with more context)
sed -n '250,320p' apps/docs/src/web/content/cookbook/patterns/llm-as-a-judge.mdx | cat -n

Repository: agentuity/sdk

Length of output: 2455


🏁 Script executed:

# Search for where input/output state keys are set in the codebase
rg -n "state\.set\(" --type=ts -A 2 | grep -E "(input|output)" -B 2 -A 2

Repository: agentuity/sdk

Length of output: 472


🏁 Script executed:

# Find type definitions for the completed event handler context
rg -n "CompletedEvent|completed.*event" --type=ts | head -20

Repository: agentuity/sdk

Length of output: 423


🏁 Script executed:

# Search for EventContext or similar type definitions
fd -e ts -e d.ts | xargs rg -l "EventContext|CompletedEventHandler" | head -10

Repository: agentuity/sdk

Length of output: 39


🏁 Script executed:

# Search for where input/output might be populated in state during agent lifecycle
fd -e ts -e d.ts | xargs rg -l "state\\.set" | head -20

Repository: agentuity/sdk

Length of output: 358


🏁 Script executed:

# Find type definitions for CompletedEvent or event context
fd -e ts -e d.ts | xargs rg "interface.*EventContext|type.*EventContext" | head -20

Repository: agentuity/sdk

Length of output: 39


🏁 Script executed:

# Search for runtime implementation files that handle the completed event
fd -e ts | xargs rg "addEventListener.*completed" -l | head -10

Repository: agentuity/sdk

Length of output: 148


🏁 Script executed:

# Look for where agent state is initialized or populated with input/output
fd -e ts | xargs rg "ctx\\.state|this\\.state" -B 3 -A 3 | grep -E "(input|output)" -B 3 -A 3 | head -30

Repository: agentuity/sdk

Length of output: 2003


🏁 Script executed:

# Check the main agent runtime file for state handling
fd -e ts -path "*/runtime/*" -name "*agent*" -o -path "*/runtime/*" -name "*context*" | head -20

Repository: agentuity/sdk

Length of output: 225


🏁 Script executed:

# Find the runtime package directory
find . -name "package.json" -exec grep -l "@agentuity/runtime\|agentuity.*runtime" {} \; | head -10

Repository: agentuity/sdk

Length of output: 451


🏁 Script executed:

# Look for the actual runtime source in packages
fd -type d -path "*/packages/*/src" | head -20

Repository: agentuity/sdk

Length of output: 227


🏁 Script executed:

# Search for where the handler is invoked and state is managed
rg -n "handler\(" --type=ts -B 5 -A 5 | head -40

Repository: agentuity/sdk

Length of output: 2485


🏁 Script executed:

# Look for type definitions related to agent context
rg -n "type.*Context\|interface.*Context" --type=ts | grep -i "event\|agent\|handler" | head -20

Repository: agentuity/sdk

Length of output: 39


🏁 Script executed:

# Check if input/output from schema are automatically put into state
rg -n "schema.*input\|schema.*output" --type=ts -A 3 -B 3 | head -40

Repository: agentuity/sdk

Length of output: 39


🏁 Script executed:

# List packages directory structure
ls -la packages/ | head -30

Repository: agentuity/sdk

Length of output: 1771


🏁 Script executed:

# Find files related to runtime module
find packages -name "*.ts" -o -name "*.d.ts" | xargs grep -l "addEventListener\|createAgent" | head -10

Repository: agentuity/sdk

Length of output: 478


🏁 Script executed:

# Look for event types and context definitions
rg -n "EventName|event.*handler|CompletedEvent" --type=ts | head -30

Repository: agentuity/sdk

Length of output: 1429


🏁 Script executed:

# Check the main index/types export of runtime if it exists
find packages -name "index.ts" -o -name "index.d.ts" | xargs rg "export.*Event\|export.*Context" 2>/dev/null | head -30

Repository: agentuity/sdk

Length of output: 39


🏁 Script executed:

# Search documentation about state in the docs
rg -n "ctx\.state\|state\.get\|state\.set" apps/docs/src --type=mdx | head -30

Repository: agentuity/sdk

Length of output: 84


🏁 Script executed:

# Look at runtime package structure
ls -la packages/runtime/

Repository: agentuity/sdk

Length of output: 411


🏁 Script executed:

# Check runtime source files
find packages/runtime -name "*.ts" -o -name "*.d.ts" | head -20

Repository: agentuity/sdk

Length of output: 86


🏁 Script executed:

# Search for EventContext, CompletedEvent, or similar types in runtime
find packages/runtime -name "*.ts" -o -name "*.d.ts" | xargs rg "type.*Event|interface.*Event|type.*Context" 2>/dev/null | head -40

Repository: agentuity/sdk

Length of output: 39


🏁 Script executed:

# Look at the main runtime entry point
cat packages/runtime/package.json

Repository: agentuity/sdk

Length of output: 991


🏁 Script executed:

# Search for addEventListener implementation and its type signature
find packages/runtime -name "*.ts" | xargs rg "addEventListener" -A 10 -B 2 | head -60

Repository: agentuity/sdk

Length of output: 39


🏁 Script executed:

# Check if there's documentation about state contract in events
find packages/runtime -name "*.ts" | xargs rg "state\.get\|state\.set" -B 3 -A 3 | head -50

Repository: agentuity/sdk

Length of output: 39


🏁 Script executed:

# Read the runtime index.ts file
cat packages/runtime/src/index.ts

Repository: agentuity/sdk

Length of output: 2188


🏁 Script executed:

# Check the full imports and context of the llm-as-a-judge.mdx file
head -50 apps/docs/src/web/content/cookbook/patterns/llm-as-a-judge.mdx

Repository: agentuity/sdk

Length of output: 2093


🏁 Script executed:

# Check if there are other createAgent implementations
rg "export.*createAgent" --type=ts | head -20

Repository: agentuity/sdk

Length of output: 955


🏁 Script executed:

# Verify the import in the mdx file more carefully
sed -n '1,30p' apps/docs/src/web/content/cookbook/patterns/llm-as-a-judge.mdx

Repository: agentuity/sdk

Length of output: 1282


🏁 Script executed:

# Look at the full code block context in the mdx file (lines ~280-310)
sed -n '280,310p' apps/docs/src/web/content/cookbook/patterns/llm-as-a-judge.mdx

Repository: agentuity/sdk

Length of output: 923


🏁 Script executed:

# Check if there's a different createAgent in the codebase (like from `@agentuity/sdk` or other packages)
rg "export function createAgent|export const createAgent" --type=ts | head -20

Repository: agentuity/sdk

Length of output: 218


🏁 Script executed:

# Check if there's an actual current createAgent implementation (not deprecated)
rg "function createAgent\|const createAgent" --type=ts | grep -v "test\|deprecated" | head -20

Repository: agentuity/sdk

Length of output: 39


🏁 Script executed:

# Look at the opencode agents to see what createAgent they're using
head -20 packages/opencode/src/agents/expert-backend.ts

Repository: agentuity/sdk

Length of output: 1017


🏁 Script executed:

# Check package.json of opencode to see what it imports
cat packages/opencode/package.json | grep -A 5 "dependencies\|import"

Repository: agentuity/sdk

Length of output: 189


🏁 Script executed:

# Look for where state is actually populated in agent lifecycle in real examples
rg "ctx\.state" packages/opencode/src/agents -B 5 -A 5

Repository: agentuity/sdk

Length of output: 1903


🏁 Script executed:

# Check if there are any type definitions for the agent context in the modern code
find packages -name "*.d.ts" | xargs rg "EventContext\|CompletedEvent" 2>/dev/null | head -20

Repository: agentuity/sdk

Length of output: 39


🏁 Script executed:

# Look at real agent examples to understand current patterns
cat packages/opencode/src/agents/expert-backend.ts | head -150

Repository: agentuity/sdk

Length of output: 5054


🏁 Script executed:

# Check if there's any documentation about state being populated with input/output automatically
rg "state.*input|state.*output" apps/docs --type=md -B 3 -A 3 | head -60

Repository: agentuity/sdk

Length of output: 944


🏁 Script executed:

# Look for where the runtime might auto-populate state in the completed event
find packages/core packages/server packages/adapter -name "*.ts" | xargs rg "state\.set.*input\|state\.set.*output" 2>/dev/null | head -20

Repository: agentuity/sdk

Length of output: 39


🏁 Script executed:

# Check the Events & Lifecycle documentation referenced in the mdx file
find apps/docs -name "*event*" -o -name "*lifecycle*" | head -20

Repository: agentuity/sdk

Length of output: 296


🏁 Script executed:

# Look at the full context of the completed event usage in the documented example
sed -n '270,330p' apps/docs/src/web/content/cookbook/patterns/llm-as-a-judge.mdx

Repository: agentuity/sdk

Length of output: 1585


🏁 Script executed:

# Check the Events & Lifecycle documentation for information about state contract
cat apps/docs/src/web/content/agents/events-lifecycle.mdx

Repository: agentuity/sdk

Length of output: 6886


🏁 Script executed:

# Check if the event object passed to addEventListener contains input/output
sed -n '1,100p' apps/docs/src/web/content/agents/events-lifecycle.mdx | grep -A 20 "addEventListener"

Repository: agentuity/sdk

Length of output: 2171


🏁 Script executed:

# Look for type definitions of the event parameter in addEventListener
rg "event\.input|event\.output|event\.result" apps/docs --type=mdx

Repository: agentuity/sdk

Length of output: 84


🏁 Script executed:

# Check if there's any place where output is explicitly set to state before the completed event
rg "state\.set.*output\|state\.set.*answer" apps/docs

Repository: agentuity/sdk

Length of output: 39


🏁 Script executed:

# Look at the doc_processing example which was shown earlier to see if it sets state
cat apps/docs/src/agent/doc_processing/agent.ts

Repository: agentuity/sdk

Length of output: 2398


🏁 Script executed:

# Check if the event object contains input/output properties
rg "event\.input|event\.output|interface.*Event|type.*Event" packages --type=ts | grep -i "input\|output" | head -20

Repository: agentuity/sdk

Length of output: 39


🏁 Script executed:

# Look for any pattern where input/output are explicitly set in state before the completed event
rg "state\.set.*input|state\.set.*output" apps/docs/src/agent --type=ts

Repository: agentuity/sdk

Length of output: 39


🏁 Script executed:

# Check if there's any middleware or wrapper that auto-populates state with input/output
rg "addEventListener.*completed" packages --type=ts -A 20 | head -80

Repository: agentuity/sdk

Length of output: 39


🏁 Script executed:

# Search for where the completed event is actually emitted to see what context it receives
rg "emit.*completed|dispatch.*completed" packages --type=ts -B 5 -A 10 | head -60

Repository: agentuity/sdk

Length of output: 39


🏁 Script executed:

# Final check: verify that the createAgent example in llm-as-a-judge never sets input/output in state before completed event
sed -n '200,295p' apps/docs/src/web/content/cookbook/patterns/llm-as-a-judge.mdx

Repository: agentuity/sdk

Length of output: 2918


🏁 Script executed:

# Also check if there's any documentation that says input/output are auto-populated in state
rg "input.*output.*state|state.*input.*output" apps/docs --type=md -B 2 -A 2 | head -40

Repository: agentuity/sdk

Length of output: 39


Guard ctx.state reads in the example against missing input/output.

The example reads input and output from ctx.state with no defensive checks. The handler returns its output directly, and nothing in the sample explicitly stores either value under these state keys before the completed event fires. Unless the runtime populates them automatically, the reads would likely yield undefined, and accessing input.question or output.answer would then throw at runtime.

Add optional typing and a guard check:

Proposed fix
 agent.addEventListener('completed', (event, agent, ctx) => {
   ctx.waitUntil(async () => {
-    const output = ctx.state.get('output') as { answer: string };
-    const input = ctx.state.get('input') as { question: string };
+    const output = ctx.state.get('output') as { answer: string } | undefined;
+    const input = ctx.state.get('input') as { question: string } | undefined;
+
+    if (!input?.question || !output?.answer) {
+      ctx.logger.warn('Skipping background quality check: missing input/output in state');
+      return;
+    }
 
     const { object } = await generateObject({
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
const output = ctx.state.get('output') as { answer: string };
const input = ctx.state.get('input') as { question: string };
const { object } = await generateObject({
model: groq('openai/gpt-oss-120b'),
schema: HelpfulnessJudgment,
prompt: `Evaluate this response.
Question: ${input.question}
Response: ${output.answer}
SCORE:
- helpfulness: How useful is this response? (0 = useless, 1 = extremely helpful)
CHECKS:
- answersQuestion: Does it directly answer what was asked?
- actionable: Can the user act on this information?`,
prompt: `Evaluate this response.\nQuestion: ${input.question}\nResponse: ${output.answer}\nScore helpfulness 0-1.`,
});
const output = ctx.state.get('output') as { answer: string } | undefined;
const input = ctx.state.get('input') as { question: string } | undefined;
if (!input?.question || !output?.answer) {
ctx.logger.warn('Skipping background quality check: missing input/output in state');
return;
}
const { object } = await generateObject({
model: groq('openai/gpt-oss-120b'),
schema: HelpfulnessJudgment,
prompt: `Evaluate this response.\nQuestion: ${input.question}\nResponse: ${output.answer}\nScore helpfulness 0-1.`,
});
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/docs/src/web/content/cookbook/patterns/llm-as-a-judge.mdx` around lines
286 - 293, Guard reads of ctx.state.get('output') and ctx.state.get('input')
before using them: check that ctx.state.get('input') and ctx.state.get('output')
are present (and have the expected shapes) and handle the missing case (return
early, throw a clear error, or provide defaults) before calling generateObject;
reference the existing symbols in the snippet (ctx.state.get('input'),
ctx.state.get('output'), HelpfulnessJudgment, generateObject) and ensure the
code uses optional typing or null checks to avoid accessing .question or .answer
when those keys are absent.
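
The "expected shapes" check described above can go one step further than the `| undefined` casts in the proposed fix by validating the values at runtime. The following is a minimal sketch, not the SDK's API: `StateLike`, `hasNonEmptyString`, and `readJudgePayload` are hypothetical names introduced for illustration, and the only assumption about `ctx.state` is that it exposes a `get(key)` method, as the snippet under review already uses.

```typescript
// Hypothetical stand-in for ctx.state: only a `get` method is assumed.
interface StateLike {
  get(key: string): unknown;
}

interface JudgeInput {
  question: string;
}

interface JudgeOutput {
  answer: string;
}

// Type predicate: true only when `value` is an object whose `key`
// property is a non-empty string.
function hasNonEmptyString<K extends string>(
  value: unknown,
  key: K,
): value is Record<K, string> {
  if (typeof value !== 'object' || value === null) return false;
  const field = (value as Record<string, unknown>)[key];
  return typeof field === 'string' && field.length > 0;
}

// Returns the validated pair, or undefined when either state key is
// missing or does not have the expected shape.
function readJudgePayload(
  state: StateLike,
): { input: JudgeInput; output: JudgeOutput } | undefined {
  const input = state.get('input');
  const output = state.get('output');
  if (!hasNonEmptyString(input, 'question') || !hasNonEmptyString(output, 'answer')) {
    return undefined;
  }
  return { input, output };
}
```

Inside the `completed` listener, the callback could call `readJudgePayload(ctx.state)`, log a warning and return early on `undefined`, and otherwise pass `input.question` and `output.answer` to `generateObject`. Compared with a cast, this also rejects values that are present but have the wrong shape.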
