refactor(harnesses): extract Harness interface + thread store/events refactor#3381
refactor(harnesses): extract Harness interface + thread store/events refactor#3381tlgimenes wants to merge 32 commits into
Conversation
…ecopilot/claude-code/codex Establishes a Harness abstraction with factory-based registry and localDispatch. Migrates decopilot, claude-code, and codex into the new harnesses layout (system-prompt, built-in-tools, agents/prompts/connections blocks, title-generator, tool/prompt assembly, streamText loop). stream-core now dispatches through the harness registry. Includes: - CLI harness fixes (UIMessage->ModelMessage conversion, usage tracking, thread_id emission) - Shared stream-error helpers - AI SDK system-in-messages warning suppression - Codex tier defaults and model list sync - Sandbox image corepack retry - Various dependency bumps (React, AI SDK, OpenRouter, Anthropic, MCP SDK) Squashed from 11 commits. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…cade Two coupled changes that together preserve the user's tool-approval-level selection through the synchronous accept/continuation cascade: * highlight/approval.tsx + highlight/index.tsx: pass the freshly-selected approval level explicitly when the yolo dropdown triggers Accept All. setPreferences hasn't propagated to React state (or, in mutation-timing edge cases, to localStorage) by the time the cascade fires this same tick, so the explicit argument carries the value the user just chose. * chat-context.tsx: trust requestMetadata.toolApprovalLevel when supplied; fall back to localStorage only for legacy callers. * use-thread-chat.ts: add continuation seeding for tool output / approval responses. When the backend resumes the same assistant message after addToolOutput / addToolApprovalResponse, the next sub-stream's start chunk carries the existing messageId — promote the local assistant into streamingStore so readUIMessageStream seeds from it instead of throwing "No tool invocation found" on the tool-output-available chunk. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…e + useThreadActions)
…egacy KEYS.tasks Migrate 4 useTaskActions callers to useThreadActions (absorbs the full low-level CRUD surface). Replace tasksPrefix cache invalidations with targeted prepends (for creates) or bridge-handles comments (for updates). Remove legacy tasks() and tasksPrefix() keys from query-keys.ts. Delete use-tasks.ts hook. Update test assertions to use threadsPrefix. Fix unused variable in use-task-expanded-tools. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…tStore The hook now owns only React-level concerns: - useRef for latest-handler closures - useState for the stable ThreadChatStore instance - one useEffect for connect/disconnect lifecycle - useSyncExternalStore for snapshot subscription - mergeWithServer in React because it depends on the initialMessages prop All imperative state — SSE persistent loop, sub-stream demuxer, SSE-finish backstop counter, continuation seeding, optimistic message buffer — moves into ThreadChatStore (apps/mesh/src/web/components/chat/hooks/thread-chat-store.ts) where it is unit-testable without a DOM. The single useEffect is justified for connect/disconnect lifecycle (the imperative connection cannot be modeled as derived state). Suppressed ban-use-effect via oxlint-disable-next-line. File shrinks from 843 to 88 lines. Public hook signature unchanged; chat-context.tsx consumer typechecks without modification. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two coupled fixes for the case where a thread is opened while a `requires_action` is already pending (no in-flight streaming, no local assistant — the message we need to patch lives only in initialMessages delivered by useTaskMessages). * ThreadChatStore now accepts an optional `initialMessagesRef` and reads it as a fallback in patchLastAssistant: if streaming is empty and local has no trailing assistant, promote a patched copy of the server snapshot's last assistant into local. The store remains prop-agnostic — initialMessagesRef is just a live read, not a coupling. * useThreadChat passes the ref and rewrites the render-time merge so a streaming message that shares an id with the server snapshot replaces the matching entry in place instead of duplicating it. The previous code appended streaming unconditionally, which produced "Encountered two children with the same key" once T13 made the continuation path reachable. Verified end-to-end in the browser: opening a thread with 3 pending approvals, clicking Accept All, the backend resumes the run, tool calls execute, and the final report renders with zero React warnings and zero console errors. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
🧪 BenchmarkShould we run the Virtual MCP strategy benchmark for this PR? React with 👍 to run the benchmark.
Benchmark will run on the next push after you react. |
Release OptionsSuggested: Patch ( React with an emoji to override the release type:
Current version:
|
There was a problem hiding this comment.
6 issues found across 102 files
Prompt for AI agents (unresolved issues)
Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.
<file name="apps/mesh/src/harnesses/decopilot/stream-error.ts">
<violation number="1" location="apps/mesh/src/harnesses/decopilot/stream-error.ts:29">
P2: When all sentences are filtered, sanitization falls back to the original message and leaks provider-specific content.</violation>
<violation number="2" location="apps/mesh/src/harnesses/decopilot/stream-error.ts:57">
P2: Non-credit errors are returned unsanitized, which can leak provider-specific details despite this helper’s sanitization contract.</violation>
</file>
<file name="apps/mesh/src/web/hooks/use-ensure-task.ts">
<violation number="1" location="apps/mesh/src/web/hooks/use-ensure-task.ts:102">
P2: Avoid unconditionally prepending `createdRow` to threads caches; add an `id` dedupe guard to prevent duplicate thread entries.</violation>
</file>
<file name="apps/mesh/src/web/components/chat/task/thread-events.tsx">
<violation number="1" location="apps/mesh/src/web/components/chat/task/thread-events.tsx:33">
P1: Avoid inserting synthetic rows for unknown thread IDs when patching all thread scopes; this can leak a thread into unrelated agent-scoped caches.</violation>
<violation number="2" location="apps/mesh/src/web/components/chat/task/thread-events.tsx:59">
P2: Re-sort the cached thread list after updating `updated_at`; otherwise recent threads can stay out of order.</violation>
</file>
<file name="apps/mesh/src/web/components/chat/task/thread-store.ts">
<violation number="1" location="apps/mesh/src/web/components/chat/task/thread-store.ts:45">
P2: Include `status` in the React Query key; otherwise open and archived thread lists collide in cache and can show stale/wrong data.</violation>
</file>
Note: This PR contains a large number of files. cubic only reviews up to 100 files per PR, so some files may not have been reviewed. cubic prioritizes the most important files to review.
On a pro plan you can use ultrareview for larger PRs.
Re-trigger cubic
| if (!data) return data; | ||
| const idx = data.items.findIndex((t) => t.id === patch.id); | ||
|
|
||
| if (idx === -1) { |
There was a problem hiding this comment.
P1: Avoid inserting synthetic rows for unknown thread IDs when patching all thread scopes; this can leak a thread into unrelated agent-scoped caches.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At apps/mesh/src/web/components/chat/task/thread-events.tsx, line 33:
<comment>Avoid inserting synthetic rows for unknown thread IDs when patching all thread scopes; this can leak a thread into unrelated agent-scoped caches.</comment>
<file context>
@@ -0,0 +1,87 @@
+ if (!data) return data;
+ const idx = data.items.findIndex((t) => t.id === patch.id);
+
+ if (idx === -1) {
+ // Thread not in this cache yet (created in another tab, or never
+ // loaded). Insert at the top with a minimal synthetic row; the next
</file context>
| const cleaned = sentences.filter( | ||
| (s) => !/https?:\/\//i.test(s) && !/openrouter/i.test(s), | ||
| ); | ||
| if (cleaned.length === 0) return message; |
There was a problem hiding this comment.
P2: When all sentences are filtered, sanitization falls back to the original message and leaks provider-specific content.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At apps/mesh/src/harnesses/decopilot/stream-error.ts, line 29:
<comment>When all sentences are filtered, sanitization falls back to the original message and leaks provider-specific content.</comment>
<file context>
@@ -0,0 +1,60 @@
+ const cleaned = sentences.filter(
+ (s) => !/https?:\/\//i.test(s) && !/openrouter/i.test(s),
+ );
+ if (cleaned.length === 0) return message;
+ const result = cleaned.join(". ").trim();
+ return result.endsWith(".") ? result : `${result}.`;
</file context>
| // without fragile string matching on provider-specific messages. | ||
| return `[CREDITS] ${stripProviderSpecificDetails(error.message)}`; | ||
| } | ||
| return error.message; |
There was a problem hiding this comment.
P2: Non-credit errors are returned unsanitized, which can leak provider-specific details despite this helper’s sanitization contract.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At apps/mesh/src/harnesses/decopilot/stream-error.ts, line 57:
<comment>Non-credit errors are returned unsanitized, which can leak provider-specific details despite this helper’s sanitization contract.</comment>
<file context>
@@ -0,0 +1,60 @@
+ // without fragile string matching on provider-specific messages.
+ return `[CREDITS] ${stripProviderSpecificDetails(error.message)}`;
+ }
+ return error.message;
+ }
+ return stringifyError(error);
</file context>
| if (!cached) return cached; | ||
| return { | ||
| ...cached, | ||
| items: [createdRow, ...cached.items], |
There was a problem hiding this comment.
P2: Avoid unconditionally prepending createdRow to threads caches; add an id dedupe guard to prevent duplicate thread entries.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At apps/mesh/src/web/hooks/use-ensure-task.ts, line 102:
<comment>Avoid unconditionally prepending `createdRow` to threads caches; add an `id` dedupe guard to prevent duplicate thread entries.</comment>
<file context>
@@ -83,20 +83,26 @@ export function useEnsureTask(id: string, virtualMcpId: string): State {
+ if (!cached) return cached;
+ return {
+ ...cached,
+ items: [createdRow, ...cached.items],
+ };
+ },
</file context>
| items[idx] = next; | ||
| return { ...data, items }; |
There was a problem hiding this comment.
P2: Re-sort the cached thread list after updating updated_at; otherwise recent threads can stay out of order.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At apps/mesh/src/web/components/chat/task/thread-events.tsx, line 59:
<comment>Re-sort the cached thread list after updating `updated_at`; otherwise recent threads can stay out of order.</comment>
<file context>
@@ -0,0 +1,87 @@
+ branch: "branch" in patch ? (patch.branch ?? null) : current.branch,
+ };
+ const items = [...data.items];
+ items[idx] = next;
+ return { ...data, items };
+}
</file context>
| items[idx] = next; | |
| return { ...data, items }; | |
| items[idx] = next; | |
| items.sort((a, b) => b.updated_at.localeCompare(a.updated_at)); | |
| return { ...data, items }; |
| }); | ||
|
|
||
| const { data, refetch } = useSuspenseQuery<TasksQueryData>({ | ||
| queryKey: KEYS.threads(locator, scope), |
There was a problem hiding this comment.
P2: Include status in the React Query key; otherwise open and archived thread lists collide in cache and can show stale/wrong data.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At apps/mesh/src/web/components/chat/task/thread-store.ts, line 45:
<comment>Include `status` in the React Query key; otherwise open and archived thread lists collide in cache and can show stale/wrong data.</comment>
<file context>
@@ -0,0 +1,86 @@
+ });
+
+ const { data, refetch } = useSuspenseQuery<TasksQueryData>({
+ queryKey: KEYS.threads(locator, scope),
+ queryFn: async () => {
+ if (!client) throw new Error("MCP client is not available");
</file context>
Summary
Re-opens the work from #3358 (accidentally merged then reverted) plus the follow-up thread-store and chat refactors.
Harnessabstraction with factory-based registry andlocalDispatch; migratesdecopilot,claude-code, andcodexinto the new harnesses layout (system-prompt, built-in-tools, agents/prompts/connections blocks, title-generator, tool/prompt assembly, streamText loop).stream-corenow dispatches through the harness registry.thread-filter,thread-store,thread-actions,useThreads,useThreadActions, andThreadEventsBridgeSSE patcher.useTaskManagercollapses to a thin facade;useTasksAutoRefresh/useTasks/updateTaskInCacheremoved.sendMessage, sub-stream demuxer with finish promotion, SSE finish backstop, continuation seeding onstart.messageIdmatch, tool output/approval handlers + auto-send, lifecycle (connect/disconnect/stop) with reconnect discard.useThreadChatcollapses to a thin wrapper.toolApprovalLevelthrough the approval continuation cascade; patch server-snapshot assistant when opening midrequires_action.Test plan
bun run checkbun run lintbun testrequires_actionand confirm the assistant snapshot patches correctlyThreadEventsBridge🤖 Generated with Claude Code
Summary by cubic
Introduces a shared
Harnessinterface and registry withlocalDispatch, movingdecopilot,claude-code, andcodexinto modular harnesses. Rebuilds thread/chat flow with a single thread store and SSE bridge, and fixes tool-approval continuation and mid-requires_actionopen cases.New Features
harnesses: factory-based registry,localDispatch(id, input, ctx), shared types and usage accumulator.decopilotharness: extracted system-prompt, built-in tools, prompt assembly, andstreamTextloop.claude-codeandcodexwrappers with UI→model message conversion, resume (Claude) and safe provider cleanup.language-modelfactory: builds models from provider +ModelInfo, auto-enables reasoning when supported.ThreadEventsBridgeSSE patcher anduseThreads+useThreadActionsfor fast, optimistic updates.ThreadChatStorewith snapshot/subscribe, optimisticsendMessage, sub-stream demux, continuation seeding, and lifecycle management.codex:gpt-5.5, revises tiers).Migration
useTasks/useTasksAutoRefreshwithuseThreadsandThreadEventsBridge(mounted in org shell).useThreadActionsfor thread CRUD; legacyupdateTaskInCacheand related hooks are removed.KEYS.threads(...)/KEYS.threadsPrefix(...); filter dimensions are client-side only.Written for commit 84239e5. Summary will update on new commits. Review in cubic