refactor(harnesses): extract Harness interface + thread store/events refactor by tlgimenes · Pull Request #3381 · decocms/studio

tlgimenes · 2026-05-15T15:32:08Z

Summary

Re-opens the work from #3358 (accidentally merged then reverted) plus the follow-up thread-store and chat refactors.

Harness interface — introduces a shared Harness abstraction with factory-based registry and localDispatch; migrates decopilot, claude-code, and codex into the new harnesses layout (system-prompt, built-in-tools, agents/prompts/connections blocks, title-generator, tool/prompt assembly, streamText loop). stream-core now dispatches through the harness registry.
Threads module — new thread-filter, thread-store, thread-actions, useThreads, useThreadActions, and ThreadEventsBridge SSE patcher. useTaskManager collapses to a thin facade; useTasksAutoRefresh/useTasks/updateTaskInCache removed.
ThreadChatStore — new client-side store: snapshot/subscribe, optimistic sendMessage, sub-stream demuxer with finish promotion, SSE finish backstop, continuation seeding on start.messageId match, tool output/approval handlers + auto-send, lifecycle (connect/disconnect/stop) with reconnect discard. useThreadChat collapses to a thin wrapper.
Chat fixes — thread toolApprovalLevel through the approval continuation cascade; patch server-snapshot assistant when opening mid requires_action.

Test plan

bun run check
bun run lint
bun test
Smoke: open a chat mid requires_action and confirm the assistant snapshot patches correctly
Smoke: tool-approval level persists through the accept/continuation cascade
Smoke: tasks panel + thread list still update via ThreadEventsBridge

🤖 Generated with Claude Code

Summary by cubic

Introduces a shared Harness interface and registry with localDispatch, moving decopilot, claude-code, and codex into modular harnesses. Rebuilds thread/chat flow with a single thread store and SSE bridge, and fixes tool-approval continuation and mid-requires_action open cases.

New Features
- harnesses: factory-based registry, localDispatch(id, input, ctx), shared types and usage accumulator.
- decopilot harness: extracted system-prompt, built-in tools, prompt assembly, and streamText loop.
- CLI harnesses: claude-code and codex wrappers with UI→model message conversion, resume (Claude) and safe provider cleanup.
- language-model factory: builds models from provider + ModelInfo, auto-enables reasoning when supported.
- Threads: ThreadEventsBridge SSE patcher and useThreads + useThreadActions for fast, optimistic updates.
- Chat: ThreadChatStore with snapshot/subscribe, optimistic sendMessage, sub-stream demux, continuation seeding, and lifecycle management.
- Models: Codex list updated (adds codex:gpt-5.5, revises tiers).
Migration
- Replace useTasks/useTasksAutoRefresh with useThreads and ThreadEventsBridge (mounted in org shell).
- Use useThreadActions for thread CRUD; legacy updateTaskInCache and related hooks are removed.
- Switch query keys to KEYS.threads(...)/KEYS.threadsPrefix(...); filter dimensions are client-side only.

^{Written for commit 84239e5. Summary will update on new commits. Review in cubic}

…ecopilot/claude-code/codex Establishes a Harness abstraction with factory-based registry and localDispatch. Migrates decopilot, claude-code, and codex into the new harnesses layout (system-prompt, built-in-tools, agents/prompts/connections blocks, title-generator, tool/prompt assembly, streamText loop). stream-core now dispatches through the harness registry. Includes: - CLI harness fixes (UIMessage->ModelMessage conversion, usage tracking, thread_id emission) - Shared stream-error helpers - AI SDK system-in-messages warning suppression - Codex tier defaults and model list sync - Sandbox image corepack retry - Various dependency bumps (React, AI SDK, OpenRouter, Anthropic, MCP SDK) Squashed from 11 commits. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

…lter tests

…Threads

…ridge)

…cade Two coupled changes that together preserve the user's tool-approval-level selection through the synchronous accept/continuation cascade: * highlight/approval.tsx + highlight/index.tsx: pass the freshly-selected approval level explicitly when the yolo dropdown triggers Accept All. setPreferences hasn't propagated to React state (or, in mutation-timing edge cases, to localStorage) by the time the cascade fires this same tick, so the explicit argument carries the value the user just chose. * chat-context.tsx: trust requestMetadata.toolApprovalLevel when supplied; fall back to localStorage only for legacy callers. * use-thread-chat.ts: add continuation seeding for tool output / approval responses. When the backend resumes the same assistant message after addToolOutput / addToolApprovalResponse, the next sub-stream's start chunk carries the existing messageId — promote the local assistant into streamingStore so readUIMessageStream seeds from it instead of throwing "No tool invocation found" on the tool-output-available chunk. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…e + useThreadActions)

…egacy KEYS.tasks Migrate 4 useTaskActions callers to useThreadActions (absorbs the full low-level CRUD surface). Replace tasksPrefix cache invalidations with targeted prepends (for creates) or bridge-handles comments (for updates). Remove legacy tasks() and tasksPrefix() keys from query-keys.ts. Delete use-tasks.ts hook. Update test assertions to use threadsPrefix. Fix unused variable in use-task-expanded-tools. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…atch

…k 13)

…onnect discard

…tStore The hook now owns only React-level concerns: - useRef for latest-handler closures - useState for the stable ThreadChatStore instance - one useEffect for connect/disconnect lifecycle - useSyncExternalStore for snapshot subscription - mergeWithServer in React because it depends on the initialMessages prop All imperative state — SSE persistent loop, sub-stream demuxer, SSE-finish backstop counter, continuation seeding, optimistic message buffer — moves into ThreadChatStore (apps/mesh/src/web/components/chat/hooks/thread-chat-store.ts) where it is unit-testable without a DOM. The single useEffect is justified for connect/disconnect lifecycle (the imperative connection cannot be modeled as derived state). Suppressed ban-use-effect via oxlint-disable-next-line. File shrinks from 843 to 88 lines. Public hook signature unchanged; chat-context.tsx consumer typechecks without modification. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Two coupled fixes for the case where a thread is opened while a `requires_action` is already pending (no in-flight streaming, no local assistant — the message we need to patch lives only in initialMessages delivered by useTaskMessages). * ThreadChatStore now accepts an optional `initialMessagesRef` and reads it as a fallback in patchLastAssistant: if streaming is empty and local has no trailing assistant, promote a patched copy of the server snapshot's last assistant into local. The store remains prop-agnostic — initialMessagesRef is just a live read, not a coupling. * useThreadChat passes the ref and rewrites the render-time merge so a streaming message that shares an id with the server snapshot replaces the matching entry in place instead of duplicating it. The previous code appended streaming unconditionally, which produced "Encountered two children with the same key" once T13 made the continuation path reachable. Verified end-to-end in the browser: opening a thread with 3 pending approvals, clicking Accept All, the backend resumes the run, tool calls execute, and the final report renders with zero React warnings and zero console errors. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

github-actions · 2026-05-15T15:32:30Z

🧪 Benchmark

Should we run the Virtual MCP strategy benchmark for this PR?

React with 👍 to run the benchmark.

Reaction	Action
👍	Run quick benchmark (10 & 128 tools)

Benchmark will run on the next push after you react.

github-actions · 2026-05-15T15:32:38Z

Release Options

Suggested: Patch (2.327.6) — based on refactor: prefix

React with an emoji to override the release type:

Reaction	Type	Next Version
👍	Prerelease	`2.327.6-alpha.1`
🎉	Patch	`2.327.6`
❤️	Minor	`2.328.0`
🚀	Major	`3.0.0`

Current version: 2.327.5

Note: If multiple reactions exist, the smallest bump wins. If no reactions, the suggested bump is used (default: patch).

cubic-dev-ai

6 issues found across 102 files

Prompt for AI agents (unresolved issues)


Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="apps/mesh/src/harnesses/decopilot/stream-error.ts">

<violation number="1" location="apps/mesh/src/harnesses/decopilot/stream-error.ts:29">
P2: When all sentences are filtered, sanitization falls back to the original message and leaks provider-specific content.</violation>

<violation number="2" location="apps/mesh/src/harnesses/decopilot/stream-error.ts:57">
P2: Non-credit errors are returned unsanitized, which can leak provider-specific details despite this helper’s sanitization contract.</violation>
</file>

<file name="apps/mesh/src/web/hooks/use-ensure-task.ts">

<violation number="1" location="apps/mesh/src/web/hooks/use-ensure-task.ts:102">
P2: Avoid unconditionally prepending `createdRow` to threads caches; add an `id` dedupe guard to prevent duplicate thread entries.</violation>
</file>

<file name="apps/mesh/src/web/components/chat/task/thread-events.tsx">

<violation number="1" location="apps/mesh/src/web/components/chat/task/thread-events.tsx:33">
P1: Avoid inserting synthetic rows for unknown thread IDs when patching all thread scopes; this can leak a thread into unrelated agent-scoped caches.</violation>

<violation number="2" location="apps/mesh/src/web/components/chat/task/thread-events.tsx:59">
P2: Re-sort the cached thread list after updating `updated_at`; otherwise recent threads can stay out of order.</violation>
</file>

<file name="apps/mesh/src/web/components/chat/task/thread-store.ts">

<violation number="1" location="apps/mesh/src/web/components/chat/task/thread-store.ts:45">
P2: Include `status` in the React Query key; otherwise open and archived thread lists collide in cache and can show stale/wrong data.</violation>
</file>

_{Note: This PR contains a large number of files. cubic only reviews up to 100 files per PR, so some files may not have been reviewed. cubic prioritizes the most important files to review.

On a pro plan you can use ultrareview for larger PRs.
Re-trigger cubic}

cubic-dev-ai · 2026-05-15T15:37:17Z

+  if (!data) return data;
+  const idx = data.items.findIndex((t) => t.id === patch.id);
+
+  if (idx === -1) {


P1: Avoid inserting synthetic rows for unknown thread IDs when patching all thread scopes; this can leak a thread into unrelated agent-scoped caches.

Prompt for AI agents

Check if this issue is valid — if so, understand the root cause and fix it. At apps/mesh/src/web/components/chat/task/thread-events.tsx, line 33: <comment>Avoid inserting synthetic rows for unknown thread IDs when patching all thread scopes; this can leak a thread into unrelated agent-scoped caches.</comment> <file context> @@ -0,0 +1,87 @@ + if (!data) return data; + const idx = data.items.findIndex((t) => t.id === patch.id); + + if (idx === -1) { + // Thread not in this cache yet (created in another tab, or never + // loaded). Insert at the top with a minimal synthetic row; the next </file context>

cubic-dev-ai · 2026-05-15T15:37:17Z

+  const cleaned = sentences.filter(
+    (s) => !/https?:\/\//i.test(s) && !/openrouter/i.test(s),
+  );
+  if (cleaned.length === 0) return message;


P2: When all sentences are filtered, sanitization falls back to the original message and leaks provider-specific content.

Prompt for AI agents

Check if this issue is valid — if so, understand the root cause and fix it. At apps/mesh/src/harnesses/decopilot/stream-error.ts, line 29: <comment>When all sentences are filtered, sanitization falls back to the original message and leaks provider-specific content.</comment> <file context> @@ -0,0 +1,60 @@ + const cleaned = sentences.filter( + (s) => !/https?:\/\//i.test(s) && !/openrouter/i.test(s), + ); + if (cleaned.length === 0) return message; + const result = cleaned.join(". ").trim(); + return result.endsWith(".") ? result : `${result}.`; </file context>

cubic-dev-ai · 2026-05-15T15:37:17Z

+      // without fragile string matching on provider-specific messages.
+      return `[CREDITS] ${stripProviderSpecificDetails(error.message)}`;
+    }
+    return error.message;


P2: Non-credit errors are returned unsanitized, which can leak provider-specific details despite this helper’s sanitization contract.

Prompt for AI agents

Check if this issue is valid — if so, understand the root cause and fix it. At apps/mesh/src/harnesses/decopilot/stream-error.ts, line 57: <comment>Non-credit errors are returned unsanitized, which can leak provider-specific details despite this helper’s sanitization contract.</comment> <file context> @@ -0,0 +1,60 @@ + // without fragile string matching on provider-specific messages. + return `[CREDITS] ${stripProviderSpecificDetails(error.message)}`; + } + return error.message; + } + return stringifyError(error); </file context>

cubic-dev-ai · 2026-05-15T15:37:17Z

+          if (!cached) return cached;
+          return {
+            ...cached,
+            items: [createdRow, ...cached.items],


P2: Avoid unconditionally prepending createdRow to threads caches; add an id dedupe guard to prevent duplicate thread entries.

Prompt for AI agents

Check if this issue is valid — if so, understand the root cause and fix it. At apps/mesh/src/web/hooks/use-ensure-task.ts, line 102: <comment>Avoid unconditionally prepending `createdRow` to threads caches; add an `id` dedupe guard to prevent duplicate thread entries.</comment> <file context> @@ -83,20 +83,26 @@ export function useEnsureTask(id: string, virtualMcpId: string): State { + if (!cached) return cached; + return { + ...cached, + items: [createdRow, ...cached.items], + }; + }, </file context>

cubic-dev-ai · 2026-05-15T15:37:17Z

+  items[idx] = next;
+  return { ...data, items };


P2: Re-sort the cached thread list after updating updated_at; otherwise recent threads can stay out of order.

Prompt for AI agents

Check if this issue is valid — if so, understand the root cause and fix it. At apps/mesh/src/web/components/chat/task/thread-events.tsx, line 59: <comment>Re-sort the cached thread list after updating `updated_at`; otherwise recent threads can stay out of order.</comment> <file context> @@ -0,0 +1,87 @@ + branch: "branch" in patch ? (patch.branch ?? null) : current.branch, + }; + const items = [...data.items]; + items[idx] = next; + return { ...data, items }; +} </file context>

Suggested change

items[idx] = next;

return { ...data, items };

items[idx] = next;

items.sort((a, b) => b.updated_at.localeCompare(a.updated_at));

return { ...data, items };

cubic-dev-ai · 2026-05-15T15:37:17Z

+  });
+
+  const { data, refetch } = useSuspenseQuery<TasksQueryData>({
+    queryKey: KEYS.threads(locator, scope),


P2: Include status in the React Query key; otherwise open and archived thread lists collide in cache and can show stale/wrong data.

Prompt for AI agents

Check if this issue is valid — if so, understand the root cause and fix it. At apps/mesh/src/web/components/chat/task/thread-store.ts, line 45: <comment>Include `status` in the React Query key; otherwise open and archived thread lists collide in cache and can show stale/wrong data.</comment> <file context> @@ -0,0 +1,86 @@ + }); + + const { data, refetch } = useSuspenseQuery<TasksQueryData>({ + queryKey: KEYS.threads(locator, scope), + queryFn: async () => { + if (!client) throw new Error("MCP client is not available"); </file context>

tlgimenes and others added 30 commits May 15, 2026 12:31

feat(threads): add pure thread-filter module

7d66869

refactor(threads): tighten thread-filter guards to undefined-checks

fc921f9

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

feat(threads): add KEYS.threads/threadsPrefix

daf3a63

fix(threads): add non-null assertions for strict indexed-access in fi…

da96df4

…lter tests

feat(threads): add useThreads base hook

ba242ed

feat(threads): add ThreadEventsBridge SSE patcher

20b27e6

feat(threads): mount ThreadEventsBridge in org shell

f7909e2

feat(threads): add useThreadActions with optimistic patches

3df10a7

refactor(threads): migrate panel-state to useThreads

4665e4a

refactor(threads): collapse tasks-panel to single useThreads + filter…

0473595

…Threads

refactor(threads): make useTaskManager a thin facade over thread module

cd75258

chore(threads): delete useTasksAutoRefresh (replaced by ThreadEventsB…

44958f5

…ridge)

refactor(threads): strip thread-list invalidation from useStreamManager

216e71c

chore(threads): drop updateTaskInCache (replaced by ThreadEventsBridg…

9a55546

…e + useThreadActions)

feat(chat): scaffold ThreadChatStore with snapshot/subscribe

5d434b9

test(threads): assert cache-key contract

35bdde9

fix(chat): silence unused-field TS errors in ThreadChatStore scaffold

e772649

feat(chat): ThreadChatStore.sendMessage with optimistic append + POST

8cbc4da

feat(chat): ThreadChatStore.setMessages + mergeWithServer helper

147abb3

docs(threads): document public API and cache contract

53f7359

feat(chat): ThreadChatStore chunk fan-out (onData, onToolCall)

f0d5e35

feat(chat): ThreadChatStore sub-stream demuxer with finish promotion

a2a62f4

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

feat(chat): ThreadChatStore SSE finish backstop with counter

b7e6565

feat(chat): ThreadChatStore continuation seeding on start.messageId m…

f1143e3

…atch

feat(chat): ThreadChatStore tool output/approval handlers + auto-send

4f13faa

fix(threads): migrate chat-context to useThreadActions (missed in Tas…

0c2ae5d

…k 13)

feat(chat): ThreadChatStore lifecycle (connect/disconnect/stop) + rec…

064b812

…onnect discard

tlgimenes and others added 2 commits May 15, 2026 12:31

cubic-dev-ai Bot reviewed May 15, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

refactor(harnesses): extract Harness interface + thread store/events refactor#3381

refactor(harnesses): extract Harness interface + thread store/events refactor#3381
tlgimenes wants to merge 32 commits into
mainfrom
tlgimenes/harness-shared-interface

tlgimenes commented May 15, 2026 •

edited by cubic-dev-ai Bot

Loading

Uh oh!

github-actions Bot commented May 15, 2026

Uh oh!

github-actions Bot commented May 15, 2026

Uh oh!

cubic-dev-ai Bot left a comment

Uh oh!

cubic-dev-ai Bot May 15, 2026

Uh oh!

cubic-dev-ai Bot May 15, 2026

Uh oh!

cubic-dev-ai Bot May 15, 2026

Uh oh!

cubic-dev-ai Bot May 15, 2026

Uh oh!

cubic-dev-ai Bot May 15, 2026

Uh oh!

cubic-dev-ai Bot May 15, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

tlgimenes commented May 15, 2026 • edited by cubic-dev-ai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Test plan

Summary by cubic

Uh oh!

github-actions Bot commented May 15, 2026

🧪 Benchmark

Uh oh!

github-actions Bot commented May 15, 2026

Release Options

Uh oh!

cubic-dev-ai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

cubic-dev-ai Bot May 15, 2026

Choose a reason for hiding this comment

Uh oh!

cubic-dev-ai Bot May 15, 2026

Choose a reason for hiding this comment

Uh oh!

cubic-dev-ai Bot May 15, 2026

Choose a reason for hiding this comment

Uh oh!

cubic-dev-ai Bot May 15, 2026

Choose a reason for hiding this comment

Uh oh!

cubic-dev-ai Bot May 15, 2026

Choose a reason for hiding this comment

Uh oh!

cubic-dev-ai Bot May 15, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

tlgimenes commented May 15, 2026 •

edited by cubic-dev-ai Bot

Loading