fix(tui): resolve checkpoint mismatch and optimize rendering performance by EngineerProjects · Pull Request #34 · EngineerProjects/nexus-engine

EngineerProjects · 2026-06-08T07:32:43Z

Resolves the checkpoint mismatch issue on session load/delete and optimizes the TUI rendering performance to eliminate keystroke and scrolling latency.

Header now shows ● execute (muted) / ◈ plan (orange) / ◎ pair (lighter orange) via ExecutionMode() instead of the binary PlanMode() check. AddToolProgress intercepts enter/exit_pair_programming_mode alongside the existing plan mode tools: it updates the pairDepth counter and suppresses the corresponding tool rows.

…anel The engine's FormatTextWithLineNumbers embeds "File:/Lines:" header and N→ line-number prefixes in the content metadata field. Glamour cannot parse that as markdown, producing raw symbol noise in the detail sidebar. parseReadContent strips the header block and N→ prefixes, returning the clean file body plus the 0-based start line. detailBody and inlinePreview now use the clean body; code files pass startLine as offset to renderCodeBody so line numbers still reflect actual file positions.

Deleting the active session from the sessions panel now immediately:
- clears m.activeSession and resets to the welcome screen
- clears the chat (messages, tool selections, planDepth, pairDepth)
- clears lastTurnErr / lastErr / busy so no stale error leaks into new sessions

Clear() also resets planDepth and pairDepth so mode badges start clean on every session switch.

Previously, loading a previous session showed only "Resumed session" with an empty chat. Now the full conversation is replayed from the stored transcript: user messages, assistant text, thinking blocks, and completed tool rows (with their input/result metadata for the detail sidebar). The conversion happens in buildSessionHistory (cmd/cli/tui.go), which pairs each ToolUseContent with its matching ToolResultContent, then sends the result as []HistoryEntry in SessionLoadedMsg. Also exports sdk.ToolResultContent from pkg/sdk/types.go.

buildSessionHistory now reads ToolResultContent.Metadata (already written by buildToolResultMessages in the engine) which carries the complete TUI metadata map: content, execution_duration_ms, lines_added, lines_removed, exit_code, cwd, type, url, title, provider, result_count, etc. The replay loop in model.go copies this map and injects tool_input so all detail panel renderers (file content, diff, bash output, web results) work exactly as during the live session. HistoryTool.Result removed in favour of HistoryTool.Metadata; fallback to {content: rawString} for old sessions that predate this change.
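A rough sketch of the pairing and metadata fallback described in the two commits above. The types here (ToolUse, ToolResult, HistoryTool) are simplified stand-ins invented for the example, not the SDK's real definitions.

```go
package main

import "fmt"

// Hypothetical, simplified stand-ins for the SDK content types.
type ToolUse struct {
	ID, Name string
	Input    map[string]any
}

type ToolResult struct {
	ToolUseID string
	Content   string
	Metadata  map[string]any // empty for sessions that predate the change
}

type HistoryTool struct {
	Name     string
	Metadata map[string]any
}

// pairTools matches each tool use with its result by ID and builds the
// metadata map the detail panel would consume.
func pairTools(uses []ToolUse, results []ToolResult) []HistoryTool {
	byID := make(map[string]ToolResult, len(results))
	for _, r := range results {
		byID[r.ToolUseID] = r
	}

	var out []HistoryTool
	for _, u := range uses {
		r, ok := byID[u.ID]
		if !ok {
			continue // only completed tool rows are replayed
		}
		// Copy the stored map so the replay never mutates the transcript.
		meta := make(map[string]any, len(r.Metadata)+2)
		for k, v := range r.Metadata {
			meta[k] = v
		}
		if len(r.Metadata) == 0 {
			meta["content"] = r.Content // fallback for old sessions
		}
		meta["tool_input"] = u.Input
		out = append(out, HistoryTool{Name: u.Name, Metadata: meta})
	}
	return out
}

func main() {
	uses := []ToolUse{{ID: "a", Name: "bash"}, {ID: "b", Name: "read_file"}}
	results := []ToolResult{
		{ToolUseID: "a", Metadata: map[string]any{"exit_code": 0}},
		{ToolUseID: "b", Content: "raw text"},
	}
	for _, h := range pairTools(uses, results) {
		fmt.Println(h.Name, h.Metadata)
	}
}
```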

Migration 20260607_007_session_files creates session_files with:
(session_id, file_path, operation, timestamp_unix, lines_added, lines_removed)

Two indexes: by session (for fast per-session lookup) and by path (for cross-session "who touched this file?" queries).

Live recording: onProgress writes to session_files whenever write_file, edit_file, or apply_patch completes during an active session.

Backfill: when LoadSession runs, if no session_files rows exist for that session, the transcript is scanned and rows are inserted retroactively, covering all sessions created before this change.

operation values: "create" | "update" (write_file), "edit" (edit_file), "patch" (apply_patch). file_path and line counters come from the ToolResultContent.Metadata already stored in the transcript.

…tadata

Without tool_use_id, retrieving a diff from session_files required scanning the entire transcript JSON. With tool_use_id stored, the lookup follows a direct chain:

session_files.tool_use_id → session_transcript_entries.entry_json → ToolResultContent.Metadata["structured_patch" | "git_diff" | "content"]

The migration was updated before it had ever run (the table did not exist in the live DB), so no ALTER TABLE is needed. A dedicated index on tool_use_id is added for direct lookup by tool call.

…SW RAG backend

DB / SQLite:
- Add perf pragmas: 20 MB page cache, 128 MB mmap, WAL temp in RAM, autocheckpoint
- Run PRAGMA optimize on Close for query-planner housekeeping
- Fix UpsertSessionFile to INSERT OR IGNORE + unique partial index on tool_use_id
- Fix HasSessionFileEntry and HasNamespace to use SELECT EXISTS instead of COUNT(*)
- Fix DeleteSession to rely solely on FK CASCADE (single DELETE FROM session_metadata)
- Fix GetTeamAgents to use raw SQL SELECT DISTINCT instead of full-struct GORM scan
- Add migration 008: dedup session_files + mailbox unread/history partial indexes
- Add migration 009: FTS5 virtual table session_transcript_fts with insert/delete triggers that stay in sync with CASCADE deletes from session_metadata

Vector / RAG:
- Add BackendHNSW: pure-Go HNSW store (github.com/coder/hnsw), no CGO, no external service
- Per-namespace persistence: <slug>.hnsw (graph) + <slug>.meta.json (text + metadata)
- O(log n) ANN search vs previous O(n) brute-force; scores normalized to cosine similarity
- Hybrid keyword blend when HybridWeight > 0 + QueryText set
- Wire HNSW backend into CLI via buildRAGService; activates only when RAG_EMBEDDING_URL + RAG_EMBEDDING_MODEL env vars are present
- Add HNSWDataDir helper to runtimepath
- Add tests: upsert/search/persistence/delete and hybrid keyword ranking
- Add complete database schema doc (docs/database-schema.md)

- Replace mouse-click hint with ctrl+t keyboard hint in thinking block footer
- Replace HandleMouseDown/Up tool detail zone click with HasSelectedTool + ToggleDetails (tool detail pane is now keyboard-driven, not mouse-zone-click-driven)
- Update golden snapshots to match new rendering
- Update TUI roadmap: mark config isolation, credentials DB, clipboard paste as done; expand in-progress and upcoming sections

C1: Session leak in task manager. Add committed bool + defer pattern; session.Close() is called if RegisterTools fails before the goroutine takes ownership.

C2: HNSW partial write undetected. Replace two separate error checks with errors.Join so both saveErr and metaErr are always surfaced to the caller.

C3: FTS5 migration errors silently ignored. Replace `_ = err` with log.Printf in both migrateSQLiteVectorFTS5 and migrateSQLiteTranscriptFTS5; startup no longer fails but the operator sees a warning when hybrid search degrades to LIKE scan.

C4: JSON metadata unmarshal silently ignored. Replace `_ = json.Unmarshal(...)` with explicit error logging in hnsw_store.go and sqlite_store.go; corrupted metadata is visible in logs instead of silently returning nil.

C5: context.Background() hardcoded in sqlite_backend. Add dbCtx() helper returning context.WithTimeout; DeleteSession uses 10s timeout, AppendTranscriptEntries and ReplaceTranscript use 30s. Full ctx propagation on the Backend interface is tracked as L-A.

M1: Embedding dimension never validated. Both embedOpenAI and embedOllama now check that every returned vector is non-empty and that all vectors in a batch share the same dimension.

Also fix: HNSW hybrid search result order. hnswBlendKeyword was modifying scores but not re-sorting, causing keyword-boosted records to be returned out of order. Add sort.Slice descending by score after blending. Caught by TestHNSWStore_HybridKeywordBlend.

Add docs/audit/codebase-audit-2026-06.md with full audit findings split into NOW (fixed) and LATER (community issues L-A through L-M).
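Three of the fixes above (C1, C2, and the hybrid re-sort) are standard Go patterns and can be sketched in isolation. Every type and function name below is hypothetical, and the additive keyword boost is an assumption; the actual blend formula is not given in the commit message.

```go
package main

import (
	"errors"
	"fmt"
	"sort"
	"strings"
)

// Example errors used to exercise the sketches.
var (
	errGraph = errors.New("graph write failed")
	errMeta  = errors.New("meta write failed")
)

type session struct{ closed bool }

func (s *session) Close() { s.closed = true }

// startTask sketches C1: the deferred cleanup closes the session unless
// ownership was handed over (committed == true).
func startTask(s *session, register func() error) error {
	committed := false
	defer func() {
		if !committed {
			s.Close()
		}
	}()
	if err := register(); err != nil {
		return err
	}
	committed = true // from here on a worker goroutine would own the session
	return nil
}

// persist sketches C2: both writes always run and errors.Join reports
// every failure; it returns nil only when both succeed.
func persist(saveGraph, saveMeta func() error) error {
	return errors.Join(saveGraph(), saveMeta())
}

type hit struct {
	Text  string
	Score float64
}

// blendKeyword boosts hits containing the query, then re-sorts descending
// so boosted records are not returned out of order.
func blendKeyword(hits []hit, query string, weight float64) {
	q := strings.ToLower(query)
	for i := range hits {
		if strings.Contains(strings.ToLower(hits[i].Text), q) {
			hits[i].Score += weight
		}
	}
	sort.Slice(hits, func(a, b int) bool { return hits[a].Score > hits[b].Score })
}

func main() {
	s := &session{}
	_ = startTask(s, func() error { return errMeta })
	fmt.Println("closed after failed register:", s.closed)

	fail := func(e error) func() error { return func() error { return e } }
	fmt.Println(persist(fail(errGraph), fail(errMeta)))

	hits := []hit{{"vector index", 0.9}, {"session checkpoint", 0.8}}
	blendKeyword(hits, "checkpoint", 0.3)
	fmt.Println(hits[0].Text)
}
```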

…eld aliases

Three root causes identified from runtime observation (agents completing in 20-52ms, never making LLM calls):

1. Missing InputSchema on AgentTool.Definition(): The 'agent' tool had no JSON Schema, forcing the LLM to guess field names from description text. With spawn_agent (which has a full schema) registered in the same registry, the LLM was cross-contaminating field names. Fix: add InputSchema with type enum, task, maxTurns, run_in_background, fork, isolation, and tools properties, matching the Description contract exactly.

2. 'agent_type' alias not handled: Call() only read parsedInput["type"], not parsedInput["agent_type"]. The LLM used "agent_type" (spawn_agent convention), causing agentType=="" → fast return "type is required" every time. Fix: accept "agent_type" as fallback alias for "type".

3. Error message for missing type hid valid values: "Error: type is required" gave the LLM nothing to self-correct with. Fix: include the full list of available agent types in the error response, matching the existing behavior for "unknown agent type".

Also improve wait_agent error hint when the agent_id looks like a tool_use_id (UUID format), helping the LLM distinguish spawn_agent IDs from tool_use_ids.
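Fixes 2 and 3 above amount to a small input-resolution step. A minimal sketch, assuming a hypothetical resolveAgentType helper and invented agent type names ("explore", "plan"):

```go
package main

import (
	"fmt"
	"strings"
)

// resolveAgentType reads "type", falls back to the "agent_type" alias, and
// lists the valid values in every error so the caller can self-correct.
func resolveAgentType(input map[string]any, available []string) (string, error) {
	t, _ := input["type"].(string)
	if t == "" {
		t, _ = input["agent_type"].(string) // alias from the spawn_agent convention
	}
	list := strings.Join(available, ", ")
	if t == "" {
		return "", fmt.Errorf("type is required; available agent types: %s", list)
	}
	for _, a := range available {
		if a == t {
			return t, nil
		}
	}
	return "", fmt.Errorf("unknown agent type %q; available agent types: %s", t, list)
}

func main() {
	types := []string{"explore", "plan"}
	fmt.Println(resolveAgentType(map[string]any{"agent_type": "explore"}, types))
	fmt.Println(resolveAgentType(map[string]any{}, types))
}
```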

Redirect TUI stdlib log output to ~/.config/nexus-cli/logs/cli.log so errors are observable in TUI mode instead of silently discarded. Strip orphaned tool_result blocks before sending to OpenAI-compat APIs to prevent invalid_request_message_order errors (z-ai/GLM-4.5 etc.) when parallel agent failures leave tool_results without a matching assistant tool_call. Sanitizer is a no-op when no assistant tool_calls exist in the conversation, preserving valid single-turn tool results.
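The sanitizer's behavior can be approximated as below. The Message shape is a simplified assumption, and this sketch matches IDs across the whole conversation rather than checking that the tool_call precedes its result, which the real sanitizer may do.

```go
package main

import "fmt"

// Message is a simplified, hypothetical view of an OpenAI-compatible message.
type Message struct {
	Role        string   // "user", "assistant", or "tool"
	ToolCallIDs []string // assistant only: IDs of the tool calls it issued
	ToolCallID  string   // tool only: the call this result answers
	Content     string
}

// stripOrphanedToolResults drops tool results that have no matching assistant
// tool_call. It is a no-op when the conversation has no tool_calls at all.
func stripOrphanedToolResults(msgs []Message) []Message {
	known := make(map[string]bool)
	for _, m := range msgs {
		if m.Role == "assistant" {
			for _, id := range m.ToolCallIDs {
				known[id] = true
			}
		}
	}
	if len(known) == 0 {
		return msgs
	}

	out := make([]Message, 0, len(msgs))
	for _, m := range msgs {
		if m.Role == "tool" && !known[m.ToolCallID] {
			continue // orphaned result
		}
		out = append(out, m)
	}
	return out
}

func main() {
	msgs := []Message{
		{Role: "assistant", ToolCallIDs: []string{"call_1"}},
		{Role: "tool", ToolCallID: "call_1", Content: "ok"},
		{Role: "tool", ToolCallID: "call_2", Content: "left over from a failed agent"},
	}
	fmt.Println(len(stripOrphanedToolResults(msgs)))
}
```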

Pass cumulative toolUses count through RunConfig.Callback so AsyncAgent.ToolUses stays accurate turn-by-turn instead of remaining 0 during execution. Sync final ToolUses from RunResult after RunAgent() completes to cover any missed updates. Call Cleanup() in Shutdown() after all goroutines finish to release memory held by completed/failed/cancelled agents. Removed lazy cleanup from StartAgent() since it would break wait_agent by deleting completed agents before the LLM retrieves them.

ESC (and ctrl+c) now cancels the running agent turn immediately by cancelling the per-submit context, stopping the API call in progress. The footer shows "interrupting…" while waiting for the goroutine to drain. context.Canceled errors are suppressed so no red error banner appears after a deliberate user interrupt.

SearXNG is now configurable from the web search panel: pressing Enter on SearXNG opens an "Instance URL" field (not masked, not a secret). The URL is persisted to the DB under "SEARXNG_BASE_URL" and applied as an env var at startup via loadCredsIntoConfig, so NewSearXNGProvider() picks it up on every run. The mode selector can then be set to "searxng" to route all web searches through the configured self-hosted instance.
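For the interrupt handling described in this commit, a minimal sketch of a per-submit context whose cancellation is filtered out before display. The function names are hypothetical and the 5-second timer merely stands in for an in-flight API call.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// runTurn stands in for an agent turn: it blocks until the work finishes
// or the per-submit context is cancelled.
func runTurn(ctx context.Context) error {
	select {
	case <-time.After(5 * time.Second):
		return nil
	case <-ctx.Done():
		return ctx.Err()
	}
}

// submitAndInterrupt starts a turn and cancels it right away, as an ESC
// keypress would, returning the raw error from the turn.
func submitAndInterrupt() error {
	ctx, cancel := context.WithCancel(context.Background())
	done := make(chan error, 1)
	go func() { done <- runTurn(ctx) }()
	cancel()
	return <-done // wait for the goroutine to drain
}

// visibleError hides deliberate interrupts so no error banner is shown.
func visibleError(err error) error {
	if errors.Is(err, context.Canceled) {
		return nil
	}
	return err
}

func main() {
	raw := submitAndInterrupt()
	fmt.Println("raw:", raw)
	fmt.Println("shown:", visibleError(raw))
}
```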

Root cause: SetSize() reset detailKey to "" which forced GotoTop() on every streaming update (chat height grows → SetSize called → detailKey="" → next render sees detailKey≠key → GotoTop). Removed the reset.

Rewrote renderToolDetail cache logic with three distinct cases:
- New tool selected (detailToolID changed): reset to top
- Same tool, content grew (streaming) or size changed: preserve yOffset
- Size only changed, identical content: re-layout preserving yOffset

Also fixed ctrl+o auto-switching focus to uiFocusMain when the sidebar opens, so arrow keys scroll immediately without requiring an extra Tab press. Closing the sidebar returns focus to the editor input.
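The three cache cases can be modeled as a small state update. This is an illustrative sketch with hypothetical names (detailCache, update); it returns a label for the chosen action instead of rendering anything.

```go
package main

import "fmt"

// detailCache remembers what the detail pane last rendered.
type detailCache struct {
	toolID        string
	content       string
	width, height int
	yOffset       int
}

// update decides how to refresh the pane. Only a change of tool resets the
// scroll position; content growth and resizes keep yOffset as it is.
func (c *detailCache) update(toolID, content string, width, height int) string {
	sizeChanged := width != c.width || height != c.height
	action := "cached"
	switch {
	case toolID != c.toolID:
		c.yOffset = 0
		action = "reset"
	case content != c.content:
		action = "rerender"
	case sizeChanged:
		action = "relayout"
	}
	c.toolID, c.content, c.width, c.height = toolID, content, width, height
	return action
}

func main() {
	c := &detailCache{}
	fmt.Println(c.update("tool-1", "line 1", 80, 20))
	c.yOffset = 7 // the user scrolls down
	fmt.Println(c.update("tool-1", "line 1\nline 2", 80, 21), c.yOffset)
}
```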

Background sub-agents spawned via spawn_agent were running with the parent session's turn context. When that turn ended, defer cancel() fired and killed the still-running sub-agent's API calls and permission prompts, producing 'permission denied: prompt failed: context canceled'.

Three changes:
- async.go runAgent: replace config.Context with agent.Ctx so the goroutine uses its own independent context regardless of parent state
- runner.go RunConfig: add PermissionMode field to let callers override the session's permission mode after creation
- spawn_agent.go: set PermissionMode=bypass so background agents auto-approve tools without blocking on interactive prompts that no longer have a valid TUI context
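The first change relies on the agent's context not being derived from the parent turn. A minimal sketch with a hypothetical asyncAgent type, showing that cancelling the parent leaves the agent's context alive:

```go
package main

import (
	"context"
	"fmt"
)

// asyncAgent is a hypothetical stand-in that owns its context.
type asyncAgent struct {
	Ctx    context.Context
	Cancel context.CancelFunc
}

// newAsyncAgent roots the context at context.Background(), not at the
// parent turn, so the parent's defer cancel() cannot reach it.
func newAsyncAgent() *asyncAgent {
	ctx, cancel := context.WithCancel(context.Background())
	return &asyncAgent{Ctx: ctx, Cancel: cancel}
}

// afterParentEnds cancels a parent turn context and reports the state of
// both contexts afterwards.
func afterParentEnds() (parentErr, agentErr error) {
	parent, cancelParent := context.WithCancel(context.Background())
	agent := newAsyncAgent()
	defer agent.Cancel()

	cancelParent() // the parent turn ends
	return parent.Err(), agent.Ctx.Err()
}

func main() {
	parentErr, agentErr := afterParentEnds()
	fmt.Println("parent:", parentErr)
	fmt.Println("agent:", agentErr)
}
```

The trade-off is that the agent now needs its own shutdown path, since nothing upstream cancels it implicitly.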

Sub-agents now automatically collect every file path, URL, and search query they consult during execution. Sources are deduplicated and attached to RunResult.Sources as []SourceRef{Type, Value}. The parent agent receives sources in the tool_result data payload (agent tool, wait_agent, fork mode, worktree mode). wait_agent also appends a formatted source list at the end of its Content string so the parent LLM sees them inline without parsing JSON. Extracted automatically from: read_file, write_file, edit_file, glob, grep, web_search, web_fetch, web_crawl, web_map, browser_navigate, browser_open, wikipedia, scholarly_search, langsearch.
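A sketch of the deduplication and inline formatting. SourceRef{Type, Value} comes from the commit message, while the Type values ("file", "url") and the output layout are assumptions.

```go
package main

import (
	"fmt"
	"strings"
)

// SourceRef mirrors the shape named in the commit message.
type SourceRef struct {
	Type  string // assumed values: "file", "url", "query"
	Value string
}

// dedupSources keeps the first occurrence of each (Type, Value) pair and
// preserves the order in which sources were consulted.
func dedupSources(in []SourceRef) []SourceRef {
	seen := make(map[SourceRef]struct{}, len(in))
	out := make([]SourceRef, 0, len(in))
	for _, s := range in {
		if _, dup := seen[s]; dup {
			continue
		}
		seen[s] = struct{}{}
		out = append(out, s)
	}
	return out
}

// formatSources renders a plain-text list that could be appended to a
// tool result so the parent LLM sees sources without parsing JSON.
func formatSources(refs []SourceRef) string {
	var b strings.Builder
	b.WriteString("Sources:\n")
	for _, r := range refs {
		fmt.Fprintf(&b, "- [%s] %s\n", r.Type, r.Value)
	}
	return b.String()
}

func main() {
	refs := dedupSources([]SourceRef{
		{"file", "cmd/cli/tui.go"},
		{"url", "https://example.com"},
		{"file", "cmd/cli/tui.go"},
	})
	fmt.Print(formatSources(refs))
}
```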

Sub-agents now persist their session ID in RunResult and AsyncAgent after each run. A new resume_agent tool reopens the persisted session via Engine.OpenSession and submits a new task into the existing conversation history, so the agent retains full context of everything it read, fetched, and wrote previously.

Changes:
- engine.go: add OpenSession(ctx, sessionID) via optional sessionRestorer interface
- session.go: add GetSessionID() accessor
- runner.go: add SessionID to RunResult, ResumeFromSessionID to RunConfig; RunAgent branches on ResumeFromSessionID to restore instead of create
- async.go: add SessionID field to AsyncAgent, captured from RunResult on completion
- wait_agent.go: expose session_id in result JSON + update description
- resume_agent.go: new tool accepting session_id (or agent_id) + task; supports sync (blocking) and async (background) modes
- sdk/client.go: register resume_agent alongside spawn_agent

Three bugs fixed in the session browser (Ctrl+S):

1. UpdatedAt/CreatedAt not propagated: sessions always showed "—" for age. cmd/cli/tui.go was discarding the int64 unix timestamps from state.SessionInfo instead of converting them to time.Time for tui.SessionInfo.

2. Session load errors silently swallowed: when RestoreSessionState failed (checkpoint mismatch, compaction boundary error, etc.) the error was stored in m.lastErr which is never rendered, leaving the user with a blank chat and no feedback. Now also sets m.lastTurnErr so the status bar shows the failure.

3. No session preview: the session picker only showed an 8-char ID, age, and turn count with no context about what the session was about. Like Codex, the first user message is now extracted during SaveSessionState, stored in metadata.Additional["canonical_transcript"]["first_user_message"], surfaced in SessionInfo.Preview, and rendered below the meta line in the picker. Search now also matches against the preview text.
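The preview extraction in item 3 could look roughly like this. The message type, the choice of the first non-empty line, and the rune limit are assumptions for illustration; the commit message does not say how the text is trimmed.

```go
package main

import (
	"fmt"
	"strings"
)

// message is a hypothetical, minimal transcript entry.
type message struct{ Role, Text string }

// firstUserPreview returns the first non-empty line of the first user
// message that has one, truncated to maxRunes runes.
func firstUserPreview(msgs []message, maxRunes int) string {
	for _, m := range msgs {
		if m.Role != "user" {
			continue
		}
		for _, line := range strings.Split(m.Text, "\n") {
			line = strings.TrimSpace(line)
			if line == "" {
				continue
			}
			// Truncate by runes so multi-byte characters are never split.
			if r := []rune(line); len(r) > maxRunes {
				return string(r[:maxRunes]) + "…"
			}
			return line
		}
	}
	return ""
}

func main() {
	msgs := []message{
		{"assistant", "Resumed session"},
		{"user", "\nfix the checkpoint mismatch on load\nand add tests"},
	}
	fmt.Println(firstUserPreview(msgs, 40))
}
```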

…earer

api.z.ai returns 'x-api-key header is required' on 401, meaning the endpoint uses Anthropic-style authentication despite serving OpenAI-compat request bodies.

- Add zAiAdapter that embeds openAICompatAdapter (same /chat/completions body and response format) but overrides applyAuthHeaders to send x-api-key
- Update adapterForProvider to route APIProviderZAi to zAiAdapter
- Update BuildAuthHeaders in config.go to use x-api-key for ZAi
- Update the provider test to match the new auth header
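The embed-and-override approach maps directly onto Go struct embedding. In this sketch the adapter interface and the authHeaders helper are hypothetical, and the Bearer scheme on the base adapter is an assumption about the previous behavior.

```go
package main

import (
	"fmt"
	"net/http"
)

// adapter is a hypothetical interface covering only the auth hook.
type adapter interface {
	applyAuthHeaders(h http.Header, apiKey string)
}

// openAICompatAdapter is assumed here to use Bearer authentication.
type openAICompatAdapter struct{}

func (openAICompatAdapter) applyAuthHeaders(h http.Header, apiKey string) {
	h.Set("Authorization", "Bearer "+apiKey)
}

// zAiAdapter embeds the OpenAI-compatible adapter, inheriting everything
// else, and shadows the auth hook to send x-api-key instead.
type zAiAdapter struct{ openAICompatAdapter }

func (zAiAdapter) applyAuthHeaders(h http.Header, apiKey string) {
	h.Set("x-api-key", apiKey)
}

// authHeaders builds the headers a given adapter would send.
func authHeaders(a adapter, apiKey string) http.Header {
	h := http.Header{}
	a.applyAuthHeaders(h, apiKey)
	return h
}

func main() {
	fmt.Println(authHeaders(openAICompatAdapter{}, "sk-test"))
	fmt.Println(authHeaders(zAiAdapter{}, "sk-test"))
}
```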

DeleteSession now removes associated browser artifacts from storage (screenshots, downloads) and cleans up in-memory plan state/files. Session list shows the first user message line as the primary title with ID/age/turns as secondary metadata, matching Codex's approach.

All session data now lives under sessions/{session_id}/ directly in the app root (~/.config/nexus-cli/). Screenshots go to sessions/{id}/images/, downloads to sessions/{id}/tools/, and plan files to sessions/{id}/plans/. Deleting a session is now two calls: store.DeleteSession for the DB and appdir.DeleteSessionDir (os.RemoveAll) for all physical files — no more per-namespace artifact listing. New appdir package centralises all path resolution for cmd/cli. Session directories are created via EnsureSessionDir at session open/resume.

Web-scraped content moves from global artifacts/web/ into sessions/{id}/artifacts/web/, making it session-scoped and cleaned up automatically on session delete. Browser screenshots move to sessions/{id}/screenshots/ (renamed from images/ to avoid ambiguity). Adds key builders and store functions for: - GeneratedImageKey / StoreGeneratedImageRef → sessions/{id}/artifacts/images/ - AudioKey / StoreAudioRef → sessions/{id}/artifacts/audio/ - WebArtifactKey / StoreWebArtifactRef → sessions/{id}/artifacts/web/ Threads sessionID through the fetch pipeline (Fetch → fetchViaHTTP → persistArtifact) so web fetches land in the right session directory. Storage GC reaper no longer needed for these namespaces.

Root causes fixed:
- loadRuntimeOptions applied overrides.Model AFTER loadCredsIntoConfig, so the credential lookup used the wrong provider (anthropic default) and returned an empty API key for z-ai
- SaveProviderField called reloadClient with no model set, building a keyless client that raced with SetModel's correct reload
- CreateSession and LoadSession goroutines could grab w.client before SetModel's reloadClient goroutine finished (reloadMu added to serialise)

Also fixed:
- LangSearch API key not restored from DB on startup
- LangSearch missing from 'nexus config --search' and config summary
- pendingSubmitMsg dropped when Enter pressed before session created
- /cli binary added to .gitignore

Refactors the TUI and configuration logic to ensure immediate application of API keys and prevent accidental file path insertions during setup.

- Fixes circular dependency in loadCredsIntoConfig.
- Triggers client reload on all sensitive configuration changes in TUI.
- Normalizes Z.ai provider identifiers (zai, z-ai, z.ai) across the stack.
- Strictly disables file completions during configuration states.
- Refactors TUI chat components into specialized files for better maintainability.

…s.json

EngineerProjects added 30 commits June 7, 2026 00:39

fix(cli): parse permission mode case-insensitively during client reload

bd19398

feat(permissions): add session-level auto-approvals for tools

cb11b90

feat(permissions): persist session-level tool approvals to permission…

ac1a312

…s.json

fix(state): resolve checkpoint mismatch on session load and delete

c17044e

perf(tui): optimize keystroke and scrolling rendering performance

b36b3de

EngineerProjects merged commit 85eec1c into dev Jun 8, 2026
5 checks passed

EngineerProjects deleted the fix/tui-rewrite branch June 8, 2026 07:37

EngineerProjects merged 30 commits into dev from fix/tui-rewrite