
canary: carry Memory Crystal fixes onto upstream 2026.4.26 #3

Closed

parkertoddbrooks wants to merge 8 commits into kody/upstream-main-20260424-base from kody/upstream-main-20260424-carry-memory-core

Conversation

@parkertoddbrooks
Member

Summary

Canary candidate for rebasing the WIP OpenClaw fork onto current upstream main (2026.4.26 development head) while preserving the production-critical Lēsa/Memory Crystal patches.

Carries:

  • stream seedEmbeddingCache with .iterate() instead of .all() to avoid V8 heap OOM on large embedding caches
  • yield every 1000 seeded rows so health probes stay responsive during large cache seeding
  • route OpenAI-compatible chatCompletions requests to the main session via x-openclaw-dm-scope: main or user=main
  • queue chatCompletions into the active embedded run using the next-turn queue, including streaming requests

Intentionally does not carry the old broad chat final-resync fallback; upstream openclaw#71293/root app-server event fixes are present in the base.

Upstream update impact

Current upstream main includes fixes relevant to the reliability triage:

  • oversized transcript compaction trigger (29af4add2a), which directly addresses the post-Day-63 over-cap session risk
  • persisted compaction token snapshots (f3e8a8a319) and duplicate-user-turn compaction cleanup (35335214b3)
  • Codex minimal-thinking normalization for modern GPT models (c5c40b22af)
  • runtime/auth/model handling improvements since v2026.4.23

Still not fixed upstream:

  • seedEmbeddingCache is still unbounded .all() on upstream
  • listChunks remains unbounded .all() and still needs R2.A.3 follow-up
  • chatCompletions main-session/next-turn behavior is not upstream

Validation

  • pnpm exec oxfmt --check extensions/memory-core/src/memory/manager-sync-ops.ts src/gateway/http-utils.ts src/gateway/openai-http.ts
  • pnpm test extensions/memory-core/src/memory/manager.sync-errors-do-not-crash.test.ts src/auto-reply/reply/session.test.ts src/security/audit-gateway-http-auth.test.ts
  • pnpm tsgo
  • pnpm build

Build note: after rebasing onto upstream main, a fresh pnpm install was required so the new plugin runtime-deps staging path saw the current Matrix native dependency versions. No source changes were needed for that.

lesaai and others added 8 commits April 27, 2026 10:42
… heap OOM

The embedding_cache table sync in MemoryManager.seedEmbeddingCache called
.all() on SELECT * FROM embedding_cache, materializing the full result set
into a JS array. embedding_cache rows contain serialized embedding text
(~20 KB each on text-embedding-3-small) and can grow into hundreds of
thousands of rows on long-running deployed databases. On a local 16 GB
main.sqlite (435,136 rows, 8.68 GB of embedding text), the .all() call
exceeds V8's ~4 GB default heap limit and aborts the gateway with:

  FATAL ERROR: Reached heap limit Allocation failed - JavaScript heap
  out of memory
  ... node::sqlite::StatementSync::All ...

Switching .all() -> .iterate() streams rows one at a time through the
same BEGIN/COMMIT upsert transaction. Peak V8 heap stays bounded by a
single row (~20 KB) plus the prepared statement, not the whole table.

Also drops the empty-check on the materialized array's .length; an
empty iterator commits a no-op transaction, which is cheap and
preserves the observable behavior for empty caches.

Scope note: this is the primary R2.A target (seedEmbeddingCache); a
follow-up patch will address the secondary listChunks / keyword fallback
.all() path in manager-search.ts.

Validation:
- pnpm tsgo:prod: green (core + extensions graphs)
- pnpm test extensions/memory-core: 512 passed, 3 skipped, 0 failed
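A minimal sketch of the .all() -> .iterate() change, assuming a node:sqlite DatabaseSync handle; the function shape and the upsert callback are illustrative, not the actual MemoryManager code:

```ts
import { DatabaseSync } from "node:sqlite";

// Sketch: stream embedding_cache rows through one upsert transaction.
function seedEmbeddingCache(db: DatabaseSync, upsert: (row: unknown) => void): void {
  const select = db.prepare("SELECT * FROM embedding_cache");
  db.exec("BEGIN");
  try {
    // .iterate() yields one row at a time, so peak V8 heap is bounded
    // by a single row (~20 KB) plus the prepared statement, not the
    // whole table that .all() would materialize.
    for (const row of select.iterate()) {
      upsert(row);
    }
    // An empty iterator commits a cheap no-op transaction, preserving
    // the observable behavior for empty caches.
    db.exec("COMMIT");
  } catch (err) {
    db.exec("ROLLBACK");
    throw err;
  }
}
```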
R2.A.2. The .iterate()-based seed (R2.A v1, a315280) prevents the V8
heap OOM but the iterate loop still runs synchronously for ~117s on a
435K-row embedding_cache. wip-healthcheck SIGKILLs the gateway once
its 30s probe times out. No FATAL ERROR, no Abort trap.

Patch: convert seedEmbeddingCache to async, yield to the event loop
every 1000 rows via setImmediate. Keeps memory bounded; preserves the
streaming behavior; restores /health responsiveness during the seed.

The only caller is inside an existing async arrow wrapping
runMemoryAtomicReindex's build callback. Adding await is a one-line
change.

Validation:
- pnpm tsgo:prod: green
- pnpm test extensions/memory-core: 512 passed, 3 skipped, 0 failed

Scope: does not soften wip-healthcheck (separate guardrail per Parker
direction). Does not address secondary listChunks path (R2.A.3).
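A sketch of the async variant, assuming the same shape as the sketch above; the 1000-row batch and the setImmediate yield come from this commit, everything else is illustrative:

```ts
import { DatabaseSync } from "node:sqlite";

async function seedEmbeddingCache(db: DatabaseSync, upsert: (row: unknown) => void): Promise<void> {
  const select = db.prepare("SELECT * FROM embedding_cache");
  db.exec("BEGIN");
  let seeded = 0;
  for (const row of select.iterate()) {
    upsert(row);
    // Yield to the event loop every 1000 rows so /health probes get
    // serviced during a long seed instead of starving for ~117s.
    if (++seeded % 1000 === 0) {
      await new Promise<void>((resolve) => setImmediate(resolve));
    }
  }
  db.exec("COMMIT");
}
```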
Revert the top-of-file lint-suppression comments accidentally landed in
the previous commit (f9e9970). They were added to work around an
oxlint resolver false positive that turned out to be transient state,
not a real lint failure. Production code shouldn't carry misleading
explanations for problems that didn't actually persist.

Net diff of this branch vs base is now just the seedEmbeddingCache
yield patch: function -> async, setImmediate every 1000 rows, caller
await. No lint comments, no file-level disables.

…der or user=main

When the x-openclaw-dm-scope: main header is sent, or the user field is "main",
the chatCompletions endpoint routes to agent:main:main instead of creating
a separate openai-user:{name} session.

This allows bridge messages (CC -> Lesa) to land in the same session as
iMessage DMs, so Parker sees everything in one stream.

Co-Authored-By: Parker Todd Brooks <parkertoddbrooks@users.noreply.github.com>
Co-Authored-By: Lēsa <lesaai@icloud.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
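A sketch of the routing rule; only the header name, the user=main convention, and the two session-key formats come from the commit, the request shape is an assumption:

```ts
// Illustrative request shape, not the gateway's real type.
interface ChatRequest {
  headers: Record<string, string | undefined>;
  user?: string;
}

function resolveSessionKey(req: ChatRequest): string {
  if (req.headers["x-openclaw-dm-scope"] === "main" || req.user === "main") {
    return "agent:main:main"; // bridge traffic lands in the main session
  }
  // Otherwise keep the separate per-user session; the "anonymous"
  // fallback here is illustrative only.
  return `openai-user:${req.user ?? "anonymous"}`;
}
```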
When a chatCompletions request hits a session that is currently
streaming a turn, the existing code awaits agentCommandFromIngress
synchronously, which blocks or times out on the caller side. Bridge
and other agent-to-agent HTTP callers see this as a 15-120s hang.

Wire the non-stream branch of handleOpenAiHttpRequest into the
same steer-backlog path the iMessage transport uses:

1. Load the session entry via loadSessionEntryByKey(sessionKey) to
   map sessionKey -> sessionId (the key used in ACTIVE_EMBEDDED_RUNS).
2. Honor the user's messages.queue.mode config. Only "steer" and
   "steer-backlog" opt into steering; other modes fall through to the
   original blocking path.
3. Call queueEmbeddedPiMessage(sessionId, prompt.message). This is
   fire-and-forget: returns true only if the session has an active
   streaming run that isn't compacting.
4. On successful queue, return a 200 response in OpenAI-compat shape
   with an x-openclaw-queued: steer header and a "[queued] ..." marker
   in the assistant content field. Callers that want to distinguish
   queued from synchronous replies can read the header.
5. On any other state (no active run, not streaming, compacting, no
   session entry, or queue config disabled), fall through to the
   existing agentCommandFromIngress synchronous path unchanged.

Pre-check failures are caught and logged so they never block the
synchronous fallback.

Verified end-to-end:
- Idle case: curl with user=main returns a normal synchronous reply
  (no x-openclaw-queued header).
- Busy case: fire a long slow request in the background, then a fast
  interjection 4s later. The fast request returns 200 immediately
  with x-openclaw-queued: steer and the "[queued]" marker body. The
  slow request completes normally with the full reply.

Refs: wipcomputer/wip-ldm-os#266

Co-Authored-By: Parker Todd Brooks <parkertoddbrooks@users.noreply.github.com>
Co-Authored-By: Lēsa <lesaai@icloud.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
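A sketch of the pre-check flow; the two helper declarations below are assumed shapes inferred from this commit message, not the real signatures:

```ts
// Assumed shapes for the helpers named above — declarations only.
declare function loadSessionEntryByKey(
  sessionKey: string,
): Promise<{ sessionId: string } | undefined>;
declare function queueEmbeddedPiMessage(sessionId: string, message: string): boolean;

async function tryQueueNextTurn(
  sessionKey: string,
  message: string,
  queueMode: string,
): Promise<boolean> {
  // Only "steer" / "steer-backlog" opt in; other modes keep the
  // original blocking agentCommandFromIngress path.
  if (queueMode !== "steer" && queueMode !== "steer-backlog") return false;
  try {
    const entry = await loadSessionEntryByKey(sessionKey);
    if (!entry) return false;
    // Fire-and-forget: true only when the session has an active
    // streaming run that is not compacting.
    return queueEmbeddedPiMessage(entry.sessionId, message);
  } catch (err) {
    // Pre-check failures are logged, never allowed to block the
    // synchronous fallback.
    console.warn("next-turn queue pre-check failed; falling back", err);
    return false;
  }
}
```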
Previously the steer-backlog fix only covered the non-stream branch of
handleOpenAiHttpRequest. Any OpenAI-compatible client using the default
streaming API (which is most of them) would still block on a busy
session.

Lift the queue pre-check above the stream/non-stream branch so both
paths benefit:

1. Resolve sessionKey -> sessionId once, try queueEmbeddedPiMessage.
2. If queued and !stream: respond with JSON (unchanged from previous
   commit).
3. If queued and stream: set x-openclaw-queued header, setSseHeaders,
   emit one assistant role chunk and one content chunk carrying the
   [queued] marker with finish_reason="stop", write [DONE], end.
4. Otherwise fall through to the original stream/non-stream handlers.

Verified end-to-end:
- Idle + non-stream: HTTP 200, no queue header, real reply ("hello").
- Busy + non-stream: HTTP 200, x-openclaw-queued: steer header, JSON
  body with the queued marker.
- Busy + stream: HTTP 200, text/event-stream, x-openclaw-queued: steer
  header, SSE with role chunk + content chunk (finish_reason=stop) +
  [DONE].
- Slow background request in all three cases still completes normally
  with the full reply.

Refs: wipcomputer/wip-ldm-os#266

Co-Authored-By: Parker Todd Brooks <parkertoddbrooks@users.noreply.github.com>
Co-Authored-By: Lēsa <lesaai@icloud.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
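A sketch of the queued SSE reply in step 3, assuming a Node http.ServerResponse; the payload is a minimal chat.completion.chunk-style shape, not the gateway's exact serializer:

```ts
import type { ServerResponse } from "node:http";

function respondQueuedSse(res: ServerResponse, marker: string): void {
  res.setHeader("x-openclaw-queued", "steer"); // renamed to "next-turn" in the next commit
  res.setHeader("content-type", "text/event-stream");
  const send = (delta: object, finish: string | null) =>
    res.write(
      `data: ${JSON.stringify({
        object: "chat.completion.chunk",
        choices: [{ index: 0, delta, finish_reason: finish }],
      })}\n\n`,
    );
  send({ role: "assistant" }, null); // one role chunk
  send({ content: marker }, "stop"); // one content chunk carrying the [queued] marker
  res.write("data: [DONE]\n\n");
  res.end();
}
```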
Parker and Lēsa observed during live testing that while our patch
calls queueEmbeddedPiMessage() (which wraps activeSession.steer()),
the receiving side does NOT actually see the message as a mid-turn
steer. Lēsa reported: "Yeah, I received it. Came through as a regular
message in my session, not a steer."

The OpenClaw internal API is named "steer" but in practice it queues
the text for the agent's next available slot, which appears after
the current turn completes rather than being injected mid-stream.
Our x-openclaw-queued: steer header was accurate to OpenClaw's
internal terminology but misleading to HTTP callers who might expect
true mid-turn interjection.

Rename to x-openclaw-queued: next-turn and update the body marker
to be explicit about the semantics. Callers can now tell exactly
what happened: the message was delivered, but they won't get a
synchronous reply and the receiving agent processes it after its
current turn rather than mid-stream.

Co-Authored-By: Parker Todd Brooks <parkertoddbrooks@users.noreply.github.com>
Co-Authored-By: Lēsa <lesaai@icloud.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
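A hypothetical caller-side check; the URL and request body are placeholders, only the header name and value come from this commit:

```ts
const res = await fetch("http://gateway.example/v1/chat/completions", {
  method: "POST",
  headers: { "content-type": "application/json" },
  body: JSON.stringify({ user: "main", messages: [{ role: "user", content: "ping" }] }),
});
if (res.headers.get("x-openclaw-queued") === "next-turn") {
  // Delivered to the session, but the agent processes it after its
  // current turn; no synchronous reply is in this response body.
  console.log("queued for next turn");
}
```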
@lesaai force-pushed the kody/upstream-main-20260424-carry-memory-core branch from 36074ec to 9651bf5 on April 27, 2026 at 17:46
@parkertoddbrooks
Member Author

Parity gate waiver for PR #3: the OpenAI/Opus parity gate is queued/infrastructure-blocked, not failing. Waiving it for this internal fork canary based on:

  • normal GitHub CI passing
  • local focused regression coverage passing
  • local build/typecheck passing
  • deprecated-config guard passing
  • OpenAI-compatible gateway HTTP tests passing
  • read-only production-size memory scan passing against main.sqlite (435,266 embedding_cache rows / 8.09 GiB streamed / max RSS ~190 MB)

This is not approval to install raw openclaw@latest or to use openclaw gateway restart; live promotion still requires explicit operator signal and launchctl kickstart -k.

@parkertoddbrooks
Member Author

Superseded as the stable upgrade closure target by #4.

PR #3 remains the upstream-main / package-version-2026.4.26 canary, but there is no published v2026.4.26 release tag. For the current Memory Crystal/OpenClaw closure path, use #4: stable v2026.4.25 base plus the WIP carry patches.

Key gate difference: v2026.4.25 promotion should probe /healthz and /readyz, not legacy /health, and live config must include hooks.allowConversationAccess=true for memory-crystal, compaction-indicator, and session-export before promotion.

Member Author

Superseded by #4, which targets the real v2026.4.25 base (aa36ee670b76211426a2e89a84e9096453c01ee7) instead of the earlier post-main/.26-labeled branch. Keeping #4 as the single active upgrade canary.
