Crash: SIGBUS/SIGTRAP in onnxruntime-node ReleaseIoBinding during local embedding load (concurrent opencode sessions, macOS arm64)

## Description

Two concurrently running opencode TUI sessions crash with a native SIGBUS or SIGTRAP inside `onnxruntime-node` while the magic-context plugin loads the local embedding model (`Xenova/all-MiniLM-L6-v2`). The processes ran as separate sessions in separate working directories, and two of the three crash reports were written five seconds apart. This suggests a concurrency or double-free bug in the local embedding loader when two processes touch the same cached model at the same time.

## Environment

- Plugin: `@cortexkit/opencode-magic-context@0.9.1`
- OpenCode: `1.4.7`
- OS: `macOS 26.4.1` (build 25E253)
- Arch: `arm64` (Apple Silicon, M2 Max — `Mac14,6`)
- Node: `v24.15.0`
- `onnxruntime-node`: `1.21.0` (npm) — **bundles `libonnxruntime.1.14.0.dylib`** (native C++ lib)
- Transitive dep via: `@huggingface/transformers@~3.7.6`

## Configuration

`~/.config/opencode/magic-context.jsonc`:

```jsonc
{
  "$schema": "https://raw.githubusercontent.com/cortexkit/opencode-magic-context/master/assets/magic-context.schema.json",
  "enabled": true,
  "historian": {
    "model": "anthropic/claude-haiku-4-5",
    "fallback_models": ["opencode-go/glm-5"]
  },
  "dreamer": {
    "enabled": true,
    "model": "anthropic/claude-sonnet-4-6"
  },
  "sidekick": {
    "model": "anthropic/claude-haiku-4-5"
  },
  "experimental": {
    "user_memories": { "enabled": false },
    "pin_key_files": { "enabled": false }
  },
  "compaction_markers": false
}
```

No explicit `embedding` block, so the default (`provider: "local"`, model `Xenova/all-MiniLM-L6-v2`) is used.

## Reproduction

1. macOS 26.4.1 arm64, opencode 1.4.7, magic-context 0.9.1 (default local embedding config).
2. Cached model already present at `~/.local/share/opencode/storage/plugin/magic-context/models/Xenova/all-MiniLM-L6-v2/onnx/model.onnx` (90 MB, `md5 = 84f837de2a0f667784facf2ba0f36b22`).
3. Start **two** opencode sessions in separate terminals, in different project directories:
   - `opencode -s ses_267e0c216ffeI6nQIAWBnt6G20`
   - `opencode -s ses_26616a332ffenMsmj6S59ueIH1`
4. Work in both simultaneously (both sessions presumably trigger embedding/`ctx_search`/dreamer init near the same time).
5. Within ~5 minutes, **both processes die silently**, no TUI error, parent shell prints nothing.

Three crash reports were generated within about 16 minutes (10:56:36, 11:12:07, 11:12:12). All three fault inside `InferenceSessionWrap::LoadModel`, and two of them show `OrtApis::ReleaseIoBinding` on the stack.
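To make the reproduction independent of the TUI, the concurrent start in step 3 could be approximated with a small harness. This is only a sketch: `runConcurrently` and `ChildOutcome` are hypothetical names, and it assumes that the crash depends on two processes loading the model at the same time rather than on anything opencode-specific.

```typescript
import { spawn } from "node:child_process";

// Outcome of one child process: its exit code, or the signal that killed it.
interface ChildOutcome {
  code: number | null;
  signal: string | null;
}

// Launch `count` copies of the same command at once and wait for all of them.
function runConcurrently(
  command: string,
  args: string[],
  count: number,
): Promise<ChildOutcome[]> {
  const runs = Array.from(
    { length: count },
    () =>
      new Promise<ChildOutcome>((resolve, reject) => {
        const child = spawn(command, args, { stdio: "ignore" });
        child.once("error", reject); // the command could not be started
        child.once("exit", (code, signal) => resolve({ code, signal }));
      }),
  );
  return Promise.all(runs);
}
```

A possible use is `runConcurrently(process.execPath, ["load-model.mjs"], 2)`, where `load-model.mjs` is a hypothetical script that only calls `pipeline("feature-extraction", "Xenova/all-MiniLM-L6-v2")` from `@huggingface/transformers`. An outcome with `signal` set to `SIGBUS` or `SIGTRAP` would match the crashes reported here.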

## Crash signatures (from `~/Library/Logs/DiagnosticReports/opencode-*.ips`)

### Crash 1 — `opencode-2026-04-17-105636.ips`

```
exception  : EXC_BAD_ACCESS (SIGBUS)
subtype    : KERN_PROTECTION_FAILURE at 0x00000004138ed418
termination: Bus error: 10
faultingThread: "Worker"
frames (top):
  OrtApis::ReleaseIoBinding(OrtIoBinding*) + 24
  InferenceSessionWrap::LoadModel(Napi::CallbackInfo const&) + 3220
  Napi::InstanceWrap<InferenceSessionWrap>::InstanceMethodCallbackWrapper::lambda()
  Napi::InstanceWrap<InferenceSessionWrap>::InstanceMethodCallbackWrapper
```

### Crash 2 — `opencode-2026-04-17-111207.ips`

```
exception  : EXC_BREAKPOINT (SIGTRAP)
termination: Trace/BPT trap: 5
faultingThread: "Worker"
frames (top):
  (native, unsymbolicated + _sigtramp)
  InferenceSessionWrap::LoadModel(Napi::CallbackInfo const&) + 3184
  Napi::InstanceWrap<InferenceSessionWrap>::InstanceMethodCallbackWrapper::lambda()
  Napi::InstanceWrap<InferenceSessionWrap>::InstanceMethodCallbackWrapper
```

### Crash 3 — `opencode-2026-04-17-111212.ips`

```
exception  : EXC_BREAKPOINT (SIGTRAP)
termination: Trace/BPT trap: 5
faultingThread: "Worker"
frames (top):
  (native, unsymbolicated + _sigtramp)
  OrtApis::ReleaseIoBinding(OrtIoBinding*) + 36   ← double frame!
  OrtApis::ReleaseIoBinding(OrtIoBinding*) + 36   ← double-free signature
  InferenceSessionWrap::LoadModel(Napi::CallbackInfo const&) + 3220
  Napi::InstanceWrap<InferenceSessionWrap>::InstanceMethodCallbackWrapper::lambda()
```

**Loaded native modules at crash time** (from `usedImages`):

```
/…/onnxruntime_binding.node
/…/libonnxruntime.1.14.0.dylib
```
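The summaries above were extracted by hand. A minimal sketch of how the same fields could be pulled out of a report follows. It assumes the common `.ips` layout of a one-line JSON header followed by a JSON body with `exception`, `faultingThread` (a thread index) and `threads[].frames[].symbol` fields; that layout is an assumption and should be checked against the actual files. `summarizeIps` and `CrashSummary` are hypothetical names.

```typescript
interface CrashSummary {
  exceptionType: string;
  signal: string;
  topFrames: string[];
}

// ASSUMPTION: `text` is a one-line JSON header, a newline, then a JSON body.
// The field names used below are assumed, not taken from the reports above.
function summarizeIps(text: string, maxFrames = 4): CrashSummary {
  const newline = text.indexOf("\n");
  const body = JSON.parse(text.slice(newline + 1));
  const threads: any[] = body.threads ?? [];
  // Prefer the indexed faulting thread; otherwise look for a `triggered` flag.
  const faulting =
    threads[body.faultingThread] ?? threads.find((t) => t.triggered);
  const frames: any[] = faulting?.frames ?? [];
  return {
    exceptionType: body.exception?.type ?? "unknown",
    signal: body.exception?.signal ?? "unknown",
    topFrames: frames
      .slice(0, maxFrames)
      .map((f) => f.symbol ?? "(unsymbolicated)"),
  };
}
```

Running this over all three reports would make it easy to confirm which frames they actually share.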

## Root-cause hypothesis

All three crashes share:

1. The **same faulting thread name** (`"Worker"`), likely a transformers.js / onnxruntime-node worker pool thread.
2. The **same native call path**: every stack passes through `InferenceSessionWrap::LoadModel`, and Crashes 1 and 3 have `OrtApis::ReleaseIoBinding` at or near the top (the top frame of Crash 2 is unsymbolicated).
3. Signs of memory corruption: Crash 3 shows the `ReleaseIoBinding` frame twice, which may indicate a double release, and Crash 1 is a `KERN_PROTECTION_FAILURE`, which would be consistent with a use-after-free.

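The most plausible cause: when two opencode processes share the same on-disk ONNX model cache and both initialize an `InferenceSession` at roughly the same wall-clock time, the session-initialization path in ONNX Runtime 1.14.0 releases an `IoBinding` twice. Separate processes do not share heap memory, so the second release would have to come from within the same process, most likely from an error or abort path in `LoadModel` that is only reached under this contention. A double release would explain the SIGTRAP in Crashes 2 and 3 (double-free → `ORT_ENFORCE` → `brk 1` → SIGTRAP). This is unconfirmed. The 1.14.0 native lib is also quite old (Feb 2023) relative to macOS 26.4.1.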

## Suggested fixes (in order of effort)

1. **Lazy, singleton-initialized embedding session**: ensure only one `InferenceSession` per process, behind an async-locked init. If there is already a lock, it may not cover the full `LoadModel` → ready path.
2. **Per-process model cache path** (or a file lock around first-time extraction): avoid two processes mmap-ing or opening the same `.onnx` file during a warm-start window.
3. **Bump `@huggingface/transformers`**: newer versions should pull an `onnxruntime-node` whose native lib is ≥ 1.18, with many post-1.14 stability fixes on Apple Silicon and newer macOS. Since `onnxruntime-node@1.21.0` is already installed here but `libonnxruntime.1.14.0.dylib` is what gets loaded, it is also worth checking where the 1.14.0 dylib comes from.
4. **Graceful fallback**: wrap `await pipeline(...)` / `InferenceSession.create` in try/catch and, on an init error, auto-disable embeddings for this process. A try/catch cannot intercept a native SIGBUS or SIGTRAP, so surviving the crash itself would require loading the model in a child process or similar isolation. (Currently there is zero user-visible error; the whole TUI just vanishes.)
5. **Document the workaround**: `embedding: { provider: "off" }` in `magic-context.jsonc` prevents the crash (`ctx_search` still works via FTS5 fallback). Worth noting in the README until the root fix lands.
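The first fix could look roughly like the sketch below. It is not the plugin's actual code: `createSingleton` is a hypothetical helper, and the `load` callback stands in for whatever the plugin does to reach a ready session (for example a call to `pipeline(...)`).

```typescript
// Cache the in-flight promise so that concurrent callers share a single load.
function createSingleton<T>(load: () => Promise<T>): () => Promise<T> {
  let pending: Promise<T> | null = null;
  return () => {
    if (pending === null) {
      pending = load().catch((err) => {
        pending = null; // allow a later retry after a failed load
        throw err;
      });
    }
    return pending;
  };
}
```

Because the promise itself is cached, the guard covers the whole path from the first call until the session is ready, not just the start of the load. It only serializes callers inside one process; contention between two opencode processes would still need the file lock or per-process cache path from fix 2.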

## Workaround applied locally

Setting `embedding.provider: "off"` in `~/.config/opencode/magic-context.jsonc` bypasses the crash entirely. FTS5 full-text fallback keeps `ctx_search` functional; only semantic ranking is lost.
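The same degradation could happen automatically when embedding init fails. The sketch below is an illustration under assumptions: `createSearch`, `Embedder` and `SearchBackend` are hypothetical names, not the plugin's API, and the approach only handles errors that surface as JavaScript exceptions. It would not survive the native SIGBUS or SIGTRAP described above.

```typescript
type Embedder = (text: string) => Promise<number[]>;

// Hypothetical backend: semantic ranking plus a full-text (FTS-style) fallback.
interface SearchBackend {
  semantic(query: string, embed: Embedder): Promise<string[]>;
  fullText(query: string): Promise<string[]>;
}

function createSearch(
  loadEmbedder: () => Promise<Embedder>,
  backend: SearchBackend,
): (query: string) => Promise<string[]> {
  let embeddingsDisabled = false;
  return async function search(query: string): Promise<string[]> {
    if (!embeddingsDisabled) {
      try {
        const embed = await loadEmbedder();
        return await backend.semantic(query, embed);
      } catch (err) {
        // Disable embeddings for the rest of this process and tell the user.
        embeddingsDisabled = true;
        console.warn("embeddings unavailable, using full-text search:", err);
      }
    }
    return backend.fullText(query);
  };
}
```

After the first failure the loader is never called again, so a broken native module is not retried on every query.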

## Additional context

Happy to attach the full `.ips` files, run with any `DEBUG=*` env var, or test a patched build. Let me know if you want the full thread dumps — I can upload them as a gist.

