Skip to content

fix(opencode-plugin): close TOCTOU race in deliverPendingToIdle#1

Closed
mmkzer0 wants to merge 1 commit intomainfrom
fix/toctou-delivery-mutex
Closed

fix(opencode-plugin): close TOCTOU race in deliverPendingToIdle#1
mmkzer0 wants to merge 1 commit intomainfrom
fix/toctou-delivery-mutex

Conversation

@mmkzer0
Copy link
Copy Markdown
Owner

@mmkzer0 mmkzer0 commented Apr 19, 2026

Summary

This fixes duplicate injected hcom turns in the OpenCode plugin by closing a TOCTOU race in deliverPendingToIdle().

The issue became reproducible under multi-agent orchestration, where wake events happen frequently and near-simultaneously.

Root cause

deliverPendingToIdle() can be entered from two independent wake paths:

  • TCP notify wake
  • idle-status wake (session.status -> idle)

Both paths could pass the pendingAckId === null check before either caller set it (first await boundary), causing duplicate opencode-read/promptAsync delivery of the same payload.

Fix

  • Add synchronous deliveryInFlight guard flag checked/set before the first await
  • Early-return concurrent callers when a delivery is already running
  • Reset guard in finally

pendingAckId remains in place as the second serialization layer between injection and transform-hook ack.

Validation

  • Reproduced prior duplicate-delivery behavior under frequent wake conditions
  • Two-turn injection tests (FIRST_OK / SECOND_OK) now show single delivery per message
  • Real OpenCode usage confirms duplicate turn injection is no longer present
  • Repository test suite passes (cargo test -q)

Notes

This PR is intentionally scoped to the delivery race fix only. The agent/model persistence follow-up is prepared as a separate stacked branch/PR.

Prevent duplicate promptAsync injections from concurrent wake paths by adding a synchronous deliveryInFlight guard before the first await and clearing it in finally.

Also include the follow-up indentation correction in deliverPendingToIdle.
Copilot AI review requested due to automatic review settings April 19, 2026 19:24
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR addresses duplicate hcom message injection in the OpenCode plugin by closing a TOCTOU race in deliverPendingToIdle() when multiple wake paths fire close together.

Changes:

  • Add a synchronous deliveryInFlight guard to prevent concurrent deliverPendingToIdle() executions before the first await.
  • Restructure deliverPendingToIdle() to set/reset the guard via try/finally and document the two-layer serialization (deliveryInFlight + pendingAckId).

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines 38 to 39
let permissionPending = false // Exact permission gate from OpenCode events

Copy link

Copilot AI Apr 19, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

deliveryInFlight is part of the plugin’s mutable session state, but it isn’t reset in the session.deleted handler where other state (instanceName, pendingAckId, etc.) is cleared. If a session is deleted while a delivery attempt is hung/in-flight, the next session could remain permanently blocked by a stale deliveryInFlight = true. Consider explicitly resetting deliveryInFlight = false alongside the other state resets (and/or when stopping the notify server).

Suggested change
let permissionPending = false // Exact permission gate from OpenCode events
let permissionPending = false // Exact permission gate from OpenCode events
function resetSessionState() {
instanceName = null
sessionId = null
bootstrapText = null
bindingPromise = null
lastReportedStatus = null
pendingAckId = null
deliveryInFlight = false
permissionPending = false
}

Copilot uses AI. Check for mistakes.
Comment on lines +127 to +131
pendingAckId = maxId
client.session.promptAsync({
path: { id: sid },
body: { parts: [{ type: "text", text: formatted }] },
} as any)
Copy link

Copilot AI Apr 19, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pendingAckId is set before calling client.session.promptAsync(...), but promptAsync is neither awaited nor error-handled. If promptAsync throws (sync) or returns a rejected promise, pendingAckId can remain set, blocking further deliveries and potentially causing a later transform to ack messages that were never actually injected. Consider wrapping promptAsync in a try/catch (and attaching a .catch(...) if it returns a promise) to clear pendingAckId on failure, or only setting pendingAckId after promptAsync has successfully queued the prompt.

Suggested change
pendingAckId = maxId
client.session.promptAsync({
path: { id: sid },
body: { parts: [{ type: "text", text: formatted }] },
} as any)
try {
const promptResult = client.session.promptAsync({
path: { id: sid },
body: { parts: [{ type: "text", text: formatted }] },
} as any)
pendingAckId = maxId
void Promise.resolve(promptResult).catch((error) => {
if (pendingAckId === maxId) pendingAckId = null
log("ERROR", "plugin.delivery_prompt_failed", instanceName, {
error: String(error),
pending_ack: maxId,
})
})
} catch (error) {
if (pendingAckId === maxId) pendingAckId = null
log("ERROR", "plugin.delivery_prompt_failed", instanceName, {
error: String(error),
pending_ack: maxId,
})
throw error
}

Copilot uses AI. Check for mistakes.
@mmkzer0
Copy link
Copy Markdown
Owner Author

mmkzer0 commented Apr 19, 2026

Superseded by upstream PR: aannoo#23

@mmkzer0 mmkzer0 closed this Apr 19, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants