format: make return.data optional by gnidan · Pull Request #211 · ethdebug/format

gnidan · 2026-04-16T07:15:08Z

Summary

Makes the data pointer optional on function return contexts so
compilers can annotate returns that have no observable value —
most importantly, TCO back-edge JUMPs — without emitting
misleading placeholder pointers.

Motivation (discovered in #210)

PR #210 adds TCO debug-context preservation to bugc. At the
tail-call-optimized back-edge JUMP, the intermediate return
value is not materialized on the stack — it would have become
the next iteration's argument, which the compiler has already
folded into the new call's setup. The schema previously required
data, forcing the compiler to emit a placeholder pointer to
stack slot 0. But at that instruction slot 0 is the new
iteration's first argument (or the return address), not the
return value. A debugger following the pointer would mislabel
unrelated stack content as the return value.

Making data optional is the clean fix, and it matches the
revert context's precedent: reason and panic are both
optional there on the same grounds ("a bare revert: {} is
permitted when the compiler knows a revert occurred but has no
further detail").

Other use cases unlocked

- Void functions: no return value to point at.
- Lost compiler precision: the compiler knows a return occurred
  but has dropped the value's location tracking.
- Optimized returns where the value lives in a register
  already consumed by the subsequent instruction.
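
From the consumer side, the cases above all reduce to "a return happened, but there is no pointer to follow". A minimal sketch of how a debugger might handle that, assuming a simplified return context shape (the `SimpleReturnContext` type and `describeReturn` helper are hypothetical and not part of the format packages):

```typescript
// Hypothetical, simplified shape for illustration only; the real
// return context type in the format package may differ.
type SimpleReturnContext = {
  return: {
    identifier?: string;
    data?: { pointer: unknown }; // optional: absent for void/TCO/lost-precision
  };
};

// Report a return without guessing at a value when `data` is absent.
function describeReturn(context: SimpleReturnContext): string {
  const name = context.return.identifier ?? "<anonymous>";
  if (context.return.data === undefined) {
    // Nothing to dereference: do not fall back to a stack slot.
    return `${name} returned (value not observable)`;
  }
  const location = JSON.stringify(context.return.data.pointer);
  return `${name} returned; value at ${location}`;
}
```

The point of the sketch is that the absent-`data` branch never touches the stack, which is the behavior the placeholder pointer made impossible.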

Changes

- `schemas/program/context/function/return.schema.yaml`: drop
  `data` from `required`, expand descriptions, add a no-data
  example (TCO back-edge).
- `packages/format/src/types/program/context.ts`:
  `data?: Function.PointerRef` and adjust the `isInfo` guard.
- `packages/web/spec/program/context/function/return.mdx`:
  rewrite "Field optionality" section covering void, TCO, and
  lost-precision cases.
- `packages/bugc/src/evmgen/call-contexts.test.ts`: add a
  non-null assertion at the one test site asserting on emitted
  `data` (the compiler does emit it there; the assertion just
  satisfies the narrower type).
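
For readers who want a feel for the type-level change, here is a rough sketch of an optional `data` field and a relaxed guard. The names `ReturnInfoSketch`, `PointerRefSketch`, and `isReturnInfoSketch` are stand-ins invented for this example; the actual `Function.PointerRef` type and `isInfo` guard in `context.ts` may be structured differently.

```typescript
// Stand-in types (assumptions): not the definitions from context.ts.
type PointerRefSketch = { pointer: unknown };

interface ReturnInfoSketch {
  identifier?: string;
  declaration?: unknown;
  data?: PointerRefSketch; // previously required, now optional
}

function isPointerRefSketch(value: unknown): value is PointerRefSketch {
  return typeof value === "object" && value !== null && "pointer" in value;
}

// Relaxed guard: a missing `data` is accepted, but a present `data`
// must still be a well-formed pointer reference.
function isReturnInfoSketch(value: unknown): value is ReturnInfoSketch {
  if (typeof value !== "object" || value === null) return false;
  const candidate = value as Record<string, unknown>;
  if (
    candidate.identifier !== undefined &&
    typeof candidate.identifier !== "string"
  ) {
    return false;
  }
  return candidate.data === undefined || isPointerRefSketch(candidate.data);
}

// Call sites that know `data` was emitted can narrow explicitly,
// which is what the non-null assertion in the test does.
const emitted: ReturnInfoSketch = {
  identifier: "count",
  data: { pointer: { location: "stack", slot: 0 } },
};
const emittedData: PointerRefSketch = emitted.data!;
```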

Downstream

Unblocks #210: once this lands, the compiler can drop the
stack-slot-0 placeholder and emit a bare
return: { identifier, declaration? } at TCO back-edges.

Test plan

- `yarn build` passes
- `yarn test` passes (942 tests, same as main)
- `yarn lint` clean (only pre-existing warnings)
- Schema guard tests exercise both the new no-data example
  and the existing data-bearing examples

The data pointer on function return contexts was required, forcing compilers to emit placeholder pointers at instructions where no return value is observable. The common motivating case is a tail-call-optimized back-edge JUMP, where the intermediate return value is not materialized on the stack (it would have become the next iteration's argument, which the compiler has already folded into the new call's setup). A placeholder pointer at such an instruction would mislabel unrelated stack content as the return value.

Make data optional, matching the revert context's precedent (reason and panic are both optional there for similar reasons). A bare `return: {}` is now permitted, analogous to bare `revert: {}`.

This also accommodates void functions (no return value to point at) and lost-compiler-precision cases where the compiler knows a return occurred but has dropped the value's location.

Updates:
- return.schema.yaml: drop data from required, expand descriptions, add a no-data example.
- context.ts: data?: Function.PointerRef and adjust guard.
- return.mdx: rewrite Field optionality section covering void/TCO/lost-precision cases.
- call-contexts.test.ts: use non-null assertion where the test asserts the compiler did emit data.

github-actions · 2026-04-16T07:19:27Z

PR Preview Action v1.8.1
Preview removed because the pull request was closed.
2026-04-16 07:26 UTC

Per the format change in #211 making `return.data` optional, the TCO back-edge JUMP now emits a bare return context (identifier + declaration only). The stack-slot-0 placeholder was semantically wrong anyway — that slot holds the new iteration's first argument, not the previous iteration's return value. TCO doesn't materialize the intermediate return value at all; the actual return happens at the function's terminal RETURN.

* bugc: verify optimizer preserves invoke/return contexts

  Adds a behavioral test suite that compiles a set of source patterns at every optimization level (0, 1, 2, 3) and:
  - asserts the bytecode still runs correctly end-to-end
  - counts invoke/return contexts by instruction type and function identifier, then asserts the expected shape

  Covers every pass that could touch call sites or returns:
  - L1: constant folding, propagation, DCE
  - L2: CSE, TCO, jump optimization
  - L3: block merging, return merging, R/W merging

  Confirms that only tail call optimization eliminates contexts (by design: the tail call becomes a jump). All other transformations preserve invoke/return contexts across levels for simple calls, nested calls, mutual recursion, non-tail self-recursion, and multi-path returns of the same value.

  This is groundwork for the transform context spec.

* bugc: preserve invoke context through tail call optimization

  TCO replaces a tail-recursive call terminator with a jump to the function's loop header. Previously this dropped the invoke debug context, so the recursive call became invisible to debuggers: a deeply recursive program looked like one giant loop with no logical call stack.

  Now the TCO pass records a TailCall metadata block on the replacement jump terminator, and codegen attaches an invoke debug context to the generated JUMP. The context mirrors the normal caller-JUMP invoke: identity + declaration + code target, no argument pointers. patchInvokeTarget resolves the placeholder code offset from the function registry the same way it does for regular calls.

  No matching return context is emitted for the TCO'd call: the tail call folds into the outer activation's return, and a future transform: tailcall marker will let the debugger reconcile the missing return when the outer function eventually returns and pops all accumulated tail frames at once.

  Updates the optimizer-contexts test suite to assert the preserved invoke is present at levels 2 and 3, and that the return context intentionally does not duplicate.

* bugc: pair invoke with return on TCO back-edge JUMP

  Refines the TCO debug-context fix: the back-edge JUMP now carries a gather context with BOTH the previous iteration's return and the new iteration's invoke. Depth stays constant across the JUMP: one frame pops, one pushes, on the same instruction. The function's terminal RETURN then pops the final iteration's frame normally.

  This models source-level semantics rather than the optimized control flow: the debugger's logical call stack matches what the programmer wrote, and transform: tailcall markers (future work) can annotate these JUMPs as TCO-produced.

  Also fixes patchInvokeTarget to walk into gather contexts so the invoke leaf's placeholder code offset gets resolved from the function registry. Test helper countCallSites updated to unwrap gather contexts and count (invoke, return) pairs on JUMPs separately from the traditional JUMPDEST buckets.

* bugc: drop placeholder return.data at TCO back-edges

  Per the format change in #211 making `return.data` optional, the TCO back-edge JUMP now emits a bare return context (identifier + declaration only). The stack-slot-0 placeholder was semantically wrong anyway: that slot holds the new iteration's first argument, not the previous iteration's return value. TCO doesn't materialize the intermediate return value at all; the actual return happens at the function's terminal RETURN.
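
To make the back-edge shape concrete, the following is an illustrative sketch of a gather context that pairs a bare return with an invoke, plus a helper that unwraps gather contexts to detect such a pair. The `Ctx` type, `collectLeaves`, and `isInvokeReturnPair` are hypothetical simplifications written for this example; they are not bugc's `countCallSites` helper, and the real context types carry more fields.

```typescript
// Simplified, assumed context shapes for illustration only.
type Ctx =
  | { gather: Ctx[] }
  | { invoke: { identifier?: string } }
  | { return: { identifier?: string; data?: unknown } };

// Flatten nested gather contexts into their leaf contexts.
function collectLeaves(context: Ctx): Ctx[] {
  if (!("gather" in context)) return [context];
  const result: Ctx[] = [];
  for (const child of context.gather) {
    for (const leaf of collectLeaves(child)) result.push(leaf);
  }
  return result;
}

// True when one instruction's context holds both an invoke and a return,
// as described for the TCO back-edge JUMP.
function isInvokeReturnPair(context: Ctx): boolean {
  const found = collectLeaves(context);
  const hasInvoke = found.some((c) => "invoke" in c);
  const hasReturn = found.some((c) => "return" in c);
  return hasInvoke && hasReturn;
}

// Back-edge JUMP: previous iteration returns (no data), next one is invoked.
const backEdge: Ctx = {
  gather: [
    { return: { identifier: "count" } },
    { invoke: { identifier: "count" } },
  ],
};
```

With this shape, `isInvokeReturnPair(backEdge)` holds while a plain `{ invoke: ... }` context on a regular call does not, which mirrors counting the paired JUMPs separately from ordinary call sites.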

Internal calls via JUMP normally carry a code pointer to the callee's entry point. When the compiler inlines a function, the JUMP is elided: there is no physical call instruction and no code target to point at. The callee identity (identifier, declaration, type) remains meaningful, but the target pointer does not.

Same pattern as #211 (making return.data optional).

Unblocks inlining: bugc can emit invoke contexts on inlined first instructions without fabricating a target pointer.

- Schema: drop target from InternalCall.required, expand description, add worked example for inlined case
- TS types: mark target optional; guard relaxed
- Spec page: document optionality and point at transform + gather for inlining annotation
- bugc: guard target access in patchInvokeInContext; tests assert target defined before dereferencing
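
A rough sketch of what "guard target access" can look like when patching placeholder offsets: walk into gather contexts, and skip any invoke that has no `target`. Everything here (`DebugContext`, `InvokeInfo`, `patchInvokeTargets`, and the plain-object registry) is an assumption made for illustration; bugc's actual `patchInvokeInContext` and function registry are not reproduced here.

```typescript
// Hypothetical, simplified shapes (assumptions, not the published schema).
type CodeTarget = { pointer: { location: "code"; offset: number } };
type InvokeInfo = { identifier?: string; target?: CodeTarget };
type DebugContext =
  | { gather: DebugContext[] }
  | { invoke: InvokeInfo }
  | { return: { identifier?: string } };

// Resolve placeholder offsets from a name -> offset registry, walking into
// gather contexts. Returns the number of targets patched.
function patchInvokeTargets(
  context: DebugContext,
  registry: Record<string, number>
): number {
  if ("gather" in context) {
    let patched = 0;
    for (const child of context.gather) {
      patched += patchInvokeTargets(child, registry);
    }
    return patched;
  }
  if (!("invoke" in context)) return 0;
  const { identifier, target } = context.invoke;
  // Inlined calls carry no target: nothing to patch, nothing to fabricate.
  if (target === undefined || identifier === undefined) return 0;
  const offset = registry[identifier];
  if (offset === undefined) return 0;
  target.pointer.offset = offset;
  return 1;
}

// An inlined call (no target) and a regular call with a placeholder offset.
const inlinedCall: InvokeInfo = { identifier: "add" };
const regularCall: InvokeInfo = {
  identifier: "count",
  target: { pointer: { location: "code", offset: 0 } },
};
```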

gnidan merged commit e8c0351 into main Apr 16, 2026
4 checks passed

gnidan deleted the architect-return-data-optional branch April 16, 2026 07:21

gnidan mentioned this pull request Apr 16, 2026

format: make invoke.target optional for internal calls #213

Open
