format: make return.data optional #211

Merged: gnidan merged 1 commit into main from architect-return-data-optional on Apr 16, 2026

@gnidan gnidan commented Apr 16, 2026

Summary

Makes the data pointer optional on function return contexts so
compilers can annotate returns that have no observable value,
most importantly tail-call-optimization (TCO) back-edge JUMPs,
without emitting misleading placeholder pointers.

Motivation (discovered in #210)

PR #210 adds TCO debug-context preservation to bugc. At the
tail-call-optimized back-edge JUMP, the intermediate return
value is not materialized on the stack — it would have become
the next iteration's argument, which the compiler has already
folded into the new call's setup. The schema previously required
data, forcing the compiler to emit a placeholder pointer to
stack slot 0. But at that instruction slot 0 is the new
iteration's first argument (or the return address), not the
return value. A debugger following the pointer would mislabel
unrelated stack content as the return value.
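
To make the mislabeling concrete, here is a minimal sketch that models the stack at a TCO back-edge for a hypothetical tail-recursive `sum(n, acc)`. The stack layout, the `readSlot` helper, and the values are illustrative assumptions, not bugc's actual frame layout.

```typescript
// Hypothetical model: index 0 is the top of the stack ("slot 0").
type Stack = readonly number[];

// Resolve what a "stack slot N" pointer would read at this instruction.
function readSlot(stack: Stack, slot: number): number | undefined {
  return stack[slot];
}

// Assumed state at the back-edge JUMP after the first iteration of
// sum(3, 0): the compiler has already set up the next call's arguments
// (n = 2, acc = 3). No slot holds the previous iteration's return value.
const stackAtBackEdge: Stack = [2, 3];

// A placeholder pointer to slot 0 resolves to the next iteration's first
// argument, which a debugger would then present as the return value.
const mislabeled = readSlot(stackAtBackEdge, 0);
console.log(`placeholder pointer resolves to ${mislabeled}`);
```

In this model the placeholder yields the next iteration's `n`, which is exactly the kind of unrelated stack content the paragraph above warns about.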

Making data optional is the clean fix, and it matches the
revert context's precedent: reason and panic are both
optional there on the same grounds ("a bare revert: {} is
permitted when the compiler knows a revert occurred but has no
further detail").

Other use cases unlocked

  • Void functions — no return value to point at.
  • Lost compiler precision — compiler knows a return occurred
    but has dropped the value's location tracking.
  • Optimized returns where the value lives in a register
    already consumed by the subsequent instruction.
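
A debugger consuming these contexts could treat all of the cases above the same way: when `data` is absent, report that a return happened without claiming a value. The following is only a sketch under assumed, simplified shapes; `ReturnInfo`, `describeReturn`, and the `resolve` callback are hypothetical names, and which fields other than `data` are optional is an assumption here.

```typescript
// Hypothetical, simplified shapes; the real types carry more fields.
interface PointerRef {
  pointer: unknown;
}

interface ReturnInfo {
  identifier?: string;
  data?: PointerRef; // optional: may be absent for void, TCO, lost precision
}

// Produce a human-readable description of a return context.
function describeReturn(
  info: ReturnInfo,
  resolve: (ref: PointerRef) => string
): string {
  const name = info.identifier ?? "<anonymous>";
  if (info.data === undefined) {
    // No pointer: report the return without inventing a value.
    return `${name} returned (value not observable here)`;
  }
  return `${name} returned ${resolve(info.data)}`;
}

console.log(describeReturn({ identifier: "sum" }, () => "unused"));
```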

Changes

  • schemas/program/context/function/return.schema.yaml: drop
    data from required, expand descriptions, add a no-data
    example (TCO back-edge).
  • packages/format/src/types/program/context.ts: change the field
    to data?: Function.PointerRef and adjust the isInfo guard.
  • packages/web/spec/program/context/function/return.mdx:
    rewrite the "Field optionality" section to cover the void, TCO,
    and lost-precision cases.
  • packages/bugc/src/evmgen/call-contexts.test.ts: add a
    non-null assertion at the one test site asserting on emitted
    data (the compiler does emit it there; the assertion just
    satisfies the type checker now that data may be undefined).
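
The type and guard changes listed above might look roughly like the sketch below. This is not the actual content of `context.ts`: `PointerRef`, `ReturnInfo`, and this `isInfo` are simplified stand-ins, and the pointer value in the usage lines is purely illustrative.

```typescript
interface PointerRef {
  pointer: unknown;
}

interface ReturnInfo {
  identifier?: string;
  data?: PointerRef; // previously required
}

const isPointerRef = (value: unknown): value is PointerRef =>
  typeof value === "object" && value !== null && "pointer" in value;

// Relaxed guard: data may be absent, but if present it must be valid.
function isInfo(value: unknown): value is ReturnInfo {
  if (typeof value !== "object" || value === null) return false;
  const candidate = value as { identifier?: unknown; data?: unknown };
  const identifierOk =
    candidate.identifier === undefined ||
    typeof candidate.identifier === "string";
  const dataOk =
    candidate.data === undefined || isPointerRef(candidate.data);
  return identifierOk && dataOk;
}

// At a site where the compiler is known to emit data, a non-null
// assertion satisfies the type checker.
const emitted: ReturnInfo = {
  identifier: "f",
  data: { pointer: { location: "stack", slot: 0 } },
};
const emittedPointer = emitted.data!.pointer;
console.log(isInfo(emitted), emittedPointer !== undefined);
```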

Downstream

Unblocks #210: once this lands, the compiler can drop the
stack-slot-0 placeholder and emit a bare
return: { identifier, declaration? } at TCO back-edges.

Test plan

  • yarn build passes
  • yarn test passes (942 tests, same as main)
  • yarn lint clean (only pre-existing warnings)
  • Schema guard tests exercise both the new no-data example
    and the existing data-bearing examples

The data pointer on function return contexts was required,
forcing compilers to emit placeholder pointers at instructions
where no return value is observable. The common motivating case
is a tail-call-optimized back-edge JUMP, where the intermediate
return value is not materialized on the stack (it would have
become the next iteration's argument, which the compiler has
already folded into the new call's setup). A placeholder
pointer at such an instruction would mislabel unrelated stack
content as the return value.

Make data optional, matching the revert context's precedent
(reason and panic are both optional there for similar reasons).
A bare `return: {}` is now permitted, analogous to bare
`revert: {}`.

This also accommodates void functions (no return value to
point at) and lost-compiler-precision cases where the compiler
knows a return occurred but has dropped the value's location.

Updates:
- return.schema.yaml: drop data from required, expand
  descriptions, add a no-data example.
- context.ts: data?: Function.PointerRef and adjust guard.
- return.mdx: rewrite Field optionality section covering
  void/TCO/lost-precision cases.
- call-contexts.test.ts: use non-null assertion where the
  test asserts the compiler did emit data.

github-actions Bot commented Apr 16, 2026

PR Preview Action v1.8.1
Preview removed because the pull request was closed.
2026-04-16 07:26 UTC

@gnidan gnidan merged commit e8c0351 into main Apr 16, 2026
4 checks passed
@gnidan gnidan deleted the architect-return-data-optional branch April 16, 2026 07:21
gnidan added a commit that referenced this pull request Apr 16, 2026
Per the format change in #211 making `return.data` optional,
the TCO back-edge JUMP now emits a bare return context
(identifier + declaration only). The stack-slot-0
placeholder was semantically wrong anyway — that slot holds
the new iteration's first argument, not the previous
iteration's return value. TCO doesn't materialize the
intermediate return value at all; the actual return happens
at the function's terminal RETURN.
gnidan added a commit that referenced this pull request Apr 16, 2026
* bugc: verify optimizer preserves invoke/return contexts

Adds a behavioral test suite that compiles a set of source
patterns at every optimization level (0, 1, 2, 3) and:
  - asserts the bytecode still runs correctly end-to-end
  - counts invoke/return contexts by instruction type and
    function identifier, then asserts the expected shape

Covers every pass that could touch call sites or returns:
  L1: constant folding, propagation, DCE
  L2: CSE, TCO, jump optimization
  L3: block merging, return merging, R/W merging

Confirms that only tail call optimization eliminates
contexts (by design — the tail call becomes a jump). All
other transformations preserve invoke/return contexts
across levels for simple calls, nested calls, mutual
recursion, non-tail self-recursion, and multi-path returns
of the same value.

This is groundwork for the transform context spec.

* bugc: preserve invoke context through tail call optimization

TCO replaces a tail-recursive call terminator with a jump
to the function's loop header. Previously this dropped the
invoke debug context, so the recursive call became
invisible to debuggers — a deeply recursive program looked
like one giant loop with no logical call stack.

Now the TCO pass records a TailCall metadata block on the
replacement jump terminator, and codegen attaches an invoke
debug context to the generated JUMP. The context mirrors
the normal caller-JUMP invoke: identity + declaration +
code target, no argument pointers. patchInvokeTarget
resolves the placeholder code offset from the function
registry the same way it does for regular calls.

No matching return context is emitted for the TCO'd call —
the tail call folds into the outer activation's return, and
a future transform: tailcall marker will let the debugger
reconcile the missing return when the outer function
eventually returns and pops all accumulated tail frames at
once.

Updates the optimizer-contexts test suite to assert the
preserved invoke is present at levels 2 and 3, and that the
return context intentionally does not duplicate.

* bugc: pair invoke with return on TCO back-edge JUMP

Refines the TCO debug-context fix: the back-edge JUMP now
carries a gather context with BOTH the previous iteration's
return and the new iteration's invoke. Depth stays constant
across the JUMP — one frame pops, one pushes, on the same
instruction. The function's terminal RETURN then pops the
final iteration's frame normally.
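
The constant-depth claim can be checked with a small sketch. The `Context` union below is an assumed, stripped-down shape (only the `invoke`, `return`, and `gather` keys are taken from the text above), and `depthDelta` is a hypothetical helper, not part of bugc.

```typescript
// Assumed minimal shapes for the three context kinds discussed here.
type Context =
  | { invoke: { identifier?: string } }
  | { return: { identifier?: string } }
  | { gather: Context[] };

// Net change in logical call-stack depth caused by one instruction's
// context: invoke pushes a frame, return pops one, gather sums children.
function depthDelta(context: Context): number {
  if ("gather" in context) {
    return context.gather.reduce(
      (sum, child) => sum + depthDelta(child),
      0
    );
  }
  if ("invoke" in context) return 1;
  return -1;
}

// Back-edge JUMP: previous iteration's return plus new iteration's invoke.
const backEdgeContext: Context = {
  gather: [
    { return: { identifier: "count" } },
    { invoke: { identifier: "count" } },
  ],
};

console.log(depthDelta(backEdgeContext));
```

Under these assumptions the back-edge nets to zero, while the terminal RETURN (a lone return context) nets to minus one.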

This models source-level semantics rather than the
optimized control flow: the debugger's logical call stack
matches what the programmer wrote, and transform: tailcall
markers (future work) can annotate these JUMPs as
TCO-produced.

Also fixes patchInvokeTarget to walk into gather contexts
so the invoke leaf's placeholder code offset gets resolved
from the function registry.
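
A recursive walk of this kind could be sketched as follows. The real `patchInvokeTarget` signature is not shown in this PR, so the function name, the `Context` shapes, the `-1` placeholder, and the plain-object registry are all assumptions made for illustration.

```typescript
interface Invoke {
  identifier: string;
  target: { offset: number }; // placeholder until patched
}

type Context =
  | { invoke: Invoke }
  | { return: { identifier: string } }
  | { gather: Context[] };

// Hypothetical registry: function identifier -> entry-point code offset.
type Registry = Readonly<Record<string, number | undefined>>;

// Resolve invoke targets, descending into gather contexts so that
// nested invoke leaves are patched too.
function patchInvokeTargetSketch(context: Context, registry: Registry): void {
  if ("gather" in context) {
    for (const child of context.gather) {
      patchInvokeTargetSketch(child, registry);
    }
    return;
  }
  if ("invoke" in context) {
    const offset = registry[context.invoke.identifier];
    if (offset !== undefined) context.invoke.target.offset = offset;
  }
}

const nestedInvoke: Invoke = { identifier: "count", target: { offset: -1 } };
const gathered: Context = {
  gather: [{ return: { identifier: "count" } }, { invoke: nestedInvoke }],
};
patchInvokeTargetSketch(gathered, { count: 42 });
console.log(nestedInvoke.target.offset);
```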

Test helper countCallSites updated to unwrap gather
contexts and count (invoke, return) pairs on JUMPs
separately from the traditional JUMPDEST buckets.
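
One way such unwrapping and counting might work is sketched below. This is not the actual `countCallSites` helper; `Instruction`, `leaves`, and `countPairedJumps` are hypothetical names, and the context shapes are simplified assumptions.

```typescript
type Context =
  | { invoke: { identifier: string } }
  | { return: { identifier: string } }
  | { gather: Context[] };

interface Instruction {
  mnemonic: string;
  context?: Context;
}

// Flatten nested gather contexts into their leaf contexts.
function leaves(context: Context): Context[] {
  if (!("gather" in context)) return [context];
  return context.gather.reduce<Context[]>(
    (acc, child) => acc.concat(leaves(child)),
    []
  );
}

// Count JUMPs whose context carries both an invoke and a return leaf.
function countPairedJumps(instructions: Instruction[]): number {
  let pairs = 0;
  for (const instruction of instructions) {
    if (instruction.mnemonic !== "JUMP") continue;
    if (instruction.context === undefined) continue;
    const flat = leaves(instruction.context);
    const hasInvoke = flat.some((leaf) => "invoke" in leaf);
    const hasReturn = flat.some((leaf) => "return" in leaf);
    if (hasInvoke && hasReturn) pairs += 1;
  }
  return pairs;
}

const program: Instruction[] = [
  { mnemonic: "JUMPDEST", context: { invoke: { identifier: "count" } } },
  {
    mnemonic: "JUMP",
    context: {
      gather: [
        { return: { identifier: "count" } },
        { invoke: { identifier: "count" } },
      ],
    },
  },
  { mnemonic: "JUMP" },
];
console.log(countPairedJumps(program));
```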

* bugc: drop placeholder return.data at TCO back-edges

Per the format change in #211 making `return.data` optional,
the TCO back-edge JUMP now emits a bare return context
(identifier + declaration only). The stack-slot-0
placeholder was semantically wrong anyway — that slot holds
the new iteration's first argument, not the previous
iteration's return value. TCO doesn't materialize the
intermediate return value at all; the actual return happens
at the function's terminal RETURN.
gnidan added a commit that referenced this pull request Apr 16, 2026
Internal calls via JUMP normally carry a code pointer to the
callee's entry point. When the compiler inlines a function,
the JUMP is elided — there is no physical call instruction
and no code target to point at. The callee identity
(identifier, declaration, type) remains meaningful, but the
target pointer does not.

Same pattern as #211 (making return.data optional). Unblocks
inlining: bugc can emit invoke contexts on inlined first
instructions without fabricating a target pointer.

- Schema: drop target from InternalCall.required, expand
  description, add worked example for inlined case
- TS types: mark target optional; guard relaxed
- Spec page: document optionality and point at transform +
  gather for inlining annotation
- bugc: guard target access in patchInvokeInContext; tests
  assert target defined before dereferencing
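
Guarding the target access could look something like the sketch below. `InternalCall` here is a simplified stand-in for the schema type, and `patchInternalCall` and the plain-object registry are hypothetical names chosen for illustration.

```typescript
interface InternalCall {
  identifier: string;
  target?: { offset: number }; // absent when the call was inlined
}

// Hypothetical registry: function identifier -> entry-point code offset.
type Registry = Readonly<Record<string, number | undefined>>;

// Patch the code target if there is one; report whether anything changed.
function patchInternalCall(call: InternalCall, registry: Registry): boolean {
  if (call.target === undefined) return false; // inlined: nothing to patch
  const offset = registry[call.identifier];
  if (offset === undefined) return false; // unknown function: leave as-is
  call.target.offset = offset;
  return true;
}

const regularCall: InternalCall = { identifier: "add", target: { offset: -1 } };
const inlinedCall: InternalCall = { identifier: "add" };
const offsets: Registry = { add: 128 };

console.log(patchInternalCall(regularCall, offsets));
console.log(patchInternalCall(inlinedCall, offsets));
```

The inlined call keeps its callee identity but is skipped by the patcher, so no target pointer has to be fabricated.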