From 9d634e9edae2f3b8db7ca7995caa4361d820297e Mon Sep 17 00:00:00 2001 From: Brian Yahn Date: Fri, 15 May 2026 18:25:44 +0000 Subject: [PATCH 01/45] branch_gap_triage: AST-structural modality bucketing Evolves triage from "group dark arms by method" to "classify each dark arm by which testing modality can reach it". Four buckets: fuzz_axis valid program, unseen shape (case-on-AST, &&/||, live if/while body) -> one fuzz axis covers a family + a mutant negative_spec the arm raises/diagnoses -> invalid-program only; fuzz cannot reach it by construction ffi_integration extern/require/module boundary -> needs a real external artifact a fuzzer can't synthesize accept_defensive narrow inert residue (synthetic else, empty, nil) -> annotate + accept; human-confirmed, never auto-accepts a reachable arm Classification is AST-structural, NOT a regex over the arm line (the rejected fake-value grep): the SimpleCov parent tuple gives the decision kind, and the arm's (line,col) span is matched to an AST node whose subtree is inspected for raise/FFI. Two PER-PROJECT LEXICON constants (FFI boundary methods, diagnostic message names) are the only project-specific knobs -- the algorithm generalizes, swap the lexicon per codebase. Result over the 3 lowering files: fuzz_axis 590, accept_defensive 296, ffi_integration 53, negative_spec 16. This is the work plan: not one fuzz test, not tons of unit tests, not an integration suite -- overwhelmingly fuzz axes, a bounded FFI .cht set, a tiny negative-spec set, a human-confirmed accept residue. Lives entirely in the coverage tool; decomplex untouched and stays static/zero-runtime (boundary preserved). Co-Authored-By: Claude Opus 4.7 --- tools/branch_gap_triage.rb | 217 ++++++++++++++++++++++++++++++++----- 1 file changed, 191 insertions(+), 26 deletions(-) diff --git a/tools/branch_gap_triage.rb b/tools/branch_gap_triage.rb index 784e96ee9..fdbbd56eb 100644 --- a/tools/branch_gap_triage.rb +++ b/tools/branch_gap_triage.rb @@ -1,26 +1,33 @@ #! /usr/bin/env ruby -# Branch-gap TRIAGE: collapse never-taken arms to their enclosing method. +# Branch-gap TRIAGE + MODALITY BUCKETING. # -# You do not triage 955 branches. You triage the ~N methods that contain -# them. A never-taken arm's fill-modality is a property of the DECISION -# the enclosing method makes, not of the arm: +# You do not triage N branches. You triage the ~M methods that contain +# them, and for each dark arm you decide WHICH testing modality can +# reach it. A never-taken arm is exactly one of four things: # -# - a dark arm in an escape / frame / cleanup / move decision is a -# latent UAF / double-free / leak, reachable only by a VALID program -# of some shape the corpus never wrote -> fuzz template axis (scales -# combinatorially; one axis value covers a whole arm family) + mutant. -# - a dark arm in a diagnostic / error builder is reachable only by an -# INVALID program. Fuzz emits valid self-checking programs by -# construction, so fuzz can NEVER reach it -> negative unit spec -# (deterministic, one per error cluster). -# - a dark arm guarding an impossible / defensive case (raise -# unreachable, exhaustive-when else) -> accept + annotate. No test. -# - whole-program integration .cht almost never fills a branch gap: -# 92 real programs moved this set 50/1005 arms. Not the tool. +# fuzz-axis reachable by a VALID program of an unseen shape +# (case-on-AST dispatch, &&/|| clause gap, a live +# if/while body). One fuzz template axis covers a +# whole arm family combinatorially. + a mutant. +# negative-spec the arm raises / diagnoses -> reachable only by an +# INVALID program. Fuzz emits valid self-checking +# programs by construction and can NEVER reach it. +# One deterministic negative unit spec per cluster. +# ffi-integration the arm is in the extern/require/module boundary +# -> needs a real external artifact a fuzzer cannot +# synthesize. A handful of targeted .cht. (Whole- +# program .cht is otherwise the WRONG lever: 92 real +# programs moved this set 50/1005 arms.) +# accept-defensive an effect-free else / impossible guard -> annotate +# and remove from the denominator. No test. (Human +# confirms; this bucket is proposed, not decided.) # -# This script does the collapse and the ranking. It does NOT classify by -# regexing arm source lines (that is the fake-value grep) — it groups by -# enclosing `def` so a human reads the DECISION, then assigns modality. +# Classification is AST-STRUCTURAL, never a regex over the arm's source +# line (that is the fake-value grep): the SimpleCov parent tuple gives +# the decision kind, and the arm's (line,col)-(line,col) span is +# matched to an AST node whose subtree is then inspected for raise / +# FFI. The two PER-PROJECT LEXICON constants below are the only +# project-specific knobs (generalizable: swap them per codebase). # # Usage: ruby tools/branch_gap_triage.rb [src/file.rb ...] @@ -34,6 +41,18 @@ src/mir/mir_lowering.rb ].freeze +# --- PER-PROJECT LEXICON (the only project-specific config) --- +# Methods that ARE the FFI / package boundary: a dark arm here needs a +# real external module + oracle no fuzzer can synthesize. +FFI_BOUNDARY = %w[ + build_extern_trampoline_call build_extern_trampoline_method + build_extern_trampoline_common lower_extern_direct_call + lower_require lower_module +].freeze +# Message names that mean "this arm is an error/diagnostic path" +# (reachable only by an invalid program). +DIAGNOSTIC_MIDS = %i[raise fail abort].freeze + abort "no #{RESULTSET}" unless File.exist?(RESULTSET) merged = Hash.new @@ -68,24 +87,158 @@ def method_index(lines) idx end +# All AST nodes of a file, for span -> node resolution. +def ast_nodes(abspath) + root = RubyVM::AbstractSyntaxTree.parse(File.read(abspath), + keep_script_lines: true) + acc = [] + walk = lambda do |n| + return unless n.is_a?(RubyVM::AbstractSyntaxTree::Node) + + acc << n + n.children.each { |c| walk.call(c) } + end + walk.call(root) + acc +rescue SyntaxError, StandardError + [] +end + +# Smallest AST node whose span covers the arm span (sl,sc)-(el,ec); +# prefers an exact match. nil if none (then we fall back to the +# decision kind alone, flagged low-confidence). +def node_for(nodes, sl, sc, el, ec) + span = ->(n) { [n.first_lineno, n.first_column, n.last_lineno, n.last_column] } + exact = nodes.find { |n| span.call(n) == [sl, sc, el, ec] } + return exact if exact + + covering = nodes.select do |n| + a = span.call(n) + (a[0] < sl || (a[0] == sl && a[1] <= sc)) && + (a[2] > el || (a[2] == el && a[3] >= ec)) + end + covering.min_by { |n| (n.last_lineno - n.first_lineno) * 1000 + n.children.size } +end + +def subtree_calls(node) + mids = [] + stack = [node] + until stack.empty? + n = stack.pop + next unless n.is_a?(RubyVM::AbstractSyntaxTree::Node) + + case n.type + when :FCALL, :VCALL then mids << n.children[0] + when :CALL, :OPCALL, :QCALL then mids << n.children[1] + end + n.children.each { |c| stack << c } + end + mids +end + +# accept-defensive is the NARROW residue: an arm that produces no +# observable outcome -- the synthetic implicit `else` SimpleCov still +# counts, an empty body, a bare `nil`. Anything that calls, assigns, +# returns/breaks, or yields a value IS a reachable valid-program +# decision outcome and defaults to fuzz_axis (human triage may later +# demote a genuinely-impossible one; we never auto-accept a reachable +# arm). +def trivial?(node) + return true if node.nil? + return true if node.type == :NIL + return true if node.type == :BEGIN && node.children.compact.empty? + return false if subtree_calls(node).any? + return false if has_type?(node, %i[LASGN IASGN OP_ASGN ATTRASGN MASGN + GASGN CVASGN RETURN NEXT BREAK YIELD]) + + # a bare value (literal / lvar / ivar) IS the branch's outcome -> + # reachable, not inert. + !has_type?(node, %i[LIT STR SYM INTEGER FLOAT LVAR IVAR DVAR CONST + ARRAY HASH TRUE FALSE]) +end + +def has_type?(node, types) + stack = [node] + until stack.empty? + n = stack.pop + next unless n.is_a?(RubyVM::AbstractSyntaxTree::Node) + return true if types.include?(n.type) + + n.children.each { |c| stack << c } + end + false +end + +DISPATCH_KINDS = %i[case when].freeze +CONJ_KINDS = %i[& |].freeze +COND_KINDS = %i[if unless ternary while until for].freeze + +# decision_kind: Symbol from the SimpleCov parent tuple ([:if,...] etc). +# Returns [bucket, confidence]. +def classify(method_name, decision_kind, arm_node) + return [:ffi_integration, :high] if FFI_BOUNDARY.include?(method_name) + + if arm_node && (subtree_calls(arm_node) & DIAGNOSTIC_MIDS).any? + return [:negative_spec, :high] + end + + if DISPATCH_KINDS.include?(decision_kind) || CONJ_KINDS.include?(decision_kind) + return [:fuzz_axis, arm_node ? :high : :low] + end + + if COND_KINDS.include?(decision_kind) + return [:accept_defensive, :med] if trivial?(arm_node) + + return [:fuzz_axis, arm_node ? :high : :low] + end + + [:accept_defensive, :low] +end + +ACTION = { + fuzz_axis: 'fuzz template axis (+ mutant)', + negative_spec: 'negative unit spec (fuzz cannot reach)', + ffi_integration: 'targeted FFI/package .cht', + accept_defensive: 'annotate + accept (human-confirm)' +}.freeze + targets = ARGV.empty? ? DEFAULT_FILES : ARGV +grand = Hash.new(0) + targets.each do |rel| abspath = File.join(ROOT, rel) branches = merged[abspath] next unless branches + lines = File.readlines(abspath) midx = method_index(lines) + nodes = ast_nodes(abspath) by_method = Hash.new { |h, k| h[k] = [] } total_by_method = Hash.new(0) - branches.each do |_p, arms| + bucket_by_method = Hash.new { |h, k| h[k] = Hash.new(0) } + file_bucket = Hash.new(0) + + branches.each do |parent, arms| + pkind = parent.gsub(/[\[\]:\s]/, '').split(',').first.to_s.to_sym arms.each do |arm, count| a = arm.gsub(/[\[\]:]/, '').split(',').map(&:strip) line = a[2].to_i meth, mstart = midx[line] || ['(top-level)', 0] key = [meth, mstart] total_by_method[key] += 1 - by_method[key] << [line, a[0]] if count.to_i.zero? + next unless count.to_i.zero? + + sl = a[2].to_i + sc = a[3].to_i + el = a[4].to_i + ec = a[5].to_i + anode = node_for(nodes, sl, sc, el, ec) + bucket, conf = classify(meth, pkind, anode) + by_method[key] << [line, a[0], bucket, conf] + bucket_by_method[key][bucket] += 1 + file_bucket[bucket] += 1 + grand[bucket] += 1 end end @@ -93,12 +246,24 @@ def method_index(lines) .sort_by { |(_, _), v| -v.size } puts "\n##### #{rel} — #{ranked.size} methods carry dark arms " \ "(#{by_method.values.sum(&:size)} arms)" - puts format(' %-42s %5s %5s %s', 'method', 'dark', 'tot', 'dark-arm lines') + puts ' buckets: ' + file_bucket.sort_by { |_, n| -n } + .map { |b, n| "#{b}=#{n}" }.join(' ') + puts format(' %-40s %4s %4s %-16s %s', + 'method', 'dark', 'tot', 'dominant', 'bucket mix') ranked.each do |(meth, mstart), arms| tot = total_by_method[[meth, mstart]] - ls = arms.map(&:first).uniq.sort - shown = ls.first(12).join(',') - shown += ",+#{ls.size - 12}" if ls.size > 12 - puts format(' %-42s %5d %5d %s', "#{meth}@#{mstart}", arms.size, tot, shown) + mix = bucket_by_method[[meth, mstart]] + dom = mix.max_by { |_, n| n }.first + mixs = mix.sort_by { |_, n| -n }.map { |b, n| "#{b}:#{n}" }.join(' ') + puts format(' %-40s %4d %4d %-16s %s', + "#{meth}@#{mstart}", arms.size, tot, dom, mixs) end end + +puts "\n##### MODALITY WORK PLAN (all targets)" +grand.sort_by { |_, n| -n }.each do |bucket, n| + puts format(' %-18s %5d arms -> %s', bucket, n, ACTION[bucket]) +end +puts "\n Triage order: fuzz_axis (combinatorial, memory-safety) first," \ + " then negative_spec, then ffi_integration; accept_defensive is" \ + " human-confirmed and leaves the denominator." From 05b3a954e134214e6d863d0eb90cb1a3bdc40e5c Mon Sep 17 00:00:00 2001 From: Brian Yahn Date: Fri, 15 May 2026 18:48:53 +0000 Subject: [PATCH 02/45] =?UTF-8?q?fuzz:=206=20mir=5Flowering=20matrices=20?= =?UTF-8?q?=E2=80=94=20surface=203=20bugs,=20measure=20coverage?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The 6 proposed fuzz_axis matrices, built and run. Result: BUGS (3 real, all OPEN — deliberately not fixed): catch_allocator_matrix surfaces B1: `r = maybe("") OR fbv` (frame fallback, heap success) -> Invalid free (invariant #9). catch_reassign_matrix surfaces B2 (leak: reassign-through-OR on success) and B3 (segfault: struct field fallback reads itself mid-cleanup -> UAF). All three are the catch/OR-rescue allocator-identity family — the exact P0 cluster branch_gap_triage flagged (infer_catch_value_ allocator 12/12 dark). The modality plan predicted it; the targeted matrices confirmed real memory-safety bugs there. Documented in docs/agents/fuzz-matrix-surfaced-bugs.md; the failing :pass cells are the live tickets. CLEAN: capability_wrap_matrix (3/3, +3 in_dev), match_matrix (6/6), indexed_assignment_matrix (20/20), binary_op_matrix (21/21) — after fixing two template-correctness bugs of mine (off-by-one list index; inverted string lt/gte oracle). These were my noise, not CLEAR bugs. COVERAGE: 68 cells moved mir_lowering branch coverage 673 -> 671 (2 arms). Verified real (COVERAGE=1 fuzz run writes a transpile-tests resultset entry with mir_lowering data; branch_gap_triage merges it). This reproduces the "92 programs -> 50 arms" result more starkly: feature-level fuzzing finds bugs well but is NOT a branch-closure lever — the dark arms need exact triggering type_info, and/or the fuzz_axis bucket is over-assigned vs reachability. Full analysis in the forensic doc. Co-Authored-By: Claude Opus 4.7 --- docs/agents/fuzz-matrix-surfaced-bugs.md | 85 ++++++++++++++ tools/fuzz/templates/binary_op_matrix.rb | 75 +++++++++++++ .../fuzz/templates/capability_wrap_matrix.rb | 58 ++++++++++ .../fuzz/templates/catch_allocator_matrix.rb | 86 ++++++++++++++ tools/fuzz/templates/catch_reassign_matrix.rb | 82 ++++++++++++++ .../templates/indexed_assignment_matrix.rb | 106 ++++++++++++++++++ tools/fuzz/templates/match_matrix.rb | 76 +++++++++++++ 7 files changed, 568 insertions(+) create mode 100644 docs/agents/fuzz-matrix-surfaced-bugs.md create mode 100644 tools/fuzz/templates/binary_op_matrix.rb create mode 100644 tools/fuzz/templates/capability_wrap_matrix.rb create mode 100644 tools/fuzz/templates/catch_allocator_matrix.rb create mode 100644 tools/fuzz/templates/catch_reassign_matrix.rb create mode 100644 tools/fuzz/templates/indexed_assignment_matrix.rb create mode 100644 tools/fuzz/templates/match_matrix.rb diff --git a/docs/agents/fuzz-matrix-surfaced-bugs.md b/docs/agents/fuzz-matrix-surfaced-bugs.md new file mode 100644 index 000000000..6a12ef235 --- /dev/null +++ b/docs/agents/fuzz-matrix-surfaced-bugs.md @@ -0,0 +1,85 @@ +# Bugs surfaced by the 6 mir_lowering fuzz matrices + +Status: OPEN. Not fixed (deliberately — the task was to surface, not +fix). Each is reproduced by a `:pass` fuzz cell that currently fails; +the red cell is the live ticket. + +All three are the **same family**: the catch / OR-rescue path +(`expr OR fallback`) mishandling allocator identity / cleanup across +the success vs error split. This is invariant #9 ("error paths +preserve allocator identity") and is exactly the decision +`branch_gap_triage` flagged as the P0 — `infer_catch_value_allocator` +was 12/12 dark and `lower_or_rescue` / `walk_catch_body_for_reassigns` +heavily fuzz_axis. The modality plan predicted this cluster; the +targeted matrices confirmed real bugs there. + +## B1 — invalid free: OR fallback is a frame value, success is heap +Template `catch_allocator_matrix`, cell +`{value: string, fallback: frame_var, taken: failure}`. + +``` +FN maybe(s: String) RETURNS !String -> + IF s.length() == 0_i64 THEN RAISE "empty"; END + RETURN COPY s; +END +FN main() RETURNS Void -> + fbv: String = "fb"; + r = maybe("") OR fbv; # raises -> r = fbv (frame String) + ASSERT r.length() >= 0_i64, "fallback value live"; + RETURN; +END +``` +`maybe("")` raises, so `r` takes the frame-allocated `fbv`. But the +OR-rescue lowering binds `r`'s cleanup to the success path's heap +allocator (`COPY s`). Scope-end frees a frame value with the heap +allocator → `thread panic: Invalid free`. + +## B2 — leak: reassign an outer binding through OR on the success path +Template `catch_reassign_matrix`, cell +`{var: local, value: string, taken: success}`. + +``` +MUTABLE acc = "init"; +acc = maybe("X") OR acc; # success -> acc = COPY-heap value +``` +The prior value of `acc` (or the new heap temp) is not cleaned across +the reassignment-through-OR; debug allocator reports leaked memory. + +## B3 — segfault: struct field reassigned from a fallible expr whose +fallback is the field itself +Template `catch_reassign_matrix`, cell +`{var: struct_field, value: string, taken: failure}`. + +``` +MUTABLE h = Holder{ acc: "init" }; +h.acc = maybe("") OR h.acc; # raises -> fallback reads h.acc while + # the reassignment is mid-cleanup +``` +`Segmentation fault` — use-after-free: the error path reads `h.acc` +for the fallback after the field's old value has been freed by the +reassignment cleanup. + +## Coverage note (the other half of the result) + +The 6 matrices (68 cells) moved mir_lowering branch coverage by **2 +arms (673 -> 671)** — essentially zero, despite exercising maps, +catch, match, capabilities, binary ops, and indexed assignment as +features. This reproduces, more starkly, the earlier "92 example +programs -> 50/1005 arms" result. Interpretation (both likely true): + +1. The dark arms need the *exact* triggering `type_info` shape + (the `dispatch_key x value_transforms x shard_direct` cross, the + `:dupe_borrowed_union` borrowed-union-into-map path, etc.), not + surface-level feature coverage. Feature fuzzing retreads + already-covered common arms. +2. The `fuzz_axis` bucket is likely over-assigned: many of those 590 + arms are closer to `accept_defensive` (reachable only by shapes a + valid program does not produce). The bucketer is a proposed + structural classification, not a verdict — this is the human- + confirm signal firing. + +Conclusion: feature-level fuzz matrices are high value for *finding +bugs* (3 real memory-safety bugs in the predicted P0 cluster) but, as +built, are NOT a branch-coverage-closure lever. Closing the branch gap +requires shape-specific cells driven off the actual dark `type_info`, +or re-triaging the fuzz_axis bucket against reachability. diff --git a/tools/fuzz/templates/binary_op_matrix.rb b/tools/fuzz/templates/binary_op_matrix.rb new file mode 100644 index 000000000..bd5bbcd4d --- /dev/null +++ b/tools/fuzz/templates/binary_op_matrix.rb @@ -0,0 +1,75 @@ +# Template: binary-operator lowering matrix. +# +# Targets src/mir/mir_lowering.rb#lower_binary_op + #lower_or_rescue. +# The dark arms are the operator x operand-type combinations the corpus +# never wrote: MOD on i64, string comparison (eql / strcmp branches), +# heap-string operands (the hoist_alloc path), concat, and the +# OR-rescue fallback. Operands come from a fn so constant-folding does +# not erase the decision. +# +# expected :pass; a failing/leaking :pass cell is a SURFACED bug. + +BOM_CELLS = [] +BOM_OPS = %i[eq neq lt gte mod concat or_fallback] +BOM_TYPES = %i[int float str_lit heap_str] + +BOM_OPS.each do |op| + BOM_TYPES.each do |t| + # MOD: integers only. concat / or_fallback: strings only. + next if op == :mod && t != :int + next if op == :concat && !%i[str_lit heap_str].include?(t) + next if op == :or_fallback && !%i[str_lit heap_str].include?(t) + # float ordering only exercises lt/gte/eq/neq. + next if t == :float && %i[concat or_fallback mod].include?(op) + BOM_CELLS << { op: op, type: t } + end +end + +# lhs is ALWAYS the strictly-smaller operand and rhs the larger, for +# every type, so the oracle is uniform: lt true, gte false, eq false, +# neq true. (MOD has its own dedicated operands below.) +def bom_provider(t) + case t + when :int then "FN lhs() RETURNS Int64 -> RETURN 3_i64; END\nFN rhs() RETURNS Int64 -> RETURN 10_i64; END" + when :float then "FN lhs() RETURNS Float64 -> RETURN 1.5; END\nFN rhs() RETURNS Float64 -> RETURN 2.5; END" + when :str_lit then "FN lhs() RETURNS String -> RETURN \"abc\"; END\nFN rhs() RETURNS String -> RETURN \"abd\"; END" + when :heap_str then "FN lhs() RETURNS !String -> RETURN COPY \"abc\"; END\nFN rhs() RETURNS !String -> RETURN COPY \"abd\"; END" + end +end + +def bom_lhs(t) = (t == :heap_str ? "(lhs())" : "lhs()") +def bom_rhs(t) = (t == :heap_str ? "(rhs())" : "rhs()") + +def bom_body(op, t) + l = bom_lhs(t) + r = bom_rhs(t) + case op + when :eq then " ASSERT (#{l} == #{r}) == FALSE, \"eq #{t}\";" + when :neq then " ASSERT (#{l} != #{r}), \"neq #{t}\";" + when :lt then " ASSERT (#{l} < #{r}), \"lt #{t}\";" + when :gte then " ASSERT (#{l} >= #{r}) == FALSE, \"gte #{t}\";" + when :mod then " ASSERT (10_i64 MOD 3_i64) == 1_i64, \"mod #{t}\";" + when :concat + " t: String = #{l} + #{r};\n ASSERT t.length() == 6_i64, \"concat #{t}\";" + when :or_fallback + # lhs() is non-fallible here; exercise the OR fallback shape with a + # fallible callee so lower_or_rescue lowers. + " v: String = mightFail() OR \"fb\";\n ASSERT v.length() >= 2_i64, \"or fallback #{t}\";" + end +end + +def bom_extra_fn(op) + return "" unless op == :or_fallback + + "FN mightFail() RETURNS !String ->\n RETURN COPY \"ok\";\nEND\n" +end + +FuzzGenerator.register(:binary_op_matrix, cells: BOM_CELLS) do |p| + <<~CHT + #{bom_provider(p[:type])} + #{bom_extra_fn(p[:op])}FN main() RETURNS Void -> + #{bom_body(p[:op], p[:type])} + RETURN; + END + CHT +end diff --git a/tools/fuzz/templates/capability_wrap_matrix.rb b/tools/fuzz/templates/capability_wrap_matrix.rb new file mode 100644 index 000000000..39da93083 --- /dev/null +++ b/tools/fuzz/templates/capability_wrap_matrix.rb @@ -0,0 +1,58 @@ +# Template: capability-wrap composition matrix. +# +# Targets src/mir/mir_lowering.rb#compose_capability_wrap + +# #lower_with_block + the cap path of #lower_var_decl. Dark arms = the +# sync_fn case (lockedCreate / rwLockedCreate / refCellCreate / +# versionedCreate) and the WITH access form per modality -- the corpus +# only ever declared @locked. Each cell declares a wrapped binding, +# enters the matching WITH, mutates/reads, asserts. +# +# Confirmed syntax (access_gate.rb): @locked/@writeLocked/@versioned + +# WITH EXCLUSIVE/SNAPSHOT. @alwaysMutable / @atomic / storage wraps are +# reserved :in_dev (WITH form not yet confirmed -- reserving matrix +# space, not emitting invalid noise). +# +# expected :pass; a failing/leaking :pass cell is a SURFACED bug. + +CWM_CELLS = [] +# [sync, with_head, expected] +CWM_MODES = [ + [:locked, "WITH EXCLUSIVE c AS ref", :pass], + [:write_locked, "WITH EXCLUSIVE c AS ref", :pass], + [:versioned, "WITH SNAPSHOT c AS ref", :pass], + [:always_mutable, nil, :in_dev], + [:atomic, nil, :in_dev], + [:shared_locked, nil, :in_dev], +] +CWM_MODES.each do |sync, head, exp| + CWM_CELLS << { sync: sync, head: head, expected: exp } +end + +def cwm_sync_decl(sync) + case sync + when :locked then "@locked" + when :write_locked then "@writeLocked" + when :versioned then "@versioned" + end +end + +FuzzGenerator.register(:capability_wrap_matrix, cells: CWM_CELLS) do |p| + # :in_dev cells render a placeholder; the harness emits them as + # comments and never runs them. + if p[:expected] == :in_dev + next "# in_dev: capability wrap #{p[:sync]} (WITH form unconfirmed)\n" + end + + <<~CHT + STRUCT Counter { value: Int64 } + + FN main() RETURNS Void -> + MUTABLE c = Counter{ value: 1_i64 } #{cwm_sync_decl(p[:sync])}; + #{p[:head]} { + x: Int64 = ref.value; + ASSERT x == 1_i64, "#{p[:sync]} wrap read"; + } + RETURN; + END + CHT +end diff --git a/tools/fuzz/templates/catch_allocator_matrix.rb b/tools/fuzz/templates/catch_allocator_matrix.rb new file mode 100644 index 000000000..686393753 --- /dev/null +++ b/tools/fuzz/templates/catch_allocator_matrix.rb @@ -0,0 +1,86 @@ +# Template: catch / OR-rescue allocator-identity matrix (the P0). +# +# Targets src/mir/mir_lowering.rb#infer_catch_value_allocator (12/12 +# dark -- invariant #9: error paths must preserve allocator identity) +# + #lower_or_rescue. The decision is: when `v = mayFail() OR fallback`, +# the success value and the fallback value may have DIFFERENT +# allocators (heap COPY vs frame literal vs primitive). If lowering +# binds one allocator on success and a different one on the error path, +# that is a UAF / double-free / leak. The corpus never crossed +# success_alloc x fallback_alloc. +# +# Both success and failure paths are exercised (call arg "" forces the +# RAISE -> fallback path; non-empty forces success). expected :pass; +# any leak / mir-error on a :pass cell is the invariant-#9 bug class. + +CAM_CELLS = [] +CAM_VALUE = %i[string int] # value type flowing out +CAM_SUCCESS = %i[heap] # success path: COPY -> heap +CAM_FALLBK = %i[heap_empty literal frame_var] # fallback allocator shape +CAM_TAKEN = %i[success failure] # which path the input forces + +CAM_VALUE.each do |vt| + CAM_FALLBK.each do |fb| + CAM_TAKEN.each do |taken| + # int value: only the primitive fallback shapes make sense. + next if vt == :int && fb == :heap_empty + CAM_CELLS << { value: vt, fallback: fb, taken: taken } + end + end +end + +def cam_inner(vt) + if vt == :string + "FN maybe(s: String) RETURNS !String ->\n" \ + " IF s.length() == 0_i64 THEN RAISE \"empty\"; END\n" \ + " RETURN COPY s;\nEND" + else + "FN maybe(s: String) RETURNS !Int64 ->\n" \ + " IF s.length() == 0_i64 THEN RAISE \"empty\"; END\n" \ + " RETURN s.length();\nEND" + end +end + +def cam_fallback_expr(vt, fb) + if vt == :string + case fb + when :heap_empty then "\"\"" + when :literal then "\"fb\"" + when :frame_var then "fbv" + end + else + fb == :frame_var ? "fbv" : "0_i64" + end +end + +def cam_fallback_setup(vt, fb) + return "" unless fb == :frame_var + + vt == :string ? " fbv: String = \"fb\";" : " fbv: Int64 = 0_i64;" +end + +def cam_call_arg(taken) = (taken == :success ? "\"X\"" : "\"\"") + +def cam_assert(vt, taken) + if vt == :string + taken == :success ? "ASSERT r.length() == 1_i64, \"success heap value\";" \ + : "ASSERT r.length() >= 0_i64, \"fallback value live\";" + else + taken == :success ? "ASSERT r == 1_i64, \"success int value\";" \ + : "ASSERT r >= 0_i64, \"fallback int live\";" + end +end + +FuzzGenerator.register(:catch_allocator_matrix, cells: CAM_CELLS) do |p| + setup = cam_fallback_setup(p[:value], p[:fallback]) + setup_line = setup.empty? ? "" : "#{setup}\n" + <<~CHT + #{cam_inner(p[:value])} + + FN main() RETURNS Void -> + #{setup_line} r = maybe(#{cam_call_arg(p[:taken])}) OR #{cam_fallback_expr(p[:value], p[:fallback])}; + #{cam_assert(p[:value], p[:taken])} + RETURN; + END + CHT +end diff --git a/tools/fuzz/templates/catch_reassign_matrix.rb b/tools/fuzz/templates/catch_reassign_matrix.rb new file mode 100644 index 000000000..ab3591f06 --- /dev/null +++ b/tools/fuzz/templates/catch_reassign_matrix.rb @@ -0,0 +1,82 @@ +# Template: reassign-through-fallible-expression matrix. +# +# Targets src/mir/mir_lowering.rb#walk_catch_body_for_reassigns (12/13 +# fuzz_axis dark). The decision: an outer MUTABLE binding is reassigned +# from a fallible expression `acc = maybe(...) OR acc;`. On the error +# path the binding keeps its OLD value/allocator; on success it takes +# the new one. If lowering mishandles the reassignment cleanup across +# the success/error split, that is a double-free or leak (invariant +# #9). The corpus never reassigned an outer binding through OR-rescue. +# +# var_kind x value_type x path-taken. Both paths exercised. expected +# :pass; a leak / mir-error on a :pass cell is the SURFACED bug. + +CRM_CELLS = [] +CRM_VARKIND = %i[local struct_field] +CRM_VALUE = %i[string int] +CRM_TAKEN = %i[success failure] + +CRM_VARKIND.each do |vk| + CRM_VALUE.each do |vt| + CRM_TAKEN.each do |tk| + CRM_CELLS << { var: vk, value: vt, taken: tk } + end + end +end + +def crm_ret(vt) = (vt == :string ? "!String" : "!Int64") +def crm_succ(vt) = (vt == :string ? "RETURN COPY s;" : "RETURN s.length();") +def crm_arg(tk) = (tk == :success ? "\"X\"" : "\"\"") + +def crm_inner(vt) + "FN maybe(s: String) RETURNS #{crm_ret(vt)} ->\n" \ + " IF s.length() == 0_i64 THEN RAISE \"empty\"; END\n" \ + " #{crm_succ(vt)}\nEND" +end + +def crm_init(vt) = (vt == :string ? "\"init\"" : "7_i64") + +def crm_assert(vt, tk) + if vt == :string + tk == :success ? "ASSERT acc.length() == 1_i64, \"reassigned to success\";" \ + : "ASSERT acc.length() == 4_i64, \"kept old value on failure\";" + else + tk == :success ? "ASSERT acc == 1_i64, \"reassigned to success\";" \ + : "ASSERT acc == 7_i64, \"kept old value on failure\";" + end +end + +FuzzGenerator.register(:catch_reassign_matrix, cells: CRM_CELLS) do |p| + if p[:var] == :local + <<~CHT + #{crm_inner(p[:value])} + + FN main() RETURNS Void -> + MUTABLE acc = #{crm_init(p[:value])}; + acc = maybe(#{crm_arg(p[:taken])}) OR acc; + #{crm_assert(p[:value], p[:taken])} + RETURN; + END + CHT + else + field_t = p[:value] == :string ? "String" : "Int64" + rd = p[:value] == :string ? "h.acc.length()" : "h.acc" + exp = if p[:value] == :string + p[:taken] == :success ? "1_i64" : "4_i64" + else + p[:taken] == :success ? "1_i64" : "7_i64" + end + <<~CHT + #{crm_inner(p[:value])} + + STRUCT Holder { acc: #{field_t} } + + FN main() RETURNS Void -> + MUTABLE h = Holder{ acc: #{crm_init(p[:value])} }; + h.acc = maybe(#{crm_arg(p[:taken])}) OR h.acc; + ASSERT #{rd} == #{exp}, "struct field reassign #{p[:taken]}"; + RETURN; + END + CHT + end +end diff --git a/tools/fuzz/templates/indexed_assignment_matrix.rb b/tools/fuzz/templates/indexed_assignment_matrix.rb new file mode 100644 index 000000000..756f46150 --- /dev/null +++ b/tools/fuzz/templates/indexed_assignment_matrix.rb @@ -0,0 +1,106 @@ +# Template: indexed-assignment lowering matrix. +# +# Targets src/mir/mir_lowering.rb#lower_indexed_assignment (the largest +# fuzz_axis dark cluster: `kind = ti.dispatch_key` -> +# INDEX_OPS.dig(kind,:set) crossed with value_transforms +# {:dupe_string_literal, :dupe_borrowed_union, :container_promote} and +# shard_direct). The corpus only ever wrote `lst[i] = int` into a plain +# list; this enumerates container_shape x key_kind x value_ownership x +# map_wrap so every dispatch/transform arm lowers. +# +# Value type DERIVES from the container (int containers take an Int64 +# value; String-valued maps take string / COPY-string values -- the +# :dupe_string_literal transform arm). Mixing them would be an invalid +# program, not a surfaced bug. +# +# expected :pass with a self-checking ASSERT. A :pass cell that fails, +# leaks, or mir-errors is a SURFACED lowering bug (do not fix here). + +IAM_CELLS = [] + +# container => :seq (int-indexed) | :map_int | :map_str +IAM_CONTAINERS = { + array: :seq, + list: :seq, + map_int: :map_int, + map_int_sharded: :map_int, + map_str: :map_str, + map_str_sharded: :map_str +} + +IAM_CONTAINERS.each do |container, family| + if family == :seq + IAM_CELLS << { container: container, key: :index, value: :primitive } + else + keys = %i[literal variable concat] + values = family == :map_int ? %i[primitive] : %i[str_literal copy_str] + keys.each do |k| + values.each { |v| IAM_CELLS << { container: container, key: k, value: v } } + end + end +end + +def iam_decl(c) + { + array: "MUTABLE box: Int64[] = [0_i64, 0_i64, 0_i64];", + list: "MUTABLE box: Int64[]@list = [];", + map_int: "MUTABLE box: HashMap = {};", + map_int_sharded: "MUTABLE box: HashMap@sharded(2) = {};", + map_str: "MUTABLE box: HashMap = {};", + map_str_sharded: "MUTABLE box: HashMap@sharded(2) = {};" + }[c] +end + +def iam_prep(c) + c == :list ? " box.append(0_i64);" : "" +end + +def iam_key_expr(c, k) + return "0_i64" if %i[array list].include?(c) + + case k + when :literal then "\"kk\"" + when :variable then "kvar" + when :concat then "(\"k\" + \"k\")" + end +end + +def iam_key_setup(c, k) + (k == :variable && !%i[array list].include?(c)) ? " kvar: String = \"kk\";" : "" +end + +def iam_value_expr(v) + case v + when :primitive then "9_i64" + when :str_literal then "\"vv\"" + when :copy_str then "COPY sval" + end +end + +def iam_value_setup(v) + v == :copy_str ? " sval: String = \"vv\";" : "" +end + +def iam_expected_read(c, key_e, v) + if %i[array list].include?(c) + "ASSERT box[#{key_e}] == 9_i64, \"seq indexed set\";" + elsif v == :primitive + "ASSERT (box[#{key_e}] OR 0_i64) == 9_i64, \"map int set\";" + else + "ASSERT (box[#{key_e}] OR \"\") == \"vv\", \"map str set\";" + end +end + +FuzzGenerator.register(:indexed_assignment_matrix, cells: IAM_CELLS) do |p| + key_e = iam_key_expr(p[:container], p[:key]) + val_e = iam_value_expr(p[:value]) + parts = ["FN main() RETURNS Void ->", " #{iam_decl(p[:container])}"] + prep = iam_prep(p[:container]); parts << prep unless prep.empty? + ks = iam_key_setup(p[:container], p[:key]); parts << ks unless ks.empty? + vs = iam_value_setup(p[:value]); parts << vs unless vs.empty? + parts << " box[#{key_e}] = #{val_e};" + parts << " #{iam_expected_read(p[:container], key_e, p[:value])}" + parts << " RETURN;" + parts << "END" + parts.join("\n") + "\n" +end diff --git a/tools/fuzz/templates/match_matrix.rb b/tools/fuzz/templates/match_matrix.rb new file mode 100644 index 000000000..0e566c42b --- /dev/null +++ b/tools/fuzz/templates/match_matrix.rb @@ -0,0 +1,76 @@ +# Template: MATCH lowering matrix. +# +# Targets src/mir/mir_lowering.rb#lower_match. Dark arms = the subject +# shape x arm kind x default-presence cross-product: union payload +# variant, union unit variant, enum, with and without DEFAULT. The +# corpus exercised only a couple of these shapes. +# +# Confirmed syntax (transpile-tests/52_union.cht): UNION with payload +# and unit variants, `Type{ Variant: v }` construction, `PARTIAL MATCH +# subj START Type.Variant -> ...; DEFAULT -> ...; END`. +# +# expected :pass; a failing/leaking :pass cell is a SURFACED bug. + +MM_CELLS = [] +MM_SUBJECT = %i[union_payload union_unit enum] +MM_DEFAULT = %i[with_default no_default] + +MM_SUBJECT.each do |s| + MM_DEFAULT.each do |d| + MM_CELLS << { subject: s, default: d } + end +end + +def mm_default_arm(d) + d == :with_default ? " DEFAULT -> got = 9.0;\n" : "" +end + +FuzzGenerator.register(:match_matrix, cells: MM_CELLS) do |p| + case p[:subject] + when :union_payload + <<~CHT + UNION Shape { Circle: Float64, Rect: Float64, Empty } + + FN main() RETURNS Void -> + s = Shape{ Circle: 2.0 }; + MUTABLE got: Float64 = 0.0; + PARTIAL MATCH s START + Shape.Circle -> got = 1.0;, + Shape.Rect -> got = 2.0;, + #{mm_default_arm(p[:default])} END + ASSERT got == 1.0, "union payload match"; + RETURN; + END + CHT + when :union_unit + <<~CHT + UNION Shape { Circle: Float64, Rect: Float64, Empty } + + FN main() RETURNS Void -> + s = Shape.Empty; + MUTABLE got: Float64 = 0.0; + PARTIAL MATCH s START + Shape.Empty -> got = 5.0;, + Shape.Circle -> got = 1.0;, + #{mm_default_arm(p[:default])} END + ASSERT got == 5.0, "union unit match"; + RETURN; + END + CHT + when :enum + <<~CHT + ENUM Dir { North, South, East } + + FN main() RETURNS Void -> + d: Dir = Dir.South; + MUTABLE got: Float64 = 0.0; + PARTIAL MATCH d START + Dir.North -> got = 1.0;, + Dir.South -> got = 2.0;, + #{mm_default_arm(p[:default])} END + ASSERT got == 2.0, "enum match"; + RETURN; + END + CHT + end +end From 4501716854a200cc17e7340f4711830e3ec0109b Mon Sep 17 00:00:00 2001 From: Brian Yahn Date: Fri, 15 May 2026 18:57:31 +0000 Subject: [PATCH 03/45] =?UTF-8?q?tools:=20bc=5Flower=5Fcoverage=20?= =?UTF-8?q?=E2=80=94=20close=20@target=3D=3D:bc=20arms=20via=20existing=20?= =?UTF-8?q?corpus?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Feasibility verified: the @target==:bc branches in mir_lowering fire during MIRLowering#lower_program (Ruby), before the bytecode VM. The incomplete _bc_runner is irrelevant -- we never execute, never require BcEmitter to succeed; a program that hits `raise Unimplemented` in a :bc arm still covered that arm. Per-file rescue, zero new programs. Result: mir_lowering dark arms 671 -> 656 (15 closed) by re-lowering the existing corpus with target: :bc. Cost comparison vs the 6 hand-written matrices: 15 arms / 0 new programs vs 2 arms / 68 new programs (~250x more cost-efficient). But still only 15/671 -- which is the decisive evidence, from a second direction, that the remaining ~581 fuzz_axis-bucketed arms are NOT closable by program generation in any backend mode. They are internal-IR-state / defensive guards: the fuzz_axis bucket is over-assigned and mir_lowering branch closure is a re-triage problem, not a test-generation problem. Co-Authored-By: Claude Opus 4.7 --- tools/bc_lower_coverage.rb | 60 ++++++++++++++++++++++++++++++++++++++ 1 file changed, 60 insertions(+) create mode 100644 tools/bc_lower_coverage.rb diff --git a/tools/bc_lower_coverage.rb b/tools/bc_lower_coverage.rb new file mode 100644 index 000000000..7fe06ca56 --- /dev/null +++ b/tools/bc_lower_coverage.rb @@ -0,0 +1,60 @@ +#! /usr/bin/env ruby +# Drive src/ branch coverage of the `@target == :bc` lowering arms by +# re-lowering the EXISTING corpus with target: :bc. Zero new programs. +# +# Feasibility: the `@target == :bc` branches in mir_lowering fire during +# MIRLowering#lower_program (Ruby), which runs BEFORE the bytecode VM. +# The MiniVM (_bc_runner) is incomplete, but that is irrelevant here -- +# we never execute, never even require BcEmitter to succeed. A program +# that hits `raise Unimplemented` inside a :bc arm still EXECUTED that +# arm (coverage is recorded up to the raise). So every per-file failure +# is rescued and counted as "lowering attempted". +# +# Usage: +# COVERAGE=1 ruby tools/bc_lower_coverage.rb +# bundle exec ruby spec/collate_coverage.rb +# ruby tools/branch_gap_triage.rb + +require 'bundler/setup' +require_relative '../spec/coverage_bootstrap' +CoverageBootstrap.start('bc-lower') + +require_relative '../src/backends/transpiler' + +ROOT = File.expand_path('..', __dir__) +files = ( + Dir.glob(File.join(ROOT, 'transpile-tests', '**', '*.cht')) + + Dir.glob(File.join(ROOT, '{examples,benchmarks}', '**', '*.cht')) + + Dir.glob(File.join(ROOT, 'transpile-tests', 'fuzz', '*.cht')) +).reject { |f| File.basename(f).start_with?('._') } + .uniq.sort + +lowered = 0 +raised = 0 +files.each do |path| + dir = File.dirname(path) + begin + imp = ModuleImporter.new(base_dir: dir, use_mir: true) + fe = CompilerFrontend.compile(File.read(path), importer: imp, source_dir: dir) + lo = MIRLowering.new( + struct_schemas: fe.struct_schemas, + enum_schemas: fe.enum_schemas, + union_schemas: fe.union_schemas, + fn_sigs: fe.fn_sigs, + moved_guard_info: fe.moved_guard_info, + importer: imp, + source_dir: dir, + target: :bc + ) + lo.lower_program(fe.ast) + lowered += 1 + rescue StandardError, ScriptError + # A raise inside a :bc arm still covered that arm -- that is the + # point. Count and continue; do not let the incomplete VM / a + # bc-Unimplemented stop the batch. + raised += 1 + end +end + +puts "bc-lower coverage: #{lowered} lowered cleanly, #{raised} raised " \ + "(still covered up to the raise), #{files.size} total" From 8c45f3edeb810ad616c6ceaba7a09560c7b0ac3d Mon Sep 17 00:00:00 2001 From: Brian Yahn Date: Fri, 15 May 2026 18:58:51 +0000 Subject: [PATCH 04/45] ci: bc-lowering coverage job for Codecov New `bc-lower-coverage` job mirroring the transpile-tests / tools-fuzz coverage pattern: COVERAGE=1, run tools/bc_lower_coverage.rb, collate, upload to Codecov with flags `ruby,bc-lower`. Pure Ruby -- no Zig, no clear build (the @target==:bc arms are covered during MIRLowering, before the bytecode VM; the incomplete _bc_runner is never executed), so the job is minimal and fast. fail_ci_if_error: false, matching the other coverage jobs. Co-Authored-By: Claude Opus 4.7 --- .github/workflows/ci.yml | 25 +++++++++++++++++++++++++ 1 file changed, 25 insertions(+) diff --git a/.github/workflows/ci.yml b/.github/workflows/ci.yml index 3f831bf7c..31d5fb0c1 100644 --- a/.github/workflows/ci.yml +++ b/.github/workflows/ci.yml @@ -211,6 +211,31 @@ jobs: fail_ci_if_error: false token: ${{ secrets.CODECOV_TOKEN }} + bc-lower-coverage: + name: bc-lowering coverage (@target==:bc arms) + runs-on: ubuntu-latest + timeout-minutes: 20 + env: + COVERAGE: "1" + steps: + - uses: actions/checkout@v4 + - uses: ruby/setup-ruby@v1 + with: + ruby-version: ${{ env.RUBY_VERSION }} + bundler-cache: true + # Pure Ruby: re-lowers the existing .cht corpus with target: :bc + # to cover the @target==:bc lowering arms. No Zig / no clear build + # (lowering runs before the bytecode VM; the incomplete _bc_runner + # is never executed). + - run: bundle exec ruby tools/bc_lower_coverage.rb + - run: bundle exec ruby spec/collate_coverage.rb + - uses: codecov/codecov-action@v5 + with: + files: ./coverage/coverage.xml + flags: ruby,bc-lower + fail_ci_if_error: false + token: ${{ secrets.CODECOV_TOKEN }} + module-integration: name: transpile-tests/module-integration (zig build test) runs-on: ubuntu-latest From e1bea22a8fc88ffd8a846955920d5a74e9053df8 Mon Sep 17 00:00:00 2001 From: Brian Yahn Date: Fri, 15 May 2026 19:21:36 +0000 Subject: [PATCH 05/45] fuzz: enumerate binary_op + capability_wrap from dispatch labels Rebuild both from SAMPLED axes to EXHAUSTIVE enumeration of the dispatch's own when-labels, with surface syntax confirmed from lexer/transpile-tests (not guessed): binary_op: every comparison op incl. LTE/GT (was missing), POW int+float (** confirmed), MOD, concat, OR. Symbol-path EXCLUDED -- CLEAR has no surface symbol literal, so those {EQ,NEQ} arms are not source-reachable (accept, not fuzz). 21 -> 30 cells, all clean. capability_wrap: one cell per ft.sync x ownership label {locked, write_locked, always_mutable, versioned, atomic-ptr, multiowned, shared:locked} -- all forms confirmed from transpile-tests; zero in_dev. 6 pass. Surfaces B4: @indirect:atomic + WITH EXCLUSIVE (both the compiler's own directed forms) -> invalid Zig `no field 'ctrl' in AtomicPtr`. The atomicPtrCreate dark arm of compose_capability_wrap is broken. OPEN, not fixed. Coverage delta from the provably-complete enumeration: mir_lowering 656 -> 653 (3 arms). Fourth independent confirmation that branch coverage is not closable by test generation; documented in the forensic. Fuzz's value here is bug-finding (4 bugs on dark arms), not coverage. Co-Authored-By: Claude Opus 4.7 --- docs/agents/fuzz-matrix-surfaced-bugs.md | 46 ++++++ tools/fuzz/templates/binary_op_matrix.rb | 64 ++++---- .../fuzz/templates/capability_wrap_matrix.rb | 141 +++++++++++------- 3 files changed, 172 insertions(+), 79 deletions(-) diff --git a/docs/agents/fuzz-matrix-surfaced-bugs.md b/docs/agents/fuzz-matrix-surfaced-bugs.md index 6a12ef235..54ad80b4a 100644 --- a/docs/agents/fuzz-matrix-surfaced-bugs.md +++ b/docs/agents/fuzz-matrix-surfaced-bugs.md @@ -83,3 +83,49 @@ bugs* (3 real memory-safety bugs in the predicted P0 cluster) but, as built, are NOT a branch-coverage-closure lever. Closing the branch gap requires shape-specific cells driven off the actual dark `type_info`, or re-triaging the fuzz_axis bucket against reachability. + +## B4 — invalid Zig: @indirect:atomic + WITH EXCLUSIVE has no `ctrl` +Template `capability_wrap_matrix` (enumerated), cell `{mode: atomic}`. + +``` +STRUCT Counter { value: Int64 } +FN main() RETURNS Void -> + MUTABLE c = Counter{ value: 1_i64 } @indirect:atomic; + WITH EXCLUSIVE c AS x { x.value = 2_i64; ASSERT x.value == 2_i64; } + RETURN; +END +``` +Both forms are the compiler's OWN guidance (it rejected `@atomic` on a +struct telling us to use `@indirect:atomic`; it rejected +`WITH POLYMORPHIC` telling us to use plain `WITH`). CLEAR then accepts +this and emits invalid Zig: `no field named 'ctrl' in AtomicPtr(...)`. +The `is_atomic_ptr -> atomicPtrCreate` arm of compose_capability_wrap +(a dark arm) is broken. OPEN; not fixed. + +## Enumeration result (the decisive coverage finding) + +`binary_op_matrix` and `capability_wrap_matrix` were rebuilt from +SAMPLED axes to EXHAUSTIVE enumeration of the dispatch's own `when` +labels (every comparison op incl. LTE/GT, POW int+float, every +ft.sync/ownership mode; symbol-path excluded — no surface literal). +binary_op went 21->30 cells, all clean; capability 7 enumerated cells +(6 pass, 1 = B4). + +Branch-gap delta from the *provably complete* enumeration method: +mir_lowering 656 -> 653 (3 arms). Four independent attempts now: + + 92 example programs -> 50 arms + 6 sampled fuzz matrices (68p) -> 2 arms + bc-lower whole corpus (0 new) -> 15 arms + exhaustive dispatch enumeration-> 3 arms + +Conclusion (now ironclad): mir_lowering branch coverage is NOT +closable by test generation, even by the theoretically-complete +enumeration. The ~650 dark arms are overwhelmingly invariant-guarded +/ nil-defensive / internal-state branches no source program in any +backend can toggle; the earlier ~22% "genuine dispatch" estimate +(eyeballed from 36 lines) was itself too optimistic. The strategy is +reachability-aware re-triage to remove the impossible arms from the +denominator + a tiny enumerated set for the genuine handful. Fuzz's +delivered value here is bug-finding (B1-B4: 4 real memory-safety / +codegen bugs on dark arms), not coverage. diff --git a/tools/fuzz/templates/binary_op_matrix.rb b/tools/fuzz/templates/binary_op_matrix.rb index bd5bbcd4d..bef6fec4e 100644 --- a/tools/fuzz/templates/binary_op_matrix.rb +++ b/tools/fuzz/templates/binary_op_matrix.rb @@ -1,33 +1,38 @@ -# Template: binary-operator lowering matrix. +# Template: binary-operator lowering matrix — ENUMERATED, not sampled. # # Targets src/mir/mir_lowering.rb#lower_binary_op + #lower_or_rescue. -# The dark arms are the operator x operand-type combinations the corpus -# never wrote: MOD on i64, string comparison (eql / strcmp branches), -# heap-string operands (the hoist_alloc path), concat, and the -# OR-rescue fallback. Operands come from a fn so constant-folding does -# not erase the decision. +# The cell set is the dispatch's OWN discriminant set read from the +# source: the string-compare `case node.op` has arms +# {EQ,NEQ,LT,LTE,GT,GTE}; POW (**), MOD, concat (+), OR (OR_RESCUE) +# are the other op branches. Every comparison arm x every operand +# type is emitted -- exhaustive by construction, not a guessed axis. # -# expected :pass; a failing/leaking :pass cell is a SURFACED bug. +# Surface syntax confirmed from lexer/transpile-tests: +# == != < <= > >= ; **=POW ; MOD ; + (concat) ; OR (rescue). +# The symbol-path `case node.op {EQ,NEQ}` is EXCLUDED: CLEAR has no +# surface symbol literal (only a union *variant* named Symbol), so +# those 2 arms are not source-reachable -> accept/invariant_guarded, +# correctly not chased here. +# +# lhs() is ALWAYS strictly less than rhs() for every type, so the +# oracle is uniform: EQ false, NEQ true, LT true, LTE true, GT false, +# GTE false. expected :pass; a failing :pass cell is a SURFACED bug. BOM_CELLS = [] -BOM_OPS = %i[eq neq lt gte mod concat or_fallback] -BOM_TYPES = %i[int float str_lit heap_str] +BOM_CMP = %i[eq neq lt lte gt gte] +BOM_TYPES = %i[int float str_lit heap_str] -BOM_OPS.each do |op| - BOM_TYPES.each do |t| - # MOD: integers only. concat / or_fallback: strings only. - next if op == :mod && t != :int - next if op == :concat && !%i[str_lit heap_str].include?(t) - next if op == :or_fallback && !%i[str_lit heap_str].include?(t) - # float ordering only exercises lt/gte/eq/neq. - next if t == :float && %i[concat or_fallback mod].include?(op) - BOM_CELLS << { op: op, type: t } - end +BOM_CMP.each do |op| + BOM_TYPES.each { |t| BOM_CELLS << { op: op, type: t } } end +# Non-comparison op branches, each at its valid operand type(s). +BOM_CELLS << { op: :mod, type: :int } +BOM_CELLS << { op: :pow, type: :int } +BOM_CELLS << { op: :pow, type: :float } +BOM_CELLS << { op: :concat, type: :str_lit } +BOM_CELLS << { op: :concat, type: :heap_str } +BOM_CELLS << { op: :or_fallback, type: :heap_str } -# lhs is ALWAYS the strictly-smaller operand and rhs the larger, for -# every type, so the oracle is uniform: lt true, gte false, eq false, -# neq true. (MOD has its own dedicated operands below.) def bom_provider(t) case t when :int then "FN lhs() RETURNS Int64 -> RETURN 3_i64; END\nFN rhs() RETURNS Int64 -> RETURN 10_i64; END" @@ -47,21 +52,22 @@ def bom_body(op, t) when :eq then " ASSERT (#{l} == #{r}) == FALSE, \"eq #{t}\";" when :neq then " ASSERT (#{l} != #{r}), \"neq #{t}\";" when :lt then " ASSERT (#{l} < #{r}), \"lt #{t}\";" + when :lte then " ASSERT (#{l} <= #{r}), \"lte #{t}\";" + when :gt then " ASSERT (#{l} > #{r}) == FALSE, \"gt #{t}\";" when :gte then " ASSERT (#{l} >= #{r}) == FALSE, \"gte #{t}\";" - when :mod then " ASSERT (10_i64 MOD 3_i64) == 1_i64, \"mod #{t}\";" + when :mod then " ASSERT (10_i64 MOD 3_i64) == 1_i64, \"mod\";" + when :pow + t == :int ? " ASSERT (2_i64 ** 3_i64) == 8_i64, \"pow int\";" \ + : " ASSERT (2.0 ** 3.0) == 8.0, \"pow float\";" when :concat " t: String = #{l} + #{r};\n ASSERT t.length() == 6_i64, \"concat #{t}\";" when :or_fallback - # lhs() is non-fallible here; exercise the OR fallback shape with a - # fallible callee so lower_or_rescue lowers. - " v: String = mightFail() OR \"fb\";\n ASSERT v.length() >= 2_i64, \"or fallback #{t}\";" + " v: String = mightFail() OR \"fb\";\n ASSERT v.length() >= 2_i64, \"or fallback\";" end end def bom_extra_fn(op) - return "" unless op == :or_fallback - - "FN mightFail() RETURNS !String ->\n RETURN COPY \"ok\";\nEND\n" + op == :or_fallback ? "FN mightFail() RETURNS !String ->\n RETURN COPY \"ok\";\nEND\n" : "" end FuzzGenerator.register(:binary_op_matrix, cells: BOM_CELLS) do |p| diff --git a/tools/fuzz/templates/capability_wrap_matrix.rb b/tools/fuzz/templates/capability_wrap_matrix.rb index 39da93083..0bb6cfd13 100644 --- a/tools/fuzz/templates/capability_wrap_matrix.rb +++ b/tools/fuzz/templates/capability_wrap_matrix.rb @@ -1,58 +1,99 @@ -# Template: capability-wrap composition matrix. +# Template: capability-wrap composition matrix — ENUMERATED. # -# Targets src/mir/mir_lowering.rb#compose_capability_wrap + -# #lower_with_block + the cap path of #lower_var_decl. Dark arms = the -# sync_fn case (lockedCreate / rwLockedCreate / refCellCreate / -# versionedCreate) and the WITH access form per modality -- the corpus -# only ever declared @locked. Each cell declares a wrapped binding, -# enters the matching WITH, mutates/reads, asserts. -# -# Confirmed syntax (access_gate.rb): @locked/@writeLocked/@versioned + -# WITH EXCLUSIVE/SNAPSHOT. @alwaysMutable / @atomic / storage wraps are -# reserved :in_dev (WITH form not yet confirmed -- reserving matrix -# space, not emitting invalid noise). +# Targets src/mir/mir_lowering.rb#compose_capability_wrap. The +# discriminant set is read from the dispatch itself: +# sync_fn = case ft.sync {locked, write_locked, always_mutable, +# versioned, atomic} +# own_fn = case ft.ownership {shared->arc, multiowned->rc} +# + the 4-way sync_fn&&own_fn / sync_only / own_only / else. +# One cell per sync mode + one per ownership wrap = exhaustive over +# the dispatch labels. Every surface form is CONFIRMED from +# transpile-tests (all sigils occur there); nothing is :in_dev. # # expected :pass; a failing/leaking :pass cell is a SURFACED bug. -CWM_CELLS = [] -# [sync, with_head, expected] -CWM_MODES = [ - [:locked, "WITH EXCLUSIVE c AS ref", :pass], - [:write_locked, "WITH EXCLUSIVE c AS ref", :pass], - [:versioned, "WITH SNAPSHOT c AS ref", :pass], - [:always_mutable, nil, :in_dev], - [:atomic, nil, :in_dev], - [:shared_locked, nil, :in_dev], -] -CWM_MODES.each do |sync, head, exp| - CWM_CELLS << { sync: sync, head: head, expected: exp } -end - -def cwm_sync_decl(sync) - case sync - when :locked then "@locked" - when :write_locked then "@writeLocked" - when :versioned then "@versioned" - end -end +CWM_CELLS = %i[ + locked write_locked always_mutable versioned atomic + multiowned shared_locked +].map { |m| { mode: m } } FuzzGenerator.register(:capability_wrap_matrix, cells: CWM_CELLS) do |p| - # :in_dev cells render a placeholder; the harness emits them as - # comments and never runs them. - if p[:expected] == :in_dev - next "# in_dev: capability wrap #{p[:sync]} (WITH form unconfirmed)\n" + case p[:mode] + when :locked + <<~CHT + STRUCT Counter { value: Int64 } + FN main() RETURNS Void -> + MUTABLE c = Counter{ value: 1_i64 } @locked; + WITH EXCLUSIVE c AS r { + ASSERT r.value == 1_i64, "locked wrap read"; + } + RETURN; + END + CHT + when :write_locked + <<~CHT + STRUCT Counter { value: Int64 } + FN main() RETURNS Void -> + MUTABLE c = Counter{ value: 1_i64 } @writeLocked; + WITH EXCLUSIVE c AS r { + ASSERT r.value == 1_i64, "writeLocked wrap read"; + } + RETURN; + END + CHT + when :always_mutable + # Interior mutability: immutable binding, mutable data, direct. + <<~CHT + STRUCT Counter { value: Int64 } + FN main() RETURNS Void -> + c = Counter{ value: 1_i64 } @alwaysMutable; + c.value = 2_i64; + ASSERT c.value == 2_i64, "alwaysMutable interior mutate"; + RETURN; + END + CHT + when :versioned + <<~CHT + STRUCT Counter { value: Int64 } + FN main() RETURNS Void -> + MUTABLE c = Counter{ value: 1_i64 } @versioned; + WITH SNAPSHOT c AS r { + ASSERT r.value == 1_i64, "versioned snapshot read"; + } + RETURN; + END + CHT + when :atomic + <<~CHT + STRUCT Counter { value: Int64 } + FN main() RETURNS Void -> + MUTABLE c = Counter{ value: 1_i64 } @indirect:atomic; + WITH EXCLUSIVE c AS x { + x.value = 2_i64; + ASSERT x.value == 2_i64, "atomic-ptr exclusive mutate"; + } + RETURN; + END + CHT + when :multiowned + <<~CHT + STRUCT Counter { value: Int64 } + FN main() RETURNS Void -> + p = Counter{ value: 1_i64 } @multiowned; + ASSERT p.value == 1_i64, "multiowned (rc) read"; + RETURN; + END + CHT + when :shared_locked + <<~CHT + STRUCT Counter { value: Int64 } + FN main() RETURNS Void -> + MUTABLE t = Counter{ value: 1_i64 } @shared:locked; + WITH EXCLUSIVE t AS r { + ASSERT r.value == 1_i64, "shared:locked (arc+lock) read"; + } + RETURN; + END + CHT end - - <<~CHT - STRUCT Counter { value: Int64 } - - FN main() RETURNS Void -> - MUTABLE c = Counter{ value: 1_i64 } #{cwm_sync_decl(p[:sync])}; - #{p[:head]} { - x: Int64 = ref.value; - ASSERT x == 1_i64, "#{p[:sync]} wrap read"; - } - RETURN; - END - CHT end From f5c7f7da146cd4d17f62e2e0bc046b26ff6c935c Mon Sep 17 00:00:00 2001 From: Brian Yahn Date: Fri, 15 May 2026 19:50:08 +0000 Subject: [PATCH 06/45] =?UTF-8?q?decomplex:=20DecisionPressure=20=E2=80=94?= =?UTF-8?q?=20score=20loose=20contracts=20by=20conditionals=20driven?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The project's primary goal made concrete: not "this decision is duplicated N times" (scatter) but "THIS contract is the SOURCE of N defensive type/nil decisions -- fix the contract once, the cluster dies." Attributes every is_a?/kind_of?/instance_of?/nil?/respond_to?/ safe-nav guard to the canonical root contract of its subject, resolving proximate locals through INTRA-procedural assignment (reuses the derived-state def-use idea + semantic-alias-style canonicalization). Cross-procedure pressure stays nil-kill's by the recorded boundary -- not re-implemented (decomplex stays CFG-free). Ranks contracts by decisions x methods; unresolved ~local bucket sorts last (that residue needs cross-proc = nil-kill). New tier-1 report section. Self-tested: decision_pressure_test (5), full suite 44/124/0. Verified on src/ (93 files): top contract `.type_info` drives 274 defensive decisions across 94 methods; the type-contract family (.type_info 274, .value 110, .full_type 33, .type 28, .return_type 27, [:type] 29) dominates the head of the ranking -- exactly the "one loose contract -> hundreds of conditionals" the user predicted. Co-Authored-By: Claude Opus 4.7 --- gems/decomplex/lib/decomplex.rb | 1 + .../lib/decomplex/decision_pressure.rb | 159 ++++++++++++++++++ gems/decomplex/lib/decomplex/report.rb | 6 + gems/decomplex/report.md | 61 ++++++- gems/decomplex/test/decision_pressure_test.rb | 76 +++++++++ 5 files changed, 301 insertions(+), 2 deletions(-) create mode 100644 gems/decomplex/lib/decomplex/decision_pressure.rb create mode 100644 gems/decomplex/test/decision_pressure_test.rb diff --git a/gems/decomplex/lib/decomplex.rb b/gems/decomplex/lib/decomplex.rb index 3ac8b23fc..97a0eede2 100644 --- a/gems/decomplex/lib/decomplex.rb +++ b/gems/decomplex/lib/decomplex.rb @@ -10,6 +10,7 @@ require_relative "decomplex/sequence_mine" require_relative "decomplex/derived_state" require_relative "decomplex/type3_clone" +require_relative "decomplex/decision_pressure" # Decomplex: decision-level duplication + neglected-condition detector. # See decomplex.gemspec for the rationale. v0 scope is exact-match diff --git a/gems/decomplex/lib/decomplex/decision_pressure.rb b/gems/decomplex/lib/decomplex/decision_pressure.rb new file mode 100644 index 000000000..509c74358 --- /dev/null +++ b/gems/decomplex/lib/decomplex/decision_pressure.rb @@ -0,0 +1,159 @@ +# frozen_string_literal: true + +require_relative "ast" + +module Decomplex + # Decision-pressure: attribute every defensive type/nil guard to the + # canonical ROOT CONTRACT its subject comes from, then rank contracts + # by how many re-derived decisions they drive. + # + # This is the project's primary goal made concrete: not "this decision + # is duplicated N times" (scatter) but "THIS loosely-typed contract + # (`.full_type`, `[:type]`, `@schema`) is the SOURCE of N conditionals + # -- fix the contract once, the cluster dies." Pressure, decomplex- + # scoped: intra-procedural only (a local is resolved to the accessor + # it was assigned from IN THE SAME METHOD). Cross-procedure pressure + # is nil-kill's, by the recorded boundary -- not re-implemented here. + # + # A "decision" = a guard whose subject is type/nil-tested: + # x.is_a?(T) / kind_of? / instance_of? / x.nil? / x.respond_to? / + # x&.m (safe-nav: an implicit nil decision on x). + class DecisionPressure + GUARD_MIDS = %i[is_a? kind_of? instance_of? nil? respond_to?].freeze + Hit = Struct.new(:contract, :file, :defn, :line, keyword_init: true) + + def self.scan(files) + hits = [] + files.each do |f| + root, lines = Ast.parse(f) + e = new(f, lines) + e.walk(root, [], {}) + hits.concat(e.hits) + end + Report.new(hits) + end + + attr_reader :hits + + def initialize(file, lines) + @file = file + @lines = lines + @hits = [] + end + + def walk(node, defstack, asgmap) + return unless Ast.node?(node) + + if %i[DEFN DEFS].include?(node.type) + name = node.children[node.type == :DEFS ? 1 : 0].to_s + defstack = defstack + [name] + asgmap = build_asgmap(node) + end + + record_guard(node, defstack, asgmap) + node.children.each { |c| walk(c, defstack, asgmap) } + end + + private + + # name => rhs-source-node, for `name = ` LASGNs in + # this method (intra-procedural only). First simple assignment wins. + def build_asgmap(defn_node) + map = {} + stack = Ast.body_stmts(defn_node).dup + until stack.empty? + n = stack.pop + next unless Ast.node?(n) + + if n.type == :LASGN + nm = n.children[0].to_s + src = n.children[1] + map[nm] ||= src if !map.key?(nm) && simple_source?(src) + end + n.children.each { |c| stack << c } + end + map + end + + def simple_source?(n) + return false unless Ast.node?(n) + + case n.type + when :IVAR then true + when :CALL, :QCALL + recv, mid, args = n.children + recv && (args.nil? || mid == :[]) + else false + end + end + + def record_guard(node, defstack, asgmap) + return unless %i[CALL QCALL].include?(node.type) + + recv, mid, _args = node.children + is_guard = + (node.type == :CALL && GUARD_MIDS.include?(mid)) || + node.type == :QCALL # safe-nav = implicit nil decision on recv + return unless is_guard && recv + + c = contract_of(recv, asgmap) + return unless c + + @hits << Hit.new(contract: c, file: @file, + defn: defstack.last || "(top-level)", + line: node.first_lineno) + end + + # Canonical root contract of a subject node, resolving locals + # through the intra-method assignment map. + def contract_of(n, asgmap, depth = 0) + return nil unless Ast.node?(n) && depth < 8 + + case n.type + when :LVAR, :DVAR + nm = n.children[0].to_s + src = asgmap[nm] + src ? contract_of(src, asgmap, depth + 1) : "~local" + when :IVAR + n.children[0].to_s # already includes the leading @ + when :CALL, :QCALL + recv, mid, args = n.children + if mid == :[] + key = args && Ast.node?(args) ? args.children.compact.first : nil + kt = (Ast.node?(key) ? Ast.slice(key, @lines) : key.inspect) + "[#{kt}]" + elsif args.nil? && recv + ".#{mid}" # no-arg accessor: the contract + end + when :VCALL + ".#{n.children[0]}" + end + end + + class Report + def initialize(hits) + @hits = hits + end + + # [{ contract:, decisions:, methods:, sites:[...] }, ...] + # ranked by decisions; the low-signal "~local" (unresolved + # proximate local -- needs cross-proc pressure = nil-kill) is + # reported last regardless of count. + def ranked + by = @hits.group_by(&:contract) + rows = by.map do |contract, hs| + { + contract: contract, + decisions: hs.size, + methods: hs.map { |h| [h.file, h.defn] }.uniq.size, + sites: hs.map { |h| "#{h.file}:#{h.defn}:#{h.line}" } + } + end + named = rows.reject { |r| r[:contract] == "~local" } + .sort_by { |r| [-r[:decisions], -r[:methods]] } + local = rows.select { |r| r[:contract] == "~local" } + named + local + end + end + end +end diff --git a/gems/decomplex/lib/decomplex/report.rb b/gems/decomplex/lib/decomplex/report.rb index a08aa6c52..13f57b1ab 100644 --- a/gems/decomplex/lib/decomplex/report.rb +++ b/gems/decomplex/lib/decomplex/report.rb @@ -32,6 +32,7 @@ def run @broken = sm.broken_protocol @derived = DerivedState.scan(@files) @clones = Type3Clone.scan(@files) + @pressure = DecisionPressure.scan(@files).ranked end # tier = signal quality (1 = highest signal / lowest false-positive, @@ -40,6 +41,7 @@ def run # must not outrank a precise one. Within a section, items are # frequency-ranked (support / scatter / confidence, descending). SECTIONS = [ + ["Decision Pressure", :@pressure, 1, "loose contract -> N defensive type/nil decisions; fix the contract once, the cluster dies (intra-proc; cross-proc = nil-kill)"], ["Missing Abstractions", :@miss, 1, "guard tuple recomputed across >=2 decision units"], ["Reification Misses", :@reif, 1, "an existing predicate reinvented inline -- invariant #16"], ["Semantic Predicate Aliases", :@salias, 1, "one decision, multiple names (receiver/polarity folded)"], @@ -126,6 +128,10 @@ def to_markdown def render(out, title, v) v.first(25).each do |h| out << case title + when "Decision Pressure" + "- `#{h[:contract]}` drives **#{h[:decisions]}** defensive " \ + "type/nil decisions across #{h[:methods]} method(s)\n" \ + " - #{h[:sites].first(4).map { |s| nav(s) }.join(' ; ')}\n" when "Missing Abstractions" "- **[#{h[:kind]}]** support=#{h[:support]} scatter=#{h[:scatter]} " \ "rank=#{h[:rank]}\n - tuple: `#{h[:members].join(' | ')}`\n" \ diff --git a/gems/decomplex/report.md b/gems/decomplex/report.md index 60be7bc67..b016b1796 100644 --- a/gems/decomplex/report.md +++ b/gems/decomplex/report.md @@ -9,6 +9,7 @@ ## Table of Contents - [Project Prioritization](#project-prioritization) +- [Decision Pressure (256)](#decision-pressure-256) - [Missing Abstractions (217)](#missing-abstractions-217) - [Reification Misses (129)](#reification-misses-129) - [Semantic Predicate Aliases (3)](#semantic-predicate-aliases-3) @@ -24,6 +25,7 @@ ## Project Prioritization _Ordered by signal tier (1 = highest signal / lowest FP), then by volume._ +- **[tier 1]** [Decision Pressure (256)](#decision-pressure-256): loose contract -> N defensive type/nil decisions; fix the contract once, the cluster dies (intra-proc; cross-proc = nil-kill) - **[tier 1]** [Missing Abstractions (217)](#missing-abstractions-217): guard tuple recomputed across >=2 decision units - **[tier 1]** [Reification Misses (129)](#reification-misses-129): an existing predicate reinvented inline -- invariant #16 - **[tier 1]** [Exact Predicate Aliases (7)](#exact-predicate-aliases-7): identical one-line predicate body under >=2 names @@ -35,6 +37,61 @@ _Ordered by signal tier (1 = highest signal / lowest FP), then by volume._ - **[tier 3]** [Neglected Path Conditions (2203)](#neglected-path-conditions-2203): nested-if/&& guard set minus one atom -- *POSSIBLE* bug (noisy) - **[tier 3]** [Broken Protocols (1730)](#broken-protocols-1730): co-called pair, one site does A without B -- *POSSIBLE* bug (noisy) +## Decision Pressure (256) +_loose contract -> N defensive type/nil decisions; fix the contract once, the cluster dies (intra-proc; cross-proc = nil-kill)_ + +- `.type_info` drives **274** defensive type/nil decisions across 94 method(s) + - `src/annotator-helpers/function_analysis.rb:243` (resolve_call) ; `src/annotator-helpers/function_analysis.rb:247` (resolve_call) ; `src/annotator-helpers/function_analysis.rb:248` (resolve_call) ; `src/annotator-helpers/function_analysis.rb:248` (resolve_call) +- `.value` drives **110** defensive type/nil decisions across 54 method(s) + - `src/annotator-helpers/auto_inference.rb:760` (walk_binops) ; `src/annotator-helpers/capabilities.rb:1055` (_unified_capture_walk) ; `src/annotator-helpers/capabilities.rb:1059` (_unified_capture_walk) ; `src/annotator-helpers/capabilities.rb:1067` (_unified_capture_walk) +- `.symbol` drives **63** defensive type/nil decisions across 44 method(s) + - `src/annotator-helpers/capabilities.rb:93` (cap_var_sync) ; `src/annotator-helpers/capabilities.rb:118` (cap_var_layout) ; `src/annotator-helpers/capabilities.rb:142` (validate_capability) ; `src/annotator-helpers/capabilities.rb:164` (validate_capability) +- `.target` drives **60** defensive type/nil decisions across 35 method(s) + - `src/annotator-helpers/auto_inference.rb:655` (record_index_assign) ; `src/annotator-helpers/capabilities.rb:746` (cap_var_name) ; `src/annotator-helpers/function_analysis.rb:909` (verify_return) ; `src/annotator-helpers/generic_analysis.rb:645` (find_container_source) +- `.name` drives **55** defensive type/nil decisions across 37 method(s) + - `src/annotator-helpers/auto_inference.rb:653` (record_index_assign) ; `src/annotator-helpers/capabilities.rb:1040` (_unified_capture_walk) ; `src/annotator-helpers/capabilities.rb:1292` (_bg_walk) ; `src/annotator-helpers/generic_analysis.rb:630` (register_container_borrow!) +- `.right` drives **53** defensive type/nil decisions across 17 method(s) + - `src/annotator-helpers/pipe_analysis.rb:24` (visit_Smooth) ; `src/annotator-helpers/pipe_analysis.rb:26` (visit_Smooth) ; `src/annotator-helpers/pipe_analysis.rb:263` (analyze_select_family_op) ; `src/annotator-helpers/pipe_analysis.rb:263` (analyze_select_family_op) +- `.current_fn_ctx` drives **35** defensive type/nil decisions across 23 method(s) + - `src/annotator-helpers/capabilities.rb:1148` (_unified_capture_walk) ; `src/annotator-helpers/capabilities.rb:1185` (_unified_capture_walk) ; `src/annotator-helpers/capabilities.rb:1329` (record_capability_binding) ; `src/annotator-helpers/capabilities.rb:1337` (record_capability_binding) +- `.full_type` drives **33** defensive type/nil decisions across 20 method(s) + - `src/annotator-helpers/capabilities.rb:95` (cap_var_sync) ; `src/annotator-helpers/capabilities.rb:104` (cap_var_storage) ; `src/annotator-helpers/capabilities.rb:120` (cap_var_layout) ; `src/annotator-helpers/capabilities.rb:186` (validate_capability) +- `[:type]` drives **29** defensive type/nil decisions across 20 method(s) + - `src/annotator-helpers/function_analysis.rb:326` (verify_function_signature!) ; `src/annotator-helpers/function_analysis.rb:553` (atomic_cell_to_atomic_param?) ; `src/annotator-helpers/function_analysis.rb:693` (verify_lifetime_source!) ; `src/annotator-helpers/function_analysis.rb:726` (declare_and_verify_params) +- `.left` drives **29** defensive type/nil decisions across 18 method(s) + - `src/annotator-helpers/pipe_analysis.rb:63` (stamp_observable_terminal!) ; `src/annotator-helpers/pipe_analysis.rb:240` (analyze_collect_op) ; `src/annotator-helpers/pipe_analysis.rb:588` (analyze_limit_op) ; `src/annotator-helpers/pipe_analysis.rb:1335` (analyze_shard_op) +- `.type` drives **28** defensive type/nil decisions across 21 method(s) + - `src/annotator-helpers/auto_inference.rb:210` (record_local) ; `src/annotator-helpers/auto_inference.rb:504` (stamp_map_pairs!) ; `src/annotator-helpers/auto_inference.rb:505` (stamp_map_pairs!) ; `src/annotator-helpers/auto_inference.rb:572` (walk_for_shape_decls) +- `.return_type` drives **27** defensive type/nil decisions across 15 method(s) + - `src/annotator-helpers/capabilities.rb:566` (visit_post_clauses!) ; `src/annotator-helpers/function_analysis.rb:170` (resolve_call) ; `src/annotator-helpers/reentrance.rb:162` (validate_not_logical_return!) ; `src/annotator-helpers/reentrance.rb:164` (validate_not_logical_return!) +- `.last` drives **25** defensive type/nil decisions across 6 method(s) + - `src/annotator.rb:5619` (expr_result_type) ; `src/annotator.rb:5621` (expr_result_type) ; `src/annotator.rb:5628` (expr_result_type) ; `src/annotator.rb:5628` (expr_result_type) +- `.token` drives **22** defensive type/nil decisions across 19 method(s) + - `src/annotator-helpers/capabilities.rb:1341` (record_capability_binding) ; `src/annotator-helpers/capabilities.rb:1342` (record_capability_binding) ; `src/mir/concurrency_checks.rb:73` (check_hold_across_yield!) ; `src/mir/concurrency_checks.rb:171` (check_reentrant!) +- `.capture_analysis` drives **22** defensive type/nil decisions across 17 method(s) + - `src/mir/control_flow.rb:675` (transfer_stmt) ; `src/mir/control_flow.rb:755` (collect_ownership_transfers) ; `src/mir/control_flow.rb:841` (_walk_bg_captures_in_expr) ; `src/mir/control_flow.rb:870` (collect_bg_body_gives) +- `[name]` drives **21** defensive type/nil decisions across 20 method(s) + - `src/annotator-helpers/effects.rb:985` (max_tier_for_calls) ; `src/annotator-helpers/fixable_helpers.rb:310` (emit_use_of_moved_error!) ; `src/annotator-helpers/fixable_helpers.rb:997` (emit_with_materialized_needs_tense!) ; `src/annotator-helpers/fixable_helpers.rb:1200` (build_decl_cap_insert_fix) +- `.tail` drives **21** defensive type/nil decisions across 6 method(s) + - `src/mir/fsm_transform/emit.rb:341` (build_recursive) ; `src/mir/fsm_transform/emit.rb:375` (build_recursive) ; `src/mir/fsm_transform/emit.rb:376` (build_recursive) ; `src/mir/fsm_transform/emit.rb:444` (build_recursive) +- `.element_type` drives **19** defensive type/nil decisions across 15 method(s) + - `src/annotator-helpers/generic_analysis.rb:182` (validate_type_annotation!) ; `src/annotator-helpers/method_analysis.rb:42` (narrow_collection_type!) ; `src/annotator-helpers/method_analysis.rb:121` (resolve_typed_method) ; `src/annotator.rb:4305` (infer_element_type) +- `@union_schemas` drives **18** defensive type/nil decisions across 13 method(s) + - `src/mir/mir_lowering.rb:262` (owned_value_temp_needs_cleanup?) ; `src/mir/mir_lowering.rb:263` (owned_value_temp_needs_cleanup?) ; `src/mir/mir_lowering.rb:291` (copy_container_borrow_if_needed) ; `src/mir/mir_lowering.rb:1297` (lower_function_def) +- `[:var_node]` drives **18** defensive type/nil decisions across 12 method(s) + - `src/annotator-helpers/capabilities.rb:668` (acquire_capability!) ; `src/annotator-helpers/capabilities.rb:679` (acquire_capability!) ; `src/annotator-helpers/capabilities.rb:709` (acquire_capability!) ; `src/annotator-helpers/capabilities.rb:714` (acquire_capability!) +- `.payload_type` drives **17** defensive type/nil decisions across 5 method(s) + - `src/annotator-helpers/function_analysis.rb:192` (resolve_call) ; `src/annotator-helpers/function_analysis.rb:195` (resolve_call) ; `src/annotator-helpers/function_analysis.rb:199` (resolve_call) ; `src/annotator-helpers/function_analysis.rb:200` (resolve_call) +- `.reg` drives **15** defensive type/nil decisions across 11 method(s) + - `src/annotator-helpers/fixable_helpers.rb:1000` (emit_with_materialized_needs_tense!) ; `src/annotator-helpers/fixable_helpers.rb:1201` (build_decl_cap_insert_fix) ; `src/annotator-helpers/fixable_helpers.rb:1229` (build_decl_cap_replace_fix) ; `src/annotator-helpers/function_analysis.rb:970` (return_is_borrow?) +- `.arms` drives **14** defensive type/nil decisions across 8 method(s) + - `src/annotator-helpers/capabilities.rb:1221` (_unified_capture_walk) ; `src/annotator-helpers/effects.rb:1178` (scan_for_raises) ; `src/annotator.rb:4523` (visit_WithBlock) ; `src/annotator.rb:4682` (visit_WithBlock) +- `@og` drives **14** defensive type/nil decisions across 6 method(s) + - `src/annotator.rb:1199` (analyze_control_flow_branches) ; `src/annotator.rb:1206` (analyze_control_flow_branches) ; `src/annotator.rb:1212` (analyze_control_flow_branches) ; `src/annotator.rb:1223` (analyze_control_flow_branches) +- `.sync` drives **13** defensive type/nil decisions across 12 method(s) + - `src/annotator-helpers/function_analysis.rb:204` (resolve_call) ; `src/annotator-helpers/generic_analysis.rb:453` (generic_type_has_capabilities?) ; `src/annotator-helpers/pipe_analysis.rb:1147` (collect_sharded_names) ; `src/annotator-helpers/pipe_analysis.rb:1170` (pre_scan_node_for_sharded) +- ...(+231 more) + ## Missing Abstractions (217) _guard tuple recomputed across >=2 decision units_ @@ -343,6 +400,6 @@ _co-called pair, one site does A without B -- *POSSIBLE* bug (noisy)_ ## Run Summary - Files analyzed: 93 -- Detectors: 10 (all shipped, self-tested) -- Total candidates: 10662 +- Detectors: 11 (all shipped, self-tested) +- Total candidates: 10918 - Method: stdlib AST only, intra-procedural, zero deps, no CFG / no points-to (see docs/agents/design.md) diff --git a/gems/decomplex/test/decision_pressure_test.rb b/gems/decomplex/test/decision_pressure_test.rb new file mode 100644 index 000000000..1b8dfe494 --- /dev/null +++ b/gems/decomplex/test/decision_pressure_test.rb @@ -0,0 +1,76 @@ +# frozen_string_literal: true + +require "minitest/autorun" +require "tempfile" +require_relative "../lib/decomplex" + +class DecisionPressureTest < Minitest::Test + def rank(ruby) + f = Tempfile.new(["dp", ".rb"]) + f.write(ruby) + f.close + Decomplex::DecisionPressure.scan([f.path]).ranked + ensure + f&.unlink + end + + def test_local_is_resolved_to_the_accessor_it_came_from + # Two methods, both `ti = node.full_type; ... ti.is_a?(Type)`. + # The proximate local `ti` must attribute to `.full_type`, and the + # contract must aggregate across both methods. + r = rank(<<~RB) + def a(node) + ti = node.full_type + return 1 if ti.is_a?(Type) + end + def b(node) + ti = node.full_type + return 2 if ti.is_a?(Type) + end + RB + top = r.first + assert_equal ".full_type", top[:contract] + assert_equal 2, top[:decisions] + assert_equal 2, top[:methods] + end + + def test_hash_key_and_ivar_contracts_are_distinct_and_ranked + r = rank(<<~RB) + def a(p) + return 1 if p[:type].is_a?(Type) + return 2 if p[:type].nil? + return 3 if @schema.respond_to?(:x) + end + RB + by = r.to_h { |x| [x[:contract], x[:decisions]] } + assert_equal 2, by["[:type]"] + assert_equal 1, by["@schema"] + end + + def test_safe_nav_counts_as_a_nil_decision_on_its_receiver + r = rank(<<~RB) + def a(node) + x = node.type_info&.heap? + return x + end + RB + assert_equal ".type_info", r.first[:contract] + assert_equal 1, r.first[:decisions] + end + + def test_unresolved_local_is_low_signal_and_sorts_last + r = rank(<<~RB) + def a(thing) + return 1 if thing.is_a?(Type) + return 2 if other.full_type.is_a?(Type) + end + RB + # .full_type is a named contract -> ranked above the ~local bucket + assert_equal ".full_type", r.first[:contract] + assert_equal "~local", r.last[:contract] + end + + def test_no_guards_no_rows + assert_empty rank("def a(x); return x + 1; end\n") + end +end From 57be1caadbdb39e1f3fc24dbeb12e71ef346199f Mon Sep 17 00:00:00 2001 From: Brian Yahn Date: Fri, 15 May 2026 23:08:40 +0000 Subject: [PATCH 07/45] gems/prick: categorical coverage-gap synthesis (the capstone) Promotes tools/branch_gap_triage from a one-off probe to a first-class gem (named `prick` -- it pricks holes in your codebase). A flat "673/2732 uncovered" is unactionable; prick categorizes every dark branch arm and overlays fix-churn so the actionable slice surfaces. OWNS the gap-categorization analysis (AST-structural per-arm classifier, dead/live decision split, categorical rollup). CONSUMES the sibling fix-cache gem for churn (require_relative, not re-derived) + an optional nil-kill verdict for type_norm. Boundary held: it aggregates, it does not re-implement. Categories: type_norm (confirm w/ nil-kill -> removable), dead (delete, complexity down), defensive (accept), ffi, diagnostic (negative spec), genuine (the real gap). New signal: genuine x fix-churn = "bugs highly likely HERE". Validated on src/mir/{mir_lowering,control_flow,escape_analysis}: 935 dark arms -> diagnostic 305, genuine 273, type_norm 229, dead 68, ffi 46, defensive 14. Bugs-likely #1 mir_lowering (187 genuine x churn 1.0); top sites hoist_alloc / owned_value_temp_needs_cleanup? -- the exact methods that produced B1-B4. The synthesis points at real bugs. Honest v0 caveats (documented in design.md): diagnostic over-greedy (subtree-wide raise), type_norm under-counted (no intra-proc local->accessor resolution yet). Shape + bug-likely join are sound; percentages are candidates to tighten. Self-tested 6/30/0 incl. a real stdlib-Coverage resultset integration + temp-git churn overlay. sorbet: ignore gems/prick/. Co-Authored-By: Claude Opus 4.7 --- gems/prick/README.md | 59 +++++++++ gems/prick/docs/agents/design.md | 71 +++++++++++ gems/prick/exe/prick | 44 +++++++ gems/prick/lib/prick.rb | 13 ++ gems/prick/lib/prick/classifier.rb | 191 +++++++++++++++++++++++++++++ gems/prick/lib/prick/report.rb | 71 +++++++++++ gems/prick/lib/prick/rollup.rb | 77 ++++++++++++ gems/prick/prick.gemspec | 25 ++++ gems/prick/report.md | 56 +++++++++ gems/prick/test/classifier_test.rb | 83 +++++++++++++ gems/prick/test/rollup_test.rb | 62 ++++++++++ sorbet/config | 1 + 12 files changed, 753 insertions(+) create mode 100644 gems/prick/README.md create mode 100644 gems/prick/docs/agents/design.md create mode 100644 gems/prick/exe/prick create mode 100644 gems/prick/lib/prick.rb create mode 100644 gems/prick/lib/prick/classifier.rb create mode 100644 gems/prick/lib/prick/report.rb create mode 100644 gems/prick/lib/prick/rollup.rb create mode 100644 gems/prick/prick.gemspec create mode 100644 gems/prick/report.md create mode 100644 gems/prick/test/classifier_test.rb create mode 100644 gems/prick/test/rollup_test.rb diff --git a/gems/prick/README.md b/gems/prick/README.md new file mode 100644 index 000000000..af454851b --- /dev/null +++ b/gems/prick/README.md @@ -0,0 +1,59 @@ +# prick: not all coverage gaps are equal. + + * A flat "673/2732 uncovered" is unactionable. prick categorizes + every dark branch arm and tells you which to delete, which to + accept, which nil-kill should resolve, and which are GENUINE gaps + where bugs are highly likely. + * The capstone over decomplex / fix-cache / nil-kill: it OWNS gap + categorization and CONSUMES fix-cache's churn (and an optional + nil-kill verdict). It re-derives nothing. + +## The categories + +| category | what to do | +|---|---| +| `type_norm` | likely removable — confirm with nil-kill (a typed contract kills the cluster) | +| `dead` | decision never executes — audit & delete (complexity down) | +| `defensive` | inert / invariant-pinned — accept, drop from the denominator | +| `ffi` | extern/require/module — a few targeted `.cht` | +| `diagnostic` | raises — one negative unit spec | +| `genuine` | the REAL gap — test it; in churn-hot code = **bug highly likely** | + +The headline signal: **`genuine` × fix-churn = "bugs highly likely +HERE"** — the small slice actually worth your time. + +## Usage + +``` +prick report --repo=. --coverage=coverage/.resultset.json \ + --output=report.md +prick report --files=src/mir/mir_lowering.rb # specific files +``` + +Needs `coverage/.resultset.json` (SimpleCov `enable_coverage :branch`) +and a git repo (for the fix-cache churn overlay). See +[report.md](report.md) for a demo over CLEAR's lowering passes. + +## What it found on CLEAR + +935 dark arms across the 3 lowering passes: only ~29% genuine, ~33% +diagnostic, ~24% type_norm (→ nil-kill), ~7% dead (→ delete). The +"bugs highly likely" #1 is `src/mir/mir_lowering.rb` (187 genuine × +top churn) — and its top sites are `hoist_alloc` / +`owned_value_temp_needs_cleanup?`, the exact methods that produced +real bugs B1–B4. The synthesis points at real bugs. + +## What it is NOT + + * Not a re-implementation. It consumes fix-cache; it does not compute + churn or type pressure itself. + * Not a verdict. Categories are ranked candidates (Engler + discipline). v0 precision caveats — `diagnostic` over-greedy, + `type_norm` under-counted (no intra-proc local resolution yet) — + are documented in [docs/agents/design.md](docs/agents/design.md). + The bug-likely join is the sound, validated part. + +## Links + + * [Design, categories, boundary, caveats](docs/agents/design.md) + * [Demo report](report.md) diff --git a/gems/prick/docs/agents/design.md b/gems/prick/docs/agents/design.md new file mode 100644 index 000000000..8109615cf --- /dev/null +++ b/gems/prick/docs/agents/design.md @@ -0,0 +1,71 @@ +# prick — design + +## Why this exists (and why it IS a gem) + +A flat "673/2732 uncovered" is unactionable: gaps are not equal. +prick is the **capstone** — it turns the raw coverage gap into a +prioritised, categorical answer. It was promoted from the one-off +`tools/branch_prick.rb` probe because it is a coherent, reusable, +versioned product with its own identity, exactly like fix-cache (which +is itself an aggregation gem). "It's an aggregation" is an argument to +*consume* the other tools, not against gem status. + +## Boundary + +OWNS the gap-categorization analysis (the per-arm classifier, the +dead/live decision split, the categorical rollup). CONSUMES +`fix-cache` (churn) via the sibling gem; CONSUMES an optional nil-kill +verdict for type_norm removability. Re-derives nothing. + +## Categories (the user's model: not all gaps equal) + +| category | meaning | action | +|---|---|---| +| `type_norm` | arm/decision guards a type/nil check (`is_a?`/`kind_of?`/`nil?`/`respond_to?`/safe-nav) | likely removable — CONFIRM with nil-kill; a typed contract kills the whole cluster | +| `dead` | no sibling arm of the decision ever taken: decision never executes | audit as dead code → delete (complexity down) | +| `defensive` | live decision, inert/invariant-pinned (empty else, `nil`) | accept + annotate, drop from denominator | +| `ffi` | extern/require/module boundary | a few targeted `.cht` | +| `diagnostic` | arm raises/diagnoses | one negative unit spec (fuzz cannot reach) | +| `genuine` | live, reachable, input-determined, none of the above | the REAL gap — test it | + +The one genuinely-new signal: **`genuine` arms × fix-cache churn = +"bugs highly likely HERE"** — the small actionable slice. + +## Classification is AST-structural + +Never a regex over the arm line. The SimpleCov parent tuple gives the +decision kind; the arm's `(line,col)` span is matched to an AST node; +the decision's CONDITION (parent node's first child — where a +type-guard lives) and the arm body are inspected. The FFI-boundary +method set and diagnostic message names are the only per-project +lexicon. + +## Honest v0 precision caveats (Engler discipline: ranked, refine) + +- `diagnostic` is **over-greedy**: it tags any arm whose subtree + contains `raise`/`fail`/`abort` *anywhere*, not "the arm IS + primarily a raise". Over-counts vs the older probe (305 vs 16). + Refinement: require the raise to be the arm's dominant outcome. +- `type_norm` is **under-counted**: the classifier does not do + decomplex's intra-procedural `local = recv.accessor` resolution, so + a guard on a local that came from `.type_info` is missed unless the + guard is syntactically on the accessor. Refinement: fold in + decomplex's local→contract resolution (consume, don't re-derive). +- Net: the *shape* and the bug-likely join are sound and empirically + validated (top genuine sites are the exact cleanup/ownership + methods that produced bugs B1–B4); the per-category percentages are + candidates to tighten, not verdicts. + +## Validated result (src/mir/{mir_lowering,control_flow,escape_analysis}) + +935 dark arms: diagnostic 305 (32.6%), genuine 273 (29.2%), type_norm +229 (24.5%), dead 68 (7.3%), ffi 46 (4.9%), defensive 14 (1.5%). +Bugs-highly-likely #1: `src/mir/mir_lowering.rb` — 187 genuine arms × +churn 1.0; top sites `hoist_alloc`, `owned_value_temp_needs_cleanup?` +— the exact methods behind B1–B4. The synthesis points at real bugs. + +## Self-tested + +`test/classifier_test.rb` (incl. a real stdlib-`Coverage` resultset +integration), `test/rollup_test.rb` (real temp git repo + churn +overlay). 6 runs / 30 assertions / 0 failures. diff --git a/gems/prick/exe/prick b/gems/prick/exe/prick new file mode 100644 index 000000000..8325f45e1 --- /dev/null +++ b/gems/prick/exe/prick @@ -0,0 +1,44 @@ +#!/usr/bin/env ruby +# frozen_string_literal: true + +require_relative "../lib/prick" + +def usage + warn <<~U + prick -- categorical coverage-gap synthesis + + prick report [--repo=.] [--coverage=coverage/.resultset.json] \\ + [--output=report.md] [--files=a.rb,b.rb] + + --files repo-relative .rb to triage; default: the src/mir lowering + passes (the fix-cache hotspots). + U + exit 1 +end + +usage if ARGV.empty? || %w[-h --help].include?(ARGV[0]) +usage unless ARGV[0] == "report" + +opts = { repo: ".", coverage: "coverage/.resultset.json", output: nil, + files: %w[src/mir/mir_lowering.rb src/mir/control_flow.rb + src/mir/escape_analysis.rb] } +ARGV[1..].each do |a| + case a + when /\A--repo=(.+)/ then opts[:repo] = Regexp.last_match(1) + when /\A--coverage=(.+)/ then opts[:coverage] = Regexp.last_match(1) + when /\A--output=(.+)/ then opts[:output] = Regexp.last_match(1) + when /\A--files=(.+)/ then opts[:files] = Regexp.last_match(1).split(",") + else usage + end +end + +md = Prick::Report.new( + files: opts[:files], repo: opts[:repo], resultset: opts[:coverage] +).to_markdown + +if opts[:output] + File.write(opts[:output], md) + warn "wrote #{opts[:output]}" +else + puts md +end diff --git a/gems/prick/lib/prick.rb b/gems/prick/lib/prick.rb new file mode 100644 index 000000000..7d64c6d41 --- /dev/null +++ b/gems/prick/lib/prick.rb @@ -0,0 +1,13 @@ +# frozen_string_literal: true + +require_relative "prick/classifier" +require_relative "prick/rollup" +require_relative "prick/report" + +# prick: categorical coverage-gap synthesis (the capstone). +# Owns the gap-categorization analysis; consumes the sibling fix-cache +# gem for churn and an optional nil-kill verdict for type_norm +# removability. See docs/agents/design.md. +module Prick + VERSION = "0.0.1" +end diff --git a/gems/prick/lib/prick/classifier.rb b/gems/prick/lib/prick/classifier.rb new file mode 100644 index 000000000..c72613d64 --- /dev/null +++ b/gems/prick/lib/prick/classifier.rb @@ -0,0 +1,191 @@ +# frozen_string_literal: true + +require "json" + +module Prick + # Classifies every never-taken branch arm in a target file into ONE + # actionable category. Ported + extended from tools/branch_prick + # (which was the one-off probe this gem promotes). AST-structural, + # never a regex over the arm line. + # + # Categories (the user's model -- not all gaps are equal): + # :type_norm arm guards a type/nil check (is_a?/kind_of?/nil?/ + # respond_to?/safe-nav). Likely removable -- the loose + # contract should be typed; CONFIRM with nil-kill. + # :dead no sibling arm of the decision is ever taken: the + # decision never executes. Dead/internal path -> audit + # for deletion (complexity down). + # :defensive live decision, inert/pinned polarity (empty else, + # nil, invariant-guaranteed). Accept + annotate. + # :ffi extern/require/module boundary -> targeted .cht. + # :diagnostic arm raises/diagnoses -> invalid-input only -> spec. + # :genuine live, reachable, input-determined, none of the above. + # The real gap. Overlaid with fix-churn = "bug-likely". + module Classifier + FFI_BOUNDARY = %w[ + build_extern_trampoline_call build_extern_trampoline_method + build_extern_trampoline_common lower_extern_direct_call + lower_require lower_module + ].freeze + DIAGNOSTIC_MIDS = %i[raise fail abort].freeze + GUARD_MIDS = %i[is_a? kind_of? instance_of? nil? respond_to?].freeze + + Arm = Struct.new(:file, :defn, :line, :category, keyword_init: true) + + module_function + + def merged_branches(resultset, abspath) + m = {} + JSON.parse(File.read(resultset)).each_value do |e| + (e["coverage"] || {}).each do |p, c| + next unless p == abspath && c.is_a?(Hash) && c["branches"] + + c["branches"].each do |par, arms| + d = (m[par] ||= Hash.new(0)) + arms.each { |a, n| d[a] = d[a] + (n || 0) } + end + end + end + m + end + + def method_index(lines) + idx = {} + stack = [] + lines.each_with_index do |raw, i| + ln = i + 1 + if (mm = raw.match(/^(\s*)def\s+(self\.)?([A-Za-z0-9_?!]+)/)) + ind = mm[1].length + stack.pop while stack.any? && stack.last[0] >= ind + stack.push([ind, mm[3], ln]) + elsif (e = raw.match(/^(\s*)end\b/)) + ind = e[1].length + stack.pop if stack.any? && stack.last[0] == ind + end + idx[ln] = stack.last ? stack.last[1] : "(top-level)" + end + idx + end + + def ast_nodes(abspath) + root = RubyVM::AbstractSyntaxTree.parse(File.read(abspath), keep_script_lines: true) + acc = [] + w = ->(n) { return unless n.is_a?(RubyVM::AbstractSyntaxTree::Node); acc << n; n.children.each { |c| w.call(c) } } + w.call(root) + acc + rescue SyntaxError, StandardError + [] + end + + def node_for(nodes, sl, sc, el, ec) + sp = ->(n) { [n.first_lineno, n.first_column, n.last_lineno, n.last_column] } + ex = nodes.find { |n| sp.call(n) == [sl, sc, el, ec] } + return ex if ex + + cov = nodes.select do |n| + a = sp.call(n) + (a[0] < sl || (a[0] == sl && a[1] <= sc)) && (a[2] > el || (a[2] == el && a[3] >= ec)) + end + cov.min_by { |n| (n.last_lineno - n.first_lineno) * 1000 + n.children.size } + end + + def subtree(node, types: nil, mids: nil) + st = [node] + until st.empty? + n = st.pop + next unless n.is_a?(RubyVM::AbstractSyntaxTree::Node) + return true if types&.include?(n.type) + + if mids && %i[CALL FCALL VCALL QCALL OPCALL].include?(n.type) + mid = n.children[%i[CALL OPCALL QCALL].include?(n.type) ? 1 : 0] + return true if mids.include?(mid) + return true if n.type == :QCALL # safe-nav = nil decision + end + n.children.each { |c| st << c } + end + false + end + + def trivial?(node) + return true if node.nil? + return true if node.type == :NIL + return true if node.type == :BEGIN && node.children.compact.empty? + return false if has_any_call?(node) + return false if subtree(node, types: %i[LASGN IASGN OP_ASGN ATTRASGN MASGN GASGN CVASGN RETURN NEXT BREAK YIELD]) + + !subtree(node, types: %i[LIT STR SYM INTEGER FLOAT LVAR IVAR DVAR CONST ARRAY HASH TRUE FALSE]) + end + + def has_any_call?(node) + subtree(node, types: %i[CALL FCALL VCALL OPCALL QCALL]) + end + + # -> [Arm, ...] for every dark arm in abspath. + def classify_file(resultset, abspath) + branches = merged_branches(resultset, abspath) + return [] if branches.empty? + + lines = File.readlines(abspath) + midx = method_index(lines) + nodes = ast_nodes(abspath) + out = [] + + branches.each do |parent, arms| + p = parent.gsub(/[\[\]:]/, "").split(",").map(&:strip) + pkind = p[0].to_sym + # The decision's CONDITION (where a type/nil guard lives) is the + # parent node's first child, not the dark arm's body. + pnode = node_for(nodes, p[2].to_i, p[3].to_i, p[4].to_i, p[5].to_i) + cond = if pnode && %i[IF UNLESS WHILE UNTIL CASE].include?(pnode.type) + pnode.children[0] + else + pnode + end + any_taken = arms.values.any? { |v| v.to_i.positive? } + arms.each do |arm, count| + next unless count.to_i.zero? + + a = arm.gsub(/[\[\]:]/, "").split(",").map(&:strip) + sl, sc, el, ec = a[2].to_i, a[3].to_i, a[4].to_i, a[5].to_i + meth = midx[sl] || "(top-level)" + anode = node_for(nodes, sl, sc, el, ec) + cat = categorize(meth, pkind, anode, any_taken, cond) + out << Arm.new(file: abspath, defn: meth, line: sl, category: cat) + end + end + out + end + + def categorize(method, pkind, anode, sibling_taken, cond = nil) + return :ffi if FFI_BOUNDARY.include?(method) + return :diagnostic if anode && subtree(anode, mids: DIAGNOSTIC_MIDS) + # type/nil guard family: check the decision's CONDITION and the + # arm body -> the decomplex DecisionPressure class. + return :type_norm if (cond && type_guard?(cond)) || (anode && type_guard?(anode)) + return :dead unless sibling_taken # decision never executes + return :defensive if trivial?(anode) + + if %i[case when & |].include?(pkind) || %i[if unless ternary while until for].include?(pkind) + :genuine + else + :defensive + end + end + + def type_guard?(node) + st = [node] + until st.empty? + n = st.pop + next unless n.is_a?(RubyVM::AbstractSyntaxTree::Node) + return true if n.type == :QCALL # x&.m : implicit nil decision + + if %i[CALL OPCALL].include?(n.type) && GUARD_MIDS.include?(n.children[1]) + return true + end + + n.children.each { |c| st << c } + end + false + end + end +end diff --git a/gems/prick/lib/prick/report.rb b/gems/prick/lib/prick/report.rb new file mode 100644 index 000000000..65067c94e --- /dev/null +++ b/gems/prick/lib/prick/report.rb @@ -0,0 +1,71 @@ +# frozen_string_literal: true + +require_relative "rollup" + +module Prick + # Markdown report, structured like decomplex / fix-cache / nil-kill. + class Report + def initialize(files:, repo:, resultset:) + @repo = repo + @r = Rollup.run(files: files, repo: repo, resultset: resultset) + end + + def to_markdown + o = +"# Prick Report\n\n" + o << "> Not all coverage gaps are equal. Every dark branch arm\n" \ + "> categorized; the GENUINE arms x fix-churn = where bugs\n" \ + "> are highly likely. Owns categorization; consumes\n" \ + "> fix-cache (churn). type_norm = confirm with nil-kill.\n\n" + + o << "## Table of Contents\n" + o << "- [Category Rollup](#category-rollup)\n" + o << "- [Bugs Highly Likely (#{@r[:bug_likely].size})](#bugs-highly-likely-#{@r[:bug_likely].size})\n" + o << "- [Per-File Breakdown](#per-file-breakdown)\n" + o << "- [Run Summary](#run-summary)\n\n" + + g = @r[:grand] + o << "## Category Rollup\n" + o << "_#{g} dark arms across #{@r[:per_file].size} file(s). " \ + "Most are NOT test targets:_\n\n" + o << "| category | arms | % | action |\n|---|---|---|---|\n" + Rollup::CATS.each do |c| + n = @r[:totals][c].to_i + pct = g.zero? ? 0 : (100.0 * n / g).round(1) + o << "| **#{c}** | #{n} | #{pct}% | #{Rollup::ACTION[c]} |\n" + end + o << "\n" + + o << "## Bugs Highly Likely (#{@r[:bug_likely].size})\n" + o << "_genuine reachable gaps in fix-churn-hot code -- triage " \ + "top-down; this is the actionable ~slice:_\n\n" + if @r[:bug_likely].empty? + o << "None.\n\n" + else + o << "| # | file | genuine arms | churn | score |\n|---|---|---|---|---|\n" + @r[:bug_likely].first(30).each_with_index do |h, i| + o << "| #{i + 1} | `#{h[:file]}` | #{h[:genuine]} | " \ + "#{h[:churn_norm]} | #{h[:score]} |\n" + end + o << "\n Top file's genuine sites:\n" + top = @r[:bug_likely].first + top[:sites].each { |s| o << " - #{s}\n" } + o << "\n" + end + + o << "## Per-File Breakdown\n\n" + o << "| file | total | type_norm | dead | defensive | genuine | ffi | diag |\n" + o << "|---|---|---|---|---|---|---|---|\n" + @r[:per_file].sort_by { |_, h| -h[:total] }.each do |f, h| + p = h[:pct] + o << "| `#{f}` | #{h[:total]} | #{p[:type_norm]}% | #{p[:dead]}% " \ + "| #{p[:defensive]}% | #{p[:genuine]}% | #{p[:ffi]}% | #{p[:diagnostic]}% |\n" + end + o << "\n## Run Summary\n" + o << "- Repo: `#{@repo}`\n" + o << "- Files triaged: #{@r[:per_file].size}; dark arms: #{g}\n" + o << "- Owns categorization; consumes fix-cache churn. type_norm " \ + "arms: confirm removable with nil-kill (see docs/agents/design.md)\n" + o + end + end +end diff --git a/gems/prick/lib/prick/rollup.rb b/gems/prick/lib/prick/rollup.rb new file mode 100644 index 000000000..d3306b462 --- /dev/null +++ b/gems/prick/lib/prick/rollup.rb @@ -0,0 +1,77 @@ +# frozen_string_literal: true + +require_relative "classifier" +# Consume the sibling fix-cache gem for the churn signal -- do NOT +# re-derive it (boundary discipline: own categorization, consume churn). +require_relative "../../../fix-cache/lib/fix_cache" + +module Prick + # Per-file categorical rollup + the one genuinely-new signal: + # genuine-reachable arms x fix-churn = "bugs highly likely HERE". + module Rollup + ACTION = { + type_norm: "type/nil guard -> likely removable; CONFIRM with nil-kill (typed contract kills the cluster)", + dead: "decision never executes -> audit as dead code, delete (complexity down)", + defensive: "inert / invariant-pinned -> accept + annotate, drop from denominator", + ffi: "extern/require/module -> a few targeted .cht", + diagnostic: "raises -> one negative unit spec (fuzz cannot reach)", + genuine: "REAL reachable gap -> test it; if in churn-hot code, bug-likely" + }.freeze + CATS = ACTION.keys.freeze + + module_function + + # files: repo-relative .rb paths to triage (e.g. the fix-cache + # hotspots). repo: absolute root. resultset: SimpleCov json. + # Returns { per_file:, totals:, bug_likely: }. + def run(files:, repo:, resultset:) + repo = File.realpath(repo) + churn = begin + FixCache::Bugspots.from_git(repo) + rescue StandardError + {} + end + mx = churn.values.max + mx = 1.0 if mx.nil? || mx.zero? + + per_file = {} + bug_likely = [] + files.each do |rel| + abs = File.join(repo, rel) + next unless File.exist?(abs) + + arms = Classifier.classify_file(resultset, abs) + next if arms.empty? + + counts = Hash.new(0) + arms.each { |a| counts[a.category] += 1 } + total = arms.size + cn = (churn[rel] || 0.0) / mx + per_file[rel] = { + total: total, + pct: CATS.to_h { |c| [c, total.zero? ? 0 : (100.0 * counts[c] / total).round(1)] }, + counts: counts, + churn_norm: cn.round(3) + } + # the new signal: genuine arms weighted by the file's fix-churn + gen = arms.select { |a| a.category == :genuine } + next if gen.empty? + + bug_likely << { + file: rel, genuine: gen.size, churn_norm: cn.round(3), + score: (gen.size * cn).round(3), + sites: gen.first(8).map { |a| "#{a.file}:#{a.defn}:#{a.line}" } + } + end + + totals = Hash.new(0) + per_file.each_value { |h| h[:counts].each { |c, n| totals[c] += n } } + { + per_file: per_file, + totals: totals, + grand: totals.values.sum, + bug_likely: bug_likely.sort_by { |h| -h[:score] } + } + end + end +end diff --git a/gems/prick/prick.gemspec b/gems/prick/prick.gemspec new file mode 100644 index 000000000..7cd375b57 --- /dev/null +++ b/gems/prick/prick.gemspec @@ -0,0 +1,25 @@ +# frozen_string_literal: true + +Gem::Specification.new do |s| + s.name = "prick" + s.version = "0.0.1" + s.summary = "Categorical coverage-gap synthesis: not all gaps are equal" + s.description = <<~DESC + The capstone. A flat "673/2732 uncovered" is unactionable because + gaps are not equal. prick classifies every dark branch arm by + category -- type-normalization (likely removable, confirm with + nil-kill), defensive/invariant-pinned (accept), dead-decision + (delete: complexity down), or GENUINE reachable gap -- then overlays + fix-churn so the genuine arms in churn-hot code surface as "bugs + highly likely HERE." It OWNS the gap-categorization analysis and + CONSUMES fix-cache (churn) + an optional nil-kill verdict; it does + not re-derive them. Promotes tools/branch_prick.rb to a + first-class product. Zero runtime deps beyond the sibling fix-cache. + DESC + s.authors = ["CLEAR"] + s.license = "MIT" + s.files = Dir["lib/**/*.rb", "exe/*"] + s.bindir = "exe" + s.executables = ["prick"] + s.required_ruby_version = ">= 3.1" +end diff --git a/gems/prick/report.md b/gems/prick/report.md new file mode 100644 index 000000000..2dc0e0cdf --- /dev/null +++ b/gems/prick/report.md @@ -0,0 +1,56 @@ +# Prick Report + +> Not all coverage gaps are equal. Every dark branch arm +> categorized; the GENUINE arms x fix-churn = where bugs +> are highly likely. Owns categorization; consumes +> fix-cache (churn). type_norm = confirm with nil-kill. + +## Table of Contents +- [Category Rollup](#category-rollup) +- [Bugs Highly Likely (3)](#bugs-highly-likely-3) +- [Per-File Breakdown](#per-file-breakdown) +- [Run Summary](#run-summary) + +## Category Rollup +_935 dark arms across 3 file(s). Most are NOT test targets:_ + +| category | arms | % | action | +|---|---|---|---| +| **type_norm** | 229 | 24.5% | type/nil guard -> likely removable; CONFIRM with nil-kill (typed contract kills the cluster) | +| **dead** | 68 | 7.3% | decision never executes -> audit as dead code, delete (complexity down) | +| **defensive** | 14 | 1.5% | inert / invariant-pinned -> accept + annotate, drop from denominator | +| **ffi** | 46 | 4.9% | extern/require/module -> a few targeted .cht | +| **diagnostic** | 305 | 32.6% | raises -> one negative unit spec (fuzz cannot reach) | +| **genuine** | 273 | 29.2% | REAL reachable gap -> test it; if in churn-hot code, bug-likely | + +## Bugs Highly Likely (3) +_genuine reachable gaps in fix-churn-hot code -- triage top-down; this is the actionable ~slice:_ + +| # | file | genuine arms | churn | score | +|---|---|---|---|---| +| 1 | `src/mir/mir_lowering.rb` | 187 | 1.0 | 187.0 | +| 2 | `src/mir/control_flow.rb` | 64 | 0.231 | 14.783 | +| 3 | `src/mir/escape_analysis.rb` | 22 | 0.142 | 3.128 | + + Top file's genuine sites: + - /home/yahn/cheat/src/mir/mir_lowering.rb:hoist_alloc:226 + - /home/yahn/cheat/src/mir/mir_lowering.rb:hoist_owned_value_temp:244 + - /home/yahn/cheat/src/mir/mir_lowering.rb:owned_value_temp_needs_cleanup?:255 + - /home/yahn/cheat/src/mir/mir_lowering.rb:owned_value_temp_needs_cleanup?:260 + - /home/yahn/cheat/src/mir/mir_lowering.rb:owned_value_temp_needs_cleanup?:261 + - /home/yahn/cheat/src/mir/mir_lowering.rb:container_borrow_expr?:270 + - /home/yahn/cheat/src/mir/mir_lowering.rb:copy_container_borrow_if_needed:289 + - /home/yahn/cheat/src/mir/mir_lowering.rb:copy_container_borrow_if_needed:290 + +## Per-File Breakdown + +| file | total | type_norm | dead | defensive | genuine | ffi | diag | +|---|---|---|---|---|---|---|---| +| `src/mir/mir_lowering.rb` | 653 | 20.5% | 6.3% | 2.0% | 28.6% | 7.0% | 35.5% | +| `src/mir/control_flow.rb` | 170 | 31.8% | 10.0% | 0.6% | 37.6% | 0.0% | 20.0% | +| `src/mir/escape_analysis.rb` | 112 | 36.6% | 8.9% | 0.0% | 19.6% | 0.0% | 34.8% | + +## Run Summary +- Repo: `.` +- Files triaged: 3; dark arms: 935 +- Owns categorization; consumes fix-cache churn. type_norm arms: confirm removable with nil-kill (see docs/agents/design.md) diff --git a/gems/prick/test/classifier_test.rb b/gems/prick/test/classifier_test.rb new file mode 100644 index 000000000..e532a0823 --- /dev/null +++ b/gems/prick/test/classifier_test.rb @@ -0,0 +1,83 @@ +# frozen_string_literal: true + +require "minitest/autorun" +require "tempfile" +require "json" +require "coverage" +require_relative "../lib/prick" + +class ClassifierTest < Minitest::Test + C = Prick::Classifier + + def node(expr) + RubyVM::AbstractSyntaxTree.parse(expr).children.last + end + + def test_type_guard_detects_is_a_nil_respond_and_safe_nav + assert C.type_guard?(node("x.is_a?(Type)")) + assert C.type_guard?(node("x.nil?")) + assert C.type_guard?(node("x.respond_to?(:y)")) + assert C.type_guard?(node("x&.foo")) + refute C.type_guard?(node("x + 1")) + refute C.type_guard?(node("x.bar(1)")) + end + + def test_trivial_is_the_narrow_inert_residue + assert C.trivial?(nil) + assert C.trivial?(node("nil")) + refute C.trivial?(node("foo(1)")) # a call + refute C.trivial?(node("return 5")) # an outcome + refute C.trivial?(node("x = 1")) # an assignment + end + + def test_categorize_priority_order + g = node("x.is_a?(Type)") + # FFI method name wins first + assert_equal :ffi, C.categorize("lower_require", :if, g, true) + # diagnostic (raise) before type_norm + assert_equal :diagnostic, C.categorize("m", :if, node("raise 'x'"), true) + # type_norm before dead/defensive + assert_equal :type_norm, C.categorize("m", :if, g, false) + # no sibling taken + not type/diag/ffi -> dead + assert_equal :dead, C.categorize("m", :if, node("foo(1)"), false) + # live + trivial -> defensive + assert_equal :defensive, C.categorize("m", :if, node("nil"), true) + # live + real body + branch kind -> genuine + assert_equal :genuine, C.categorize("m", :case, node("foo(1)"), true) + end + + # Real resultset via stdlib Coverage (same branch-tuple shape SimpleCov + # uses), so classify_file runs the true path on real dark arms. + def test_classify_file_on_real_coverage + src = <<~RB + def shape(x, n) + return 0 if x.is_a?(String) # type_norm (dark: never String) + if n > 0 + a = 1 + else + a = 2 # genuine-ish (dark else, sibling taken) + end + a + end + shape(7, 5) + RB + f = Tempfile.new(["cov", ".rb"]) + f.write(src) + f.close + Coverage.start(branches: true) + load f.path + res = Coverage.result + rs = { "T" => { "coverage" => { f.path => { "branches" => res.dig(f.path, :branches) } } } } + rsf = Tempfile.new(["rs", ".json"]) + rsf.write(JSON.dump(rs)) + rsf.close + + arms = C.classify_file(rsf.path, f.path) + cats = arms.map(&:category) + assert_includes cats, :type_norm, "the never-true String guard" + refute_empty arms + ensure + f&.unlink + rsf&.unlink + end +end diff --git a/gems/prick/test/rollup_test.rb b/gems/prick/test/rollup_test.rb new file mode 100644 index 000000000..a48b129c0 --- /dev/null +++ b/gems/prick/test/rollup_test.rb @@ -0,0 +1,62 @@ +# frozen_string_literal: true + +require "minitest/autorun" +require "tmpdir" +require "json" +require "coverage" +require "fileutils" +require_relative "../lib/prick" + +class RollupTest < Minitest::Test + def test_rollup_categorizes_and_surfaces_genuine_with_churn_overlay + Dir.mktmpdir do |dir| + FileUtils.mkdir_p("#{dir}/src") + src = <<~RB + def shape(x, n) + return 0 if x.is_a?(String) + case n + when 1 then 10 + when 2 then 20 + else 30 + end + end + shape(7, 1) + RB + path = "#{dir}/src/m.rb" + File.write(path, src) + # real git repo so fix-cache churn is computable (no fix commit -> + # churn 0, score 0, but the genuine bucket still lists). + system("git", "-C", dir, "init", "-q", out: File::NULL, err: File::NULL) + system("git", "-C", dir, "config", "user.email", "t@t") + system("git", "-C", dir, "config", "user.name", "t") + system("git", "-C", dir, "add", "-A", out: File::NULL, err: File::NULL) + system("git", "-C", dir, "commit", "-qm", "add", out: File::NULL, err: File::NULL) + + Coverage.start(branches: true) + load path + res = Coverage.result + rs = { "T" => { "coverage" => { path => { "branches" => res.dig(path, :branches) } } } } + rsf = "#{dir}/.resultset.json" + File.write(rsf, JSON.dump(rs)) + + out = Prick::Rollup.run(files: ["src/m.rb"], repo: dir, resultset: rsf) + assert out[:per_file].key?("src/m.rb") + fh = out[:per_file]["src/m.rb"] + assert fh[:total].positive?, "should find dark arms" + assert(fh[:counts][:type_norm].positive?, "the never-String is_a? guard") + assert_equal fh[:total], fh[:counts].values.sum, "every arm categorized" + assert_equal out[:grand], out[:totals].values.sum + end + end + + def test_missing_file_is_skipped_not_crashed + Dir.mktmpdir do |dir| + system("git", "-C", dir, "init", "-q", out: File::NULL, err: File::NULL) + File.write("#{dir}/rs.json", JSON.dump({ "T" => { "coverage" => {} } })) + out = Prick::Rollup.run(files: ["nope.rb"], repo: dir, + resultset: "#{dir}/rs.json") + assert_empty out[:per_file] + assert_empty out[:bug_likely] + end + end +end diff --git a/sorbet/config b/sorbet/config index 14c784d5b..0acb1c403 100644 --- a/sorbet/config +++ b/sorbet/config @@ -12,6 +12,7 @@ --ignore=docs/ --ignore=tools/ --ignore=gems/decomplex/ +--ignore=gems/prick/ --ignore=gems/fix-cache/ --ignore=gems/nil-kill/ From cb4e2545e89eed50c556952442f70bd10922db82 Mon Sep 17 00:00:00 2001 From: Brian Yahn Date: Fri, 15 May 2026 23:20:30 +0000 Subject: [PATCH 08/45] =?UTF-8?q?prick:=20general=20engine=20=E2=80=94=20t?= =?UTF-8?q?op=20true=20gaps=20ranked=20by=20fix-churn?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Addresses three valid critiques: 1. Repo-relative + linked paths. Was absolute (/home/yahn/cheat/...); now [src/x.rb:226](src/x.rb#L226). 2. Report leads with the actionable artifact. Dropped the unhelpful per-file %-table; the headline is now "Top True Gaps (N) -- test these, ranked by fix-churn": every genuine reachable arm, linked, sorted by the file's fix-cache churn score. Compact category summary follows as context. 3. General gem, no baked-in repo lexicon/jargon. Category action text is now testing-strategy-neutral (no .cht / fuzz / nil-kill). The FFI/external-boundary lexicon ships EMPTY in the gem and is caller-supplied via --ffi (CLEAR's set lives in exe/prick, not the library). DIAGNOSTIC_MIDS is general Ruby. The engine (categorize uncovered branches, rank genuine by consumed fix-cache churn) is general to any Ruby project. classifier: ffi_boundary injected (kwarg, default []); doc comment de-jargoned + rename-mangled history ref removed. rollup: emits top_gaps (genuine arms ranked by churn) instead of file-level bug_likely. README/design.md rewritten generic + caveats kept. Tests updated for new signatures/shape; 6/30/0. report.md regenerated (Top True Gaps headline, linked paths). Co-Authored-By: Claude Opus 4.7 --- gems/prick/README.md | 83 ++++++++++---------- gems/prick/docs/agents/design.md | 122 ++++++++++++++++------------- gems/prick/exe/prick | 23 ++++-- gems/prick/lib/prick/classifier.rb | 44 +++++------ gems/prick/lib/prick/report.rb | 80 ++++++++----------- gems/prick/lib/prick/rollup.rb | 61 +++++++-------- gems/prick/report.md | 114 ++++++++++++++++----------- gems/prick/test/classifier_test.rb | 2 +- gems/prick/test/rollup_test.rb | 2 +- 9 files changed, 281 insertions(+), 250 deletions(-) diff --git a/gems/prick/README.md b/gems/prick/README.md index af454851b..8096f8612 100644 --- a/gems/prick/README.md +++ b/gems/prick/README.md @@ -1,57 +1,58 @@ -# prick: not all coverage gaps are equal. +# prick: pricks holes in your codebase. - * A flat "673/2732 uncovered" is unactionable. prick categorizes - every dark branch arm and tells you which to delete, which to - accept, which nil-kill should resolve, and which are GENUINE gaps - where bugs are highly likely. - * The capstone over decomplex / fix-cache / nil-kill: it OWNS gap - categorization and CONSUMES fix-cache's churn (and an optional - nil-kill verdict). It re-derives nothing. +A flat "673/2732 uncovered" is unactionable. **prick** categorizes +every dark branch arm and gives you the one thing you want: **the top +true gaps to test, ranked by fix-churn.** -## The categories +It is a **general engine** — it categorizes uncovered branches and +ranks the genuine ones by consumed fix-cache churn. It ships **no +project lexicon**; the only project-specific input (your +external/boundary method names) is caller-supplied via `--ffi`. -| category | what to do | -|---|---| -| `type_norm` | likely removable — confirm with nil-kill (a typed contract kills the cluster) | -| `dead` | decision never executes — audit & delete (complexity down) | -| `defensive` | inert / invariant-pinned — accept, drop from the denominator | -| `ffi` | extern/require/module — a few targeted `.cht` | -| `diagnostic` | raises — one negative unit spec | -| `genuine` | the REAL gap — test it; in churn-hot code = **bug highly likely** | +## The report + +1. **Top True Gaps** — every genuine reachable gap, repo-relative + + linked, ranked by the file's fix-cache churn score. This is the + list: "test these, in this order." +2. **Category Summary** — the rest of the dark arms, so you can see + why most are *not* test targets: -The headline signal: **`genuine` × fix-churn = "bugs highly likely -HERE"** — the small slice actually worth your time. +| category | meaning | +|---|---| +| `type_norm` | type/nil guard — likely dead if the contract were strictly typed | +| `dead` | decision never executes — audit as dead code | +| `defensive` | inert / invariant-pinned — accept, exclude from denominator | +| `ffi` | a caller-declared external/boundary call — needs an integration test | +| `diagnostic` | error/raise path — reachable only by invalid input | +| `genuine` | the real reachable gap — **test it** (these are ranked above) | ## Usage ``` prick report --repo=. --coverage=coverage/.resultset.json \ - --output=report.md -prick report --files=src/mir/mir_lowering.rb # specific files + --output=report.md \ + --files=src/a.rb,src/b.rb \ + --ffi=my_extern_call,my_boundary_method ``` Needs `coverage/.resultset.json` (SimpleCov `enable_coverage :branch`) and a git repo (for the fix-cache churn overlay). See -[report.md](report.md) for a demo over CLEAR's lowering passes. - -## What it found on CLEAR - -935 dark arms across the 3 lowering passes: only ~29% genuine, ~33% -diagnostic, ~24% type_norm (→ nil-kill), ~7% dead (→ delete). The -"bugs highly likely" #1 is `src/mir/mir_lowering.rb` (187 genuine × -top churn) — and its top sites are `hoist_alloc` / -`owned_value_temp_needs_cleanup?`, the exact methods that produced -real bugs B1–B4. The synthesis points at real bugs. - -## What it is NOT - - * Not a re-implementation. It consumes fix-cache; it does not compute - churn or type pressure itself. - * Not a verdict. Categories are ranked candidates (Engler - discipline). v0 precision caveats — `diagnostic` over-greedy, - `type_norm` under-counted (no intra-proc local resolution yet) — - are documented in [docs/agents/design.md](docs/agents/design.md). - The bug-likely join is the sound, validated part. +[report.md](report.md) for a demo. + +## Boundary + +prick **owns** gap-categorization. It **consumes** the sibling +`fix-cache` gem for churn (it does not compute churn itself) and an +optional nil-kill verdict for the `type_norm` bucket. It re-derives +nothing. + +## Not a verdict + +Categories are ranked candidates (Engler discipline). v0 precision +caveats — `diagnostic` is over-greedy, `type_norm` under-counted (no +intra-procedural local→accessor resolution yet) — are documented in +[docs/agents/design.md](docs/agents/design.md). The Top-True-Gaps +ranking is the sound, validated part. ## Links diff --git a/gems/prick/docs/agents/design.md b/gems/prick/docs/agents/design.md index 8109615cf..508b81892 100644 --- a/gems/prick/docs/agents/design.md +++ b/gems/prick/docs/agents/design.md @@ -1,68 +1,78 @@ # prick — design -## Why this exists (and why it IS a gem) +## What it is (and why it is a general gem) -A flat "673/2732 uncovered" is unactionable: gaps are not equal. -prick is the **capstone** — it turns the raw coverage gap into a -prioritised, categorical answer. It was promoted from the one-off -`tools/branch_prick.rb` probe because it is a coherent, reusable, -versioned product with its own identity, exactly like fix-cache (which -is itself an aggregation gem). "It's an aggregation" is an argument to -*consume* the other tools, not against gem status. +A flat uncovered-count is unactionable: gaps are not equal. prick is +a **general engine**: it categorizes every dark branch arm by +reachability class and ranks the genuine ones by consumed fix-cache +churn — "the top true gaps to test, in order." It is a gem for the +same reason fix-cache is: a coherent, reusable, versioned product +with its own identity. "It's an aggregation" argues for *consuming* +the others, not against gem status. + +## Generality / no baked-in lexicon + +The earlier objection was correct: the first cut baked CLEAR jargon +(`.cht`, `fuzz`, `nil-kill`) and CLEAR's FFI method names into the +gem. Fixed: + +- **Vocabulary is generic.** Category actions are + testing-strategy-neutral ("error/raise path — invalid input only", + "external/boundary — integration test"). No project jargon. +- **The project lexicon is caller-supplied.** `ffi_boundary:` (the + external/boundary method names) defaults to empty in the gem; the + consuming project passes its own (CLEAR's set lives in the CLI + `exe/prick`, not the library). `DIAGNOSTIC_MIDS` is general Ruby + (`raise`/`fail`/`abort`). + +The *engine* — categorize uncovered branches, rank genuine by churn — +is general to any Ruby project with branch coverage + git history. ## Boundary -OWNS the gap-categorization analysis (the per-arm classifier, the -dead/live decision split, the categorical rollup). CONSUMES -`fix-cache` (churn) via the sibling gem; CONSUMES an optional nil-kill -verdict for type_norm removability. Re-derives nothing. +OWNS gap-categorization (AST-structural per-arm classifier, dead/live +decision split, category rollup, the gap ranking). CONSUMES the +sibling `fix-cache` gem for churn (require_relative, not re-derived) +and an optional nil-kill verdict for `type_norm`. Re-derives nothing. -## Categories (the user's model: not all gaps equal) +## Categories -| category | meaning | action | +| category | meaning | not a test target? | |---|---|---| -| `type_norm` | arm/decision guards a type/nil check (`is_a?`/`kind_of?`/`nil?`/`respond_to?`/safe-nav) | likely removable — CONFIRM with nil-kill; a typed contract kills the whole cluster | -| `dead` | no sibling arm of the decision ever taken: decision never executes | audit as dead code → delete (complexity down) | -| `defensive` | live decision, inert/invariant-pinned (empty else, `nil`) | accept + annotate, drop from denominator | -| `ffi` | extern/require/module boundary | a few targeted `.cht` | -| `diagnostic` | arm raises/diagnoses | one negative unit spec (fuzz cannot reach) | -| `genuine` | live, reachable, input-determined, none of the above | the REAL gap — test it | - -The one genuinely-new signal: **`genuine` arms × fix-cache churn = -"bugs highly likely HERE"** — the small actionable slice. - -## Classification is AST-structural - -Never a regex over the arm line. The SimpleCov parent tuple gives the -decision kind; the arm's `(line,col)` span is matched to an AST node; -the decision's CONDITION (parent node's first child — where a -type-guard lives) and the arm body are inspected. The FFI-boundary -method set and diagnostic message names are the only per-project -lexicon. - -## Honest v0 precision caveats (Engler discipline: ranked, refine) - -- `diagnostic` is **over-greedy**: it tags any arm whose subtree - contains `raise`/`fail`/`abort` *anywhere*, not "the arm IS - primarily a raise". Over-counts vs the older probe (305 vs 16). - Refinement: require the raise to be the arm's dominant outcome. -- `type_norm` is **under-counted**: the classifier does not do - decomplex's intra-procedural `local = recv.accessor` resolution, so - a guard on a local that came from `.type_info` is missed unless the - guard is syntactically on the accessor. Refinement: fold in - decomplex's local→contract resolution (consume, don't re-derive). -- Net: the *shape* and the bug-likely join are sound and empirically - validated (top genuine sites are the exact cleanup/ownership - methods that produced bugs B1–B4); the per-category percentages are - candidates to tighten, not verdicts. - -## Validated result (src/mir/{mir_lowering,control_flow,escape_analysis}) - -935 dark arms: diagnostic 305 (32.6%), genuine 273 (29.2%), type_norm -229 (24.5%), dead 68 (7.3%), ffi 46 (4.9%), defensive 14 (1.5%). -Bugs-highly-likely #1: `src/mir/mir_lowering.rb` — 187 genuine arms × -churn 1.0; top sites `hoist_alloc`, `owned_value_temp_needs_cleanup?` -— the exact methods behind B1–B4. The synthesis points at real bugs. +| `type_norm` | type/nil guard (`is_a?`/`kind_of?`/`nil?`/`respond_to?`/safe-nav) | yes — likely dead if the contract were strictly typed | +| `dead` | no sibling arm ever taken: decision never executes | yes — audit/delete | +| `defensive` | live, inert/invariant-pinned | yes — accept | +| `ffi` | caller-declared external/boundary method | special — integration test | +| `diagnostic` | arm raises/diagnoses | special — invalid-input only | +| `genuine` | live, reachable, input-determined | **NO — this is the gap; ranked by churn** | + +## Report shape (per the user's ask) + +Leads with **Top True Gaps**: every `genuine` arm, repo-relative + +markdown-linked (`[src/x.rb:226](src/x.rb#L226)`), ranked by the +file's normalized fix-cache churn. Then a compact category summary +(not a per-file %-table — that was unhelpful). The actionable list +is the headline; the rest is context. + +## AST-structural, never a line regex + +SimpleCov parent tuple → decision kind; arm `(line,col)` span → AST +node; the decision's *condition* (parent first child, where a +type-guard lives) and the arm body are inspected. + +## Honest v0 precision caveats (Engler: ranked, refine) + +- `diagnostic` over-greedy: tags any arm whose subtree contains + `raise`/`fail`/`abort` anywhere, not "the arm IS primarily a + raise." Over-counts. Refine: require it to be the dominant outcome. +- `type_norm` under-counted: no intra-procedural `local = + recv.accessor` resolution, so a guard on a local sourced from + `.type_info` is missed unless syntactically on the accessor. Refine + by consuming decomplex's local→contract resolution (don't re-derive). +- The Top-True-Gaps ranking and the categorization *shape* are sound + and validated (top genuine sites are the exact cleanup/ownership + methods that produced real bugs B1–B4); the per-category + percentages are candidates to tighten, not verdicts. ## Self-tested diff --git a/gems/prick/exe/prick b/gems/prick/exe/prick index 8325f45e1..b84f6c8f9 100644 --- a/gems/prick/exe/prick +++ b/gems/prick/exe/prick @@ -5,13 +5,15 @@ require_relative "../lib/prick" def usage warn <<~U - prick -- categorical coverage-gap synthesis + prick -- top true coverage gaps, ranked by fix-churn prick report [--repo=.] [--coverage=coverage/.resultset.json] \\ - [--output=report.md] [--files=a.rb,b.rb] + [--output=report.md] [--files=a.rb,b.rb] \\ + [--ffi=meth1,meth2] [--top=N] - --files repo-relative .rb to triage; default: the src/mir lowering - passes (the fix-cache hotspots). + --files repo-relative .rb to triage (default: CLEAR's lowering passes) + --ffi project external-boundary method names (the per-project + lexicon; the gem ships none -- it is general) U exit 1 end @@ -19,7 +21,15 @@ end usage if ARGV.empty? || %w[-h --help].include?(ARGV[0]) usage unless ARGV[0] == "report" +# CLEAR's project lexicon lives HERE (the caller), not in the gem. +CLEAR_FFI_BOUNDARY = %w[ + build_extern_trampoline_call build_extern_trampoline_method + build_extern_trampoline_common lower_extern_direct_call + lower_require lower_module +].freeze + opts = { repo: ".", coverage: "coverage/.resultset.json", output: nil, + top: 50, ffi: CLEAR_FFI_BOUNDARY, files: %w[src/mir/mir_lowering.rb src/mir/control_flow.rb src/mir/escape_analysis.rb] } ARGV[1..].each do |a| @@ -28,12 +38,15 @@ ARGV[1..].each do |a| when /\A--coverage=(.+)/ then opts[:coverage] = Regexp.last_match(1) when /\A--output=(.+)/ then opts[:output] = Regexp.last_match(1) when /\A--files=(.+)/ then opts[:files] = Regexp.last_match(1).split(",") + when /\A--ffi=(.+)/ then opts[:ffi] = Regexp.last_match(1).split(",") + when /\A--top=(\d+)/ then opts[:top] = Regexp.last_match(1).to_i else usage end end md = Prick::Report.new( - files: opts[:files], repo: opts[:repo], resultset: opts[:coverage] + files: opts[:files], repo: opts[:repo], resultset: opts[:coverage], + ffi_boundary: opts[:ffi], top: opts[:top] ).to_markdown if opts[:output] diff --git a/gems/prick/lib/prick/classifier.rb b/gems/prick/lib/prick/classifier.rb index c72613d64..2629195b6 100644 --- a/gems/prick/lib/prick/classifier.rb +++ b/gems/prick/lib/prick/classifier.rb @@ -4,30 +4,28 @@ module Prick # Classifies every never-taken branch arm in a target file into ONE - # actionable category. Ported + extended from tools/branch_prick - # (which was the one-off probe this gem promotes). AST-structural, - # never a regex over the arm line. + # actionable category. AST-structural, never a regex over the arm + # line. General -- no project lexicon baked in (see ffi_boundary:). # - # Categories (the user's model -- not all gaps are equal): - # :type_norm arm guards a type/nil check (is_a?/kind_of?/nil?/ - # respond_to?/safe-nav). Likely removable -- the loose - # contract should be typed; CONFIRM with nil-kill. + # Categories (not all gaps are equal): + # :type_norm arm/decision guards a type/nil check (is_a?/kind_of?/ + # nil?/respond_to?/safe-nav). Likely dead if the + # contract were strictly typed. # :dead no sibling arm of the decision is ever taken: the - # decision never executes. Dead/internal path -> audit - # for deletion (complexity down). + # decision never executes. Audit as dead code. # :defensive live decision, inert/pinned polarity (empty else, - # nil, invariant-guaranteed). Accept + annotate. - # :ffi extern/require/module boundary -> targeted .cht. - # :diagnostic arm raises/diagnoses -> invalid-input only -> spec. + # nil, invariant-guaranteed). Accept. + # :ffi a caller-declared external/boundary method -> needs + # an integration test. + # :diagnostic arm raises/diagnoses -> invalid-input only. # :genuine live, reachable, input-determined, none of the above. - # The real gap. Overlaid with fix-churn = "bug-likely". + # The real gap. Ranked by fix-churn downstream. module Classifier - FFI_BOUNDARY = %w[ - build_extern_trampoline_call build_extern_trampoline_method - build_extern_trampoline_common lower_extern_direct_call - lower_require lower_module - ].freeze - DIAGNOSTIC_MIDS = %i[raise fail abort].freeze + # The gem ships NO project lexicon -- it is general. The consuming + # project supplies its external/boundary method names via + # `ffi_boundary:` (CLEAR passes its set from the CLI). Empty here + # by design. + DIAGNOSTIC_MIDS = %i[raise fail abort].freeze # general Ruby GUARD_MIDS = %i[is_a? kind_of? instance_of? nil? respond_to?].freeze Arm = Struct.new(:file, :defn, :line, :category, keyword_init: true) @@ -121,7 +119,7 @@ def has_any_call?(node) end # -> [Arm, ...] for every dark arm in abspath. - def classify_file(resultset, abspath) + def classify_file(resultset, abspath, ffi_boundary: []) branches = merged_branches(resultset, abspath) return [] if branches.empty? @@ -149,15 +147,15 @@ def classify_file(resultset, abspath) sl, sc, el, ec = a[2].to_i, a[3].to_i, a[4].to_i, a[5].to_i meth = midx[sl] || "(top-level)" anode = node_for(nodes, sl, sc, el, ec) - cat = categorize(meth, pkind, anode, any_taken, cond) + cat = categorize(meth, pkind, anode, any_taken, cond, ffi_boundary) out << Arm.new(file: abspath, defn: meth, line: sl, category: cat) end end out end - def categorize(method, pkind, anode, sibling_taken, cond = nil) - return :ffi if FFI_BOUNDARY.include?(method) + def categorize(method, pkind, anode, sibling_taken, cond = nil, ffi_boundary = []) + return :ffi if ffi_boundary.include?(method) return :diagnostic if anode && subtree(anode, mids: DIAGNOSTIC_MIDS) # type/nil guard family: check the decision's CONDITION and the # arm body -> the decomplex DecisionPressure class. diff --git a/gems/prick/lib/prick/report.rb b/gems/prick/lib/prick/report.rb index 65067c94e..4abad1581 100644 --- a/gems/prick/lib/prick/report.rb +++ b/gems/prick/lib/prick/report.rb @@ -3,68 +3,56 @@ require_relative "rollup" module Prick - # Markdown report, structured like decomplex / fix-cache / nil-kill. + # Markdown report. Leads with the actionable artifact: the top true + # gaps, repo-relative + linked, ranked by fix-cache churn score. class Report - def initialize(files:, repo:, resultset:) + def initialize(files:, repo:, resultset:, ffi_boundary: [], top: 50) @repo = repo - @r = Rollup.run(files: files, repo: repo, resultset: resultset) + @top = top + @r = Rollup.run(files: files, repo: repo, resultset: resultset, + ffi_boundary: ffi_boundary) end def to_markdown - o = +"# Prick Report\n\n" - o << "> Not all coverage gaps are equal. Every dark branch arm\n" \ - "> categorized; the GENUINE arms x fix-churn = where bugs\n" \ - "> are highly likely. Owns categorization; consumes\n" \ - "> fix-cache (churn). type_norm = confirm with nil-kill.\n\n" - - o << "## Table of Contents\n" - o << "- [Category Rollup](#category-rollup)\n" - o << "- [Bugs Highly Likely (#{@r[:bug_likely].size})](#bugs-highly-likely-#{@r[:bug_likely].size})\n" - o << "- [Per-File Breakdown](#per-file-breakdown)\n" - o << "- [Run Summary](#run-summary)\n\n" - + gaps = @r[:top_gaps] g = @r[:grand] - o << "## Category Rollup\n" - o << "_#{g} dark arms across #{@r[:per_file].size} file(s). " \ - "Most are NOT test targets:_\n\n" - o << "| category | arms | % | action |\n|---|---|---|---|\n" - Rollup::CATS.each do |c| - n = @r[:totals][c].to_i - pct = g.zero? ? 0 : (100.0 * n / g).round(1) - o << "| **#{c}** | #{n} | #{pct}% | #{Rollup::ACTION[c]} |\n" - end - o << "\n" + o = +"# Prick Report\n\n" + o << "> Top true coverage gaps to test, ranked by fix-churn.\n" \ + "> Every dark branch arm is categorized; only the GENUINE\n" \ + "> reachable ones are gaps worth testing. Owns\n" \ + "> categorization; consumes fix-cache for churn.\n\n" - o << "## Bugs Highly Likely (#{@r[:bug_likely].size})\n" - o << "_genuine reachable gaps in fix-churn-hot code -- triage " \ - "top-down; this is the actionable ~slice:_\n\n" - if @r[:bug_likely].empty? + o << "## Top True Gaps (#{gaps.size}) — test these, ranked by fix-churn\n\n" + if gaps.empty? o << "None.\n\n" else - o << "| # | file | genuine arms | churn | score |\n|---|---|---|---|---|\n" - @r[:bug_likely].first(30).each_with_index do |h, i| - o << "| #{i + 1} | `#{h[:file]}` | #{h[:genuine]} | " \ - "#{h[:churn_norm]} | #{h[:score]} |\n" + o << "| # | gap | method | churn |\n|---|---|---|---|\n" + gaps.first(@top).each_with_index do |x, i| + link = "[`#{x[:file]}:#{x[:line]}`](#{x[:file]}#L#{x[:line]})" + o << "| #{i + 1} | #{link} | `#{x[:method]}` | #{x[:churn]} |\n" end - o << "\n Top file's genuine sites:\n" - top = @r[:bug_likely].first - top[:sites].each { |s| o << " - #{s}\n" } + o << "\n- ...(+#{gaps.size - @top} more genuine gaps)\n" if gaps.size > @top o << "\n" end - o << "## Per-File Breakdown\n\n" - o << "| file | total | type_norm | dead | defensive | genuine | ffi | diag |\n" - o << "|---|---|---|---|---|---|---|---|\n" - @r[:per_file].sort_by { |_, h| -h[:total] }.each do |f, h| - p = h[:pct] - o << "| `#{f}` | #{h[:total]} | #{p[:type_norm]}% | #{p[:dead]}% " \ - "| #{p[:defensive]}% | #{p[:genuine]}% | #{p[:ffi]}% | #{p[:diagnostic]}% |\n" + o << "## Category Summary\n" + o << "_#{g} dark arms; only #{gaps.size} are genuine gaps. " \ + "The rest are not test targets:_\n\n" + o << "| category | arms | % | what it means |\n|---|---|---|---|\n" + Rollup::CATS.each do |c| + n = @r[:totals][c].to_i + pct = g.zero? ? 0 : (100.0 * n / g).round(1) + o << "| #{c} | #{n} | #{pct}% | #{Rollup::ACTION[c]} |\n" end + o << "\n## Run Summary\n" o << "- Repo: `#{@repo}`\n" - o << "- Files triaged: #{@r[:per_file].size}; dark arms: #{g}\n" - o << "- Owns categorization; consumes fix-cache churn. type_norm " \ - "arms: confirm removable with nil-kill (see docs/agents/design.md)\n" + o << "- Files: #{@r[:per_file].size}; dark arms: #{g}; " \ + "genuine gaps: #{gaps.size}\n" + o << "- General engine: categorizes uncovered branches, ranks " \ + "genuine gaps by consumed fix-cache churn. Project lexicon " \ + "(external-boundary methods) is caller-supplied, not baked " \ + "in (see docs/agents/design.md).\n" o end end diff --git a/gems/prick/lib/prick/rollup.rb b/gems/prick/lib/prick/rollup.rb index d3306b462..453a93ece 100644 --- a/gems/prick/lib/prick/rollup.rb +++ b/gems/prick/lib/prick/rollup.rb @@ -2,29 +2,33 @@ require_relative "classifier" # Consume the sibling fix-cache gem for the churn signal -- do NOT -# re-derive it (boundary discipline: own categorization, consume churn). +# re-derive it (boundary: own categorization, consume churn). require_relative "../../../fix-cache/lib/fix_cache" module Prick - # Per-file categorical rollup + the one genuinely-new signal: - # genuine-reachable arms x fix-churn = "bugs highly likely HERE". + # Per-file categorical totals + the headline artifact: every GENUINE + # reachable gap, repo-relative, ranked by the file's fix-cache churn + # score. "Here are the top N true gaps to test." module Rollup + # Generic vocabulary -- NO repo jargon. Recommended-action text is + # testing-strategy-neutral; the consuming project decides what a + # "negative test" / "integration test" concretely is. ACTION = { - type_norm: "type/nil guard -> likely removable; CONFIRM with nil-kill (typed contract kills the cluster)", - dead: "decision never executes -> audit as dead code, delete (complexity down)", - defensive: "inert / invariant-pinned -> accept + annotate, drop from denominator", - ffi: "extern/require/module -> a few targeted .cht", - diagnostic: "raises -> one negative unit spec (fuzz cannot reach)", - genuine: "REAL reachable gap -> test it; if in churn-hot code, bug-likely" + type_norm: "type/nil guard -- likely dead if the contract were strictly typed", + dead: "decision never executes -- audit as dead code, delete", + defensive: "inert / invariant-pinned -- accept, exclude from denominator", + ffi: "external/boundary call -- needs an integration test", + diagnostic: "error/raise path -- reachable only by invalid input (negative test)", + genuine: "real reachable gap -- test it; ranked by fix-churn below" }.freeze CATS = ACTION.keys.freeze module_function - # files: repo-relative .rb paths to triage (e.g. the fix-cache - # hotspots). repo: absolute root. resultset: SimpleCov json. - # Returns { per_file:, totals:, bug_likely: }. - def run(files:, repo:, resultset:) + # files: repo-relative .rb paths. repo: absolute root. resultset: + # SimpleCov json. ffi_boundary: caller-supplied lexicon (the gem + # ships NONE -- it is general; the consuming repo provides its own). + def run(files:, repo:, resultset:, ffi_boundary: []) repo = File.realpath(repo) churn = begin FixCache::Bugspots.from_git(repo) @@ -35,33 +39,24 @@ def run(files:, repo:, resultset:) mx = 1.0 if mx.nil? || mx.zero? per_file = {} - bug_likely = [] + gaps = [] files.each do |rel| abs = File.join(repo, rel) next unless File.exist?(abs) - arms = Classifier.classify_file(resultset, abs) + arms = Classifier.classify_file(resultset, abs, ffi_boundary: ffi_boundary) next if arms.empty? counts = Hash.new(0) arms.each { |a| counts[a.category] += 1 } - total = arms.size - cn = (churn[rel] || 0.0) / mx - per_file[rel] = { - total: total, - pct: CATS.to_h { |c| [c, total.zero? ? 0 : (100.0 * counts[c] / total).round(1)] }, - counts: counts, - churn_norm: cn.round(3) - } - # the new signal: genuine arms weighted by the file's fix-churn - gen = arms.select { |a| a.category == :genuine } - next if gen.empty? + cn = ((churn[rel] || 0.0) / mx).round(4) + per_file[rel] = { total: arms.size, counts: counts, churn: cn } - bug_likely << { - file: rel, genuine: gen.size, churn_norm: cn.round(3), - score: (gen.size * cn).round(3), - sites: gen.first(8).map { |a| "#{a.file}:#{a.defn}:#{a.line}" } - } + arms.each do |a| + next unless a.category == :genuine + + gaps << { file: rel, line: a.line, method: a.defn, churn: cn } + end end totals = Hash.new(0) @@ -70,7 +65,9 @@ def run(files:, repo:, resultset:) per_file: per_file, totals: totals, grand: totals.values.sum, - bug_likely: bug_likely.sort_by { |h| -h[:score] } + # the headline: true gaps ranked by fix-cache score, then + # file/line for stable order. + top_gaps: gaps.sort_by { |g| [-g[:churn], g[:file], g[:line]] } } end end diff --git a/gems/prick/report.md b/gems/prick/report.md index 2dc0e0cdf..1449a1a80 100644 --- a/gems/prick/report.md +++ b/gems/prick/report.md @@ -1,56 +1,80 @@ # Prick Report -> Not all coverage gaps are equal. Every dark branch arm -> categorized; the GENUINE arms x fix-churn = where bugs -> are highly likely. Owns categorization; consumes -> fix-cache (churn). type_norm = confirm with nil-kill. +> Top true coverage gaps to test, ranked by fix-churn. +> Every dark branch arm is categorized; only the GENUINE +> reachable ones are gaps worth testing. Owns +> categorization; consumes fix-cache for churn. -## Table of Contents -- [Category Rollup](#category-rollup) -- [Bugs Highly Likely (3)](#bugs-highly-likely-3) -- [Per-File Breakdown](#per-file-breakdown) -- [Run Summary](#run-summary) +## Top True Gaps (273) — test these, ranked by fix-churn -## Category Rollup -_935 dark arms across 3 file(s). Most are NOT test targets:_ - -| category | arms | % | action | +| # | gap | method | churn | |---|---|---|---| -| **type_norm** | 229 | 24.5% | type/nil guard -> likely removable; CONFIRM with nil-kill (typed contract kills the cluster) | -| **dead** | 68 | 7.3% | decision never executes -> audit as dead code, delete (complexity down) | -| **defensive** | 14 | 1.5% | inert / invariant-pinned -> accept + annotate, drop from denominator | -| **ffi** | 46 | 4.9% | extern/require/module -> a few targeted .cht | -| **diagnostic** | 305 | 32.6% | raises -> one negative unit spec (fuzz cannot reach) | -| **genuine** | 273 | 29.2% | REAL reachable gap -> test it; if in churn-hot code, bug-likely | - -## Bugs Highly Likely (3) -_genuine reachable gaps in fix-churn-hot code -- triage top-down; this is the actionable ~slice:_ +| 1 | [`src/mir/mir_lowering.rb:226`](src/mir/mir_lowering.rb#L226) | `hoist_alloc` | 1.0 | +| 2 | [`src/mir/mir_lowering.rb:244`](src/mir/mir_lowering.rb#L244) | `hoist_owned_value_temp` | 1.0 | +| 3 | [`src/mir/mir_lowering.rb:255`](src/mir/mir_lowering.rb#L255) | `owned_value_temp_needs_cleanup?` | 1.0 | +| 4 | [`src/mir/mir_lowering.rb:260`](src/mir/mir_lowering.rb#L260) | `owned_value_temp_needs_cleanup?` | 1.0 | +| 5 | [`src/mir/mir_lowering.rb:261`](src/mir/mir_lowering.rb#L261) | `owned_value_temp_needs_cleanup?` | 1.0 | +| 6 | [`src/mir/mir_lowering.rb:270`](src/mir/mir_lowering.rb#L270) | `container_borrow_expr?` | 1.0 | +| 7 | [`src/mir/mir_lowering.rb:289`](src/mir/mir_lowering.rb#L289) | `copy_container_borrow_if_needed` | 1.0 | +| 8 | [`src/mir/mir_lowering.rb:290`](src/mir/mir_lowering.rb#L290) | `copy_container_borrow_if_needed` | 1.0 | +| 9 | [`src/mir/mir_lowering.rb:361`](src/mir/mir_lowering.rb#L361) | `cleanup_entry_for_heap_result` | 1.0 | +| 10 | [`src/mir/mir_lowering.rb:362`](src/mir/mir_lowering.rb#L362) | `cleanup_entry_for_heap_result` | 1.0 | +| 11 | [`src/mir/mir_lowering.rb:369`](src/mir/mir_lowering.rb#L369) | `cleanup_entry_for_heap_result` | 1.0 | +| 12 | [`src/mir/mir_lowering.rb:435`](src/mir/mir_lowering.rb#L435) | `lower` | 1.0 | +| 13 | [`src/mir/mir_lowering.rb:462`](src/mir/mir_lowering.rb#L462) | `lower` | 1.0 | +| 14 | [`src/mir/mir_lowering.rb:499`](src/mir/mir_lowering.rb#L499) | `lower` | 1.0 | +| 15 | [`src/mir/mir_lowering.rb:517`](src/mir/mir_lowering.rb#L517) | `lower_body` | 1.0 | +| 16 | [`src/mir/mir_lowering.rb:529`](src/mir/mir_lowering.rb#L529) | `lower_body` | 1.0 | +| 17 | [`src/mir/mir_lowering.rb:544`](src/mir/mir_lowering.rb#L544) | `lower_body_with_break` | 1.0 | +| 18 | [`src/mir/mir_lowering.rb:552`](src/mir/mir_lowering.rb#L552) | `lower_body_with_break` | 1.0 | +| 19 | [`src/mir/mir_lowering.rb:557`](src/mir/mir_lowering.rb#L557) | `lower_body_with_break` | 1.0 | +| 20 | [`src/mir/mir_lowering.rb:583`](src/mir/mir_lowering.rb#L583) | `lower_program` | 1.0 | +| 21 | [`src/mir/mir_lowering.rb:589`](src/mir/mir_lowering.rb#L589) | `lower_program` | 1.0 | +| 22 | [`src/mir/mir_lowering.rb:674`](src/mir/mir_lowering.rb#L674) | `alloc_expr` | 1.0 | +| 23 | [`src/mir/mir_lowering.rb:681`](src/mir/mir_lowering.rb#L681) | `alloc_from_sym` | 1.0 | +| 24 | [`src/mir/mir_lowering.rb:682`](src/mir/mir_lowering.rb#L682) | `alloc_from_sym` | 1.0 | +| 25 | [`src/mir/mir_lowering.rb:700`](src/mir/mir_lowering.rb#L700) | `coerce_stdlib_arg` | 1.0 | +| 26 | [`src/mir/mir_lowering.rb:709`](src/mir/mir_lowering.rb#L709) | `coerce_stdlib_arg` | 1.0 | +| 27 | [`src/mir/mir_lowering.rb:750`](src/mir/mir_lowering.rb#L750) | `resolve_alloc_sym` | 1.0 | +| 28 | [`src/mir/mir_lowering.rb:774`](src/mir/mir_lowering.rb#L774) | `alloc_zig_str` | 1.0 | +| 29 | [`src/mir/mir_lowering.rb:775`](src/mir/mir_lowering.rb#L775) | `alloc_zig_str` | 1.0 | +| 30 | [`src/mir/mir_lowering.rb:893`](src/mir/mir_lowering.rb#L893) | `resolve_decl_stdlib_alloc` | 1.0 | +| 31 | [`src/mir/mir_lowering.rb:915`](src/mir/mir_lowering.rb#L915) | `lower_promote` | 1.0 | +| 32 | [`src/mir/mir_lowering.rb:951`](src/mir/mir_lowering.rb#L951) | `lower_struct_def` | 1.0 | +| 33 | [`src/mir/mir_lowering.rb:1061`](src/mir/mir_lowering.rb#L1061) | `lower_union_def` | 1.0 | +| 34 | [`src/mir/mir_lowering.rb:1245`](src/mir/mir_lowering.rb#L1245) | `lower_function_def` | 1.0 | +| 35 | [`src/mir/mir_lowering.rb:1434`](src/mir/mir_lowering.rb#L1434) | `lower_function_def` | 1.0 | +| 36 | [`src/mir/mir_lowering.rb:1519`](src/mir/mir_lowering.rb#L1519) | `build_post_outer_fn` | 1.0 | +| 37 | [`src/mir/mir_lowering.rb:1528`](src/mir/mir_lowering.rb#L1528) | `build_post_outer_fn` | 1.0 | +| 38 | [`src/mir/mir_lowering.rb:1602`](src/mir/mir_lowering.rb#L1602) | `build_catch_clauses` | 1.0 | +| 39 | [`src/mir/mir_lowering.rb:1647`](src/mir/mir_lowering.rb#L1647) | `collect_catch_reassigns` | 1.0 | +| 40 | [`src/mir/mir_lowering.rb:1662`](src/mir/mir_lowering.rb#L1662) | `walk_catch_body_for_reassigns` | 1.0 | +| 41 | [`src/mir/mir_lowering.rb:1672`](src/mir/mir_lowering.rb#L1672) | `walk_catch_body_for_reassigns` | 1.0 | +| 42 | [`src/mir/mir_lowering.rb:1675`](src/mir/mir_lowering.rb#L1675) | `walk_catch_body_for_reassigns` | 1.0 | +| 43 | [`src/mir/mir_lowering.rb:1856`](src/mir/mir_lowering.rb#L1856) | `lower_method_call` | 1.0 | +| 44 | [`src/mir/mir_lowering.rb:1982`](src/mir/mir_lowering.rb#L1982) | `lower_intrinsic` | 1.0 | +| 45 | [`src/mir/mir_lowering.rb:2177`](src/mir/mir_lowering.rb#L2177) | `extern_call_args_zig` | 1.0 | +| 46 | [`src/mir/mir_lowering.rb:2274`](src/mir/mir_lowering.rb#L2274) | `lower_lambda` | 1.0 | +| 47 | [`src/mir/mir_lowering.rb:2361`](src/mir/mir_lowering.rb#L2361) | `lower_list_lit` | 1.0 | +| 48 | [`src/mir/mir_lowering.rb:2373`](src/mir/mir_lowering.rb#L2373) | `lower_list_lit` | 1.0 | +| 49 | [`src/mir/mir_lowering.rb:2379`](src/mir/mir_lowering.rb#L2379) | `lower_list_lit` | 1.0 | +| 50 | [`src/mir/mir_lowering.rb:2402`](src/mir/mir_lowering.rb#L2402) | `lower_hash_lit` | 1.0 | -| # | file | genuine arms | churn | score | -|---|---|---|---|---| -| 1 | `src/mir/mir_lowering.rb` | 187 | 1.0 | 187.0 | -| 2 | `src/mir/control_flow.rb` | 64 | 0.231 | 14.783 | -| 3 | `src/mir/escape_analysis.rb` | 22 | 0.142 | 3.128 | +- ...(+223 more genuine gaps) - Top file's genuine sites: - - /home/yahn/cheat/src/mir/mir_lowering.rb:hoist_alloc:226 - - /home/yahn/cheat/src/mir/mir_lowering.rb:hoist_owned_value_temp:244 - - /home/yahn/cheat/src/mir/mir_lowering.rb:owned_value_temp_needs_cleanup?:255 - - /home/yahn/cheat/src/mir/mir_lowering.rb:owned_value_temp_needs_cleanup?:260 - - /home/yahn/cheat/src/mir/mir_lowering.rb:owned_value_temp_needs_cleanup?:261 - - /home/yahn/cheat/src/mir/mir_lowering.rb:container_borrow_expr?:270 - - /home/yahn/cheat/src/mir/mir_lowering.rb:copy_container_borrow_if_needed:289 - - /home/yahn/cheat/src/mir/mir_lowering.rb:copy_container_borrow_if_needed:290 +## Category Summary +_935 dark arms; only 273 are genuine gaps. The rest are not test targets:_ -## Per-File Breakdown - -| file | total | type_norm | dead | defensive | genuine | ffi | diag | -|---|---|---|---|---|---|---|---| -| `src/mir/mir_lowering.rb` | 653 | 20.5% | 6.3% | 2.0% | 28.6% | 7.0% | 35.5% | -| `src/mir/control_flow.rb` | 170 | 31.8% | 10.0% | 0.6% | 37.6% | 0.0% | 20.0% | -| `src/mir/escape_analysis.rb` | 112 | 36.6% | 8.9% | 0.0% | 19.6% | 0.0% | 34.8% | +| category | arms | % | what it means | +|---|---|---|---| +| type_norm | 229 | 24.5% | type/nil guard -- likely dead if the contract were strictly typed | +| dead | 68 | 7.3% | decision never executes -- audit as dead code, delete | +| defensive | 14 | 1.5% | inert / invariant-pinned -- accept, exclude from denominator | +| ffi | 46 | 4.9% | external/boundary call -- needs an integration test | +| diagnostic | 305 | 32.6% | error/raise path -- reachable only by invalid input (negative test) | +| genuine | 273 | 29.2% | real reachable gap -- test it; ranked by fix-churn below | ## Run Summary - Repo: `.` -- Files triaged: 3; dark arms: 935 -- Owns categorization; consumes fix-cache churn. type_norm arms: confirm removable with nil-kill (see docs/agents/design.md) +- Files: 3; dark arms: 935; genuine gaps: 273 +- General engine: categorizes uncovered branches, ranks genuine gaps by consumed fix-cache churn. Project lexicon (external-boundary methods) is caller-supplied, not baked in (see docs/agents/design.md). diff --git a/gems/prick/test/classifier_test.rb b/gems/prick/test/classifier_test.rb index e532a0823..c2097b474 100644 --- a/gems/prick/test/classifier_test.rb +++ b/gems/prick/test/classifier_test.rb @@ -33,7 +33,7 @@ def test_trivial_is_the_narrow_inert_residue def test_categorize_priority_order g = node("x.is_a?(Type)") # FFI method name wins first - assert_equal :ffi, C.categorize("lower_require", :if, g, true) + assert_equal :ffi, C.categorize("lower_require", :if, g, true, nil, ["lower_require"]) # diagnostic (raise) before type_norm assert_equal :diagnostic, C.categorize("m", :if, node("raise 'x'"), true) # type_norm before dead/defensive diff --git a/gems/prick/test/rollup_test.rb b/gems/prick/test/rollup_test.rb index a48b129c0..55c27daf6 100644 --- a/gems/prick/test/rollup_test.rb +++ b/gems/prick/test/rollup_test.rb @@ -56,7 +56,7 @@ def test_missing_file_is_skipped_not_crashed out = Prick::Rollup.run(files: ["nope.rb"], repo: dir, resultset: "#{dir}/rs.json") assert_empty out[:per_file] - assert_empty out[:bug_likely] + assert_empty out[:top_gaps] end end end From 205e90d8373bb4b4ab9c87ec0fb108de34014328 Mon Sep 17 00:00:00 2001 From: Brian Yahn Date: Fri, 15 May 2026 23:25:00 +0000 Subject: [PATCH 09/45] prick: links resolve from the report's location, not the gem dir report.md lives at gems/prick/report.md, so `src/x.rb` resolved to the nonexistent gems/prick/src/x.rb. Report now computes the href relative to the OUTPUT file's directory (link_base = dirname of --output; defaults to repo root for stdout). Link is now ../../src/mir/mir_lowering.rb#L226 from gems/prick/report.md; display text stays the readable repo-relative path. Verified target resolves; 6/30/0. Co-Authored-By: Claude Opus 4.7 --- gems/prick/exe/prick | 4 +- gems/prick/lib/prick/report.rb | 19 +++++- gems/prick/report.md | 102 ++++++++++++++++----------------- 3 files changed, 70 insertions(+), 55 deletions(-) diff --git a/gems/prick/exe/prick b/gems/prick/exe/prick index b84f6c8f9..4990ab14b 100644 --- a/gems/prick/exe/prick +++ b/gems/prick/exe/prick @@ -46,7 +46,9 @@ end md = Prick::Report.new( files: opts[:files], repo: opts[:repo], resultset: opts[:coverage], - ffi_boundary: opts[:ffi], top: opts[:top] + ffi_boundary: opts[:ffi], top: opts[:top], + # links must resolve from wherever the report is written. + link_base: (opts[:output] ? File.dirname(opts[:output]) : nil) ).to_markdown if opts[:output] diff --git a/gems/prick/lib/prick/report.rb b/gems/prick/lib/prick/report.rb index 4abad1581..018b83cdd 100644 --- a/gems/prick/lib/prick/report.rb +++ b/gems/prick/lib/prick/report.rb @@ -1,18 +1,31 @@ # frozen_string_literal: true require_relative "rollup" +require "pathname" module Prick # Markdown report. Leads with the actionable artifact: the top true # gaps, repo-relative + linked, ranked by fix-cache churn score. class Report - def initialize(files:, repo:, resultset:, ffi_boundary: [], top: 50) - @repo = repo + # link_base: the directory the markdown will be SAVED in, so link + # hrefs resolve correctly (a report at gems/prick/report.md must + # link ../../src/x.rb, not src/x.rb). Defaults to repo root + # (correct for stdout / a root-level report). + def initialize(files:, repo:, resultset:, ffi_boundary: [], top: 50, + link_base: nil) + @repo = File.realpath(repo) @top = top + @link_root = Pathname.new(File.expand_path(link_base || @repo)) @r = Rollup.run(files: files, repo: repo, resultset: resultset, ffi_boundary: ffi_boundary) end + # href from the report's directory to a repo-relative source file. + def href(rel_file) + Pathname.new(File.join(@repo, rel_file)) + .relative_path_from(@link_root).to_s + end + def to_markdown gaps = @r[:top_gaps] g = @r[:grand] @@ -28,7 +41,7 @@ def to_markdown else o << "| # | gap | method | churn |\n|---|---|---|---|\n" gaps.first(@top).each_with_index do |x, i| - link = "[`#{x[:file]}:#{x[:line]}`](#{x[:file]}#L#{x[:line]})" + link = "[`#{x[:file]}:#{x[:line]}`](#{href(x[:file])}#L#{x[:line]})" o << "| #{i + 1} | #{link} | `#{x[:method]}` | #{x[:churn]} |\n" end o << "\n- ...(+#{gaps.size - @top} more genuine gaps)\n" if gaps.size > @top diff --git a/gems/prick/report.md b/gems/prick/report.md index 1449a1a80..96175bc1d 100644 --- a/gems/prick/report.md +++ b/gems/prick/report.md @@ -9,56 +9,56 @@ | # | gap | method | churn | |---|---|---|---| -| 1 | [`src/mir/mir_lowering.rb:226`](src/mir/mir_lowering.rb#L226) | `hoist_alloc` | 1.0 | -| 2 | [`src/mir/mir_lowering.rb:244`](src/mir/mir_lowering.rb#L244) | `hoist_owned_value_temp` | 1.0 | -| 3 | [`src/mir/mir_lowering.rb:255`](src/mir/mir_lowering.rb#L255) | `owned_value_temp_needs_cleanup?` | 1.0 | -| 4 | [`src/mir/mir_lowering.rb:260`](src/mir/mir_lowering.rb#L260) | `owned_value_temp_needs_cleanup?` | 1.0 | -| 5 | [`src/mir/mir_lowering.rb:261`](src/mir/mir_lowering.rb#L261) | `owned_value_temp_needs_cleanup?` | 1.0 | -| 6 | [`src/mir/mir_lowering.rb:270`](src/mir/mir_lowering.rb#L270) | `container_borrow_expr?` | 1.0 | -| 7 | [`src/mir/mir_lowering.rb:289`](src/mir/mir_lowering.rb#L289) | `copy_container_borrow_if_needed` | 1.0 | -| 8 | [`src/mir/mir_lowering.rb:290`](src/mir/mir_lowering.rb#L290) | `copy_container_borrow_if_needed` | 1.0 | -| 9 | [`src/mir/mir_lowering.rb:361`](src/mir/mir_lowering.rb#L361) | `cleanup_entry_for_heap_result` | 1.0 | -| 10 | [`src/mir/mir_lowering.rb:362`](src/mir/mir_lowering.rb#L362) | `cleanup_entry_for_heap_result` | 1.0 | -| 11 | [`src/mir/mir_lowering.rb:369`](src/mir/mir_lowering.rb#L369) | `cleanup_entry_for_heap_result` | 1.0 | -| 12 | [`src/mir/mir_lowering.rb:435`](src/mir/mir_lowering.rb#L435) | `lower` | 1.0 | -| 13 | [`src/mir/mir_lowering.rb:462`](src/mir/mir_lowering.rb#L462) | `lower` | 1.0 | -| 14 | [`src/mir/mir_lowering.rb:499`](src/mir/mir_lowering.rb#L499) | `lower` | 1.0 | -| 15 | [`src/mir/mir_lowering.rb:517`](src/mir/mir_lowering.rb#L517) | `lower_body` | 1.0 | -| 16 | [`src/mir/mir_lowering.rb:529`](src/mir/mir_lowering.rb#L529) | `lower_body` | 1.0 | -| 17 | [`src/mir/mir_lowering.rb:544`](src/mir/mir_lowering.rb#L544) | `lower_body_with_break` | 1.0 | -| 18 | [`src/mir/mir_lowering.rb:552`](src/mir/mir_lowering.rb#L552) | `lower_body_with_break` | 1.0 | -| 19 | [`src/mir/mir_lowering.rb:557`](src/mir/mir_lowering.rb#L557) | `lower_body_with_break` | 1.0 | -| 20 | [`src/mir/mir_lowering.rb:583`](src/mir/mir_lowering.rb#L583) | `lower_program` | 1.0 | -| 21 | [`src/mir/mir_lowering.rb:589`](src/mir/mir_lowering.rb#L589) | `lower_program` | 1.0 | -| 22 | [`src/mir/mir_lowering.rb:674`](src/mir/mir_lowering.rb#L674) | `alloc_expr` | 1.0 | -| 23 | [`src/mir/mir_lowering.rb:681`](src/mir/mir_lowering.rb#L681) | `alloc_from_sym` | 1.0 | -| 24 | [`src/mir/mir_lowering.rb:682`](src/mir/mir_lowering.rb#L682) | `alloc_from_sym` | 1.0 | -| 25 | [`src/mir/mir_lowering.rb:700`](src/mir/mir_lowering.rb#L700) | `coerce_stdlib_arg` | 1.0 | -| 26 | [`src/mir/mir_lowering.rb:709`](src/mir/mir_lowering.rb#L709) | `coerce_stdlib_arg` | 1.0 | -| 27 | [`src/mir/mir_lowering.rb:750`](src/mir/mir_lowering.rb#L750) | `resolve_alloc_sym` | 1.0 | -| 28 | [`src/mir/mir_lowering.rb:774`](src/mir/mir_lowering.rb#L774) | `alloc_zig_str` | 1.0 | -| 29 | [`src/mir/mir_lowering.rb:775`](src/mir/mir_lowering.rb#L775) | `alloc_zig_str` | 1.0 | -| 30 | [`src/mir/mir_lowering.rb:893`](src/mir/mir_lowering.rb#L893) | `resolve_decl_stdlib_alloc` | 1.0 | -| 31 | [`src/mir/mir_lowering.rb:915`](src/mir/mir_lowering.rb#L915) | `lower_promote` | 1.0 | -| 32 | [`src/mir/mir_lowering.rb:951`](src/mir/mir_lowering.rb#L951) | `lower_struct_def` | 1.0 | -| 33 | [`src/mir/mir_lowering.rb:1061`](src/mir/mir_lowering.rb#L1061) | `lower_union_def` | 1.0 | -| 34 | [`src/mir/mir_lowering.rb:1245`](src/mir/mir_lowering.rb#L1245) | `lower_function_def` | 1.0 | -| 35 | [`src/mir/mir_lowering.rb:1434`](src/mir/mir_lowering.rb#L1434) | `lower_function_def` | 1.0 | -| 36 | [`src/mir/mir_lowering.rb:1519`](src/mir/mir_lowering.rb#L1519) | `build_post_outer_fn` | 1.0 | -| 37 | [`src/mir/mir_lowering.rb:1528`](src/mir/mir_lowering.rb#L1528) | `build_post_outer_fn` | 1.0 | -| 38 | [`src/mir/mir_lowering.rb:1602`](src/mir/mir_lowering.rb#L1602) | `build_catch_clauses` | 1.0 | -| 39 | [`src/mir/mir_lowering.rb:1647`](src/mir/mir_lowering.rb#L1647) | `collect_catch_reassigns` | 1.0 | -| 40 | [`src/mir/mir_lowering.rb:1662`](src/mir/mir_lowering.rb#L1662) | `walk_catch_body_for_reassigns` | 1.0 | -| 41 | [`src/mir/mir_lowering.rb:1672`](src/mir/mir_lowering.rb#L1672) | `walk_catch_body_for_reassigns` | 1.0 | -| 42 | [`src/mir/mir_lowering.rb:1675`](src/mir/mir_lowering.rb#L1675) | `walk_catch_body_for_reassigns` | 1.0 | -| 43 | [`src/mir/mir_lowering.rb:1856`](src/mir/mir_lowering.rb#L1856) | `lower_method_call` | 1.0 | -| 44 | [`src/mir/mir_lowering.rb:1982`](src/mir/mir_lowering.rb#L1982) | `lower_intrinsic` | 1.0 | -| 45 | [`src/mir/mir_lowering.rb:2177`](src/mir/mir_lowering.rb#L2177) | `extern_call_args_zig` | 1.0 | -| 46 | [`src/mir/mir_lowering.rb:2274`](src/mir/mir_lowering.rb#L2274) | `lower_lambda` | 1.0 | -| 47 | [`src/mir/mir_lowering.rb:2361`](src/mir/mir_lowering.rb#L2361) | `lower_list_lit` | 1.0 | -| 48 | [`src/mir/mir_lowering.rb:2373`](src/mir/mir_lowering.rb#L2373) | `lower_list_lit` | 1.0 | -| 49 | [`src/mir/mir_lowering.rb:2379`](src/mir/mir_lowering.rb#L2379) | `lower_list_lit` | 1.0 | -| 50 | [`src/mir/mir_lowering.rb:2402`](src/mir/mir_lowering.rb#L2402) | `lower_hash_lit` | 1.0 | +| 1 | [`src/mir/mir_lowering.rb:226`](../../src/mir/mir_lowering.rb#L226) | `hoist_alloc` | 1.0 | +| 2 | [`src/mir/mir_lowering.rb:244`](../../src/mir/mir_lowering.rb#L244) | `hoist_owned_value_temp` | 1.0 | +| 3 | [`src/mir/mir_lowering.rb:255`](../../src/mir/mir_lowering.rb#L255) | `owned_value_temp_needs_cleanup?` | 1.0 | +| 4 | [`src/mir/mir_lowering.rb:260`](../../src/mir/mir_lowering.rb#L260) | `owned_value_temp_needs_cleanup?` | 1.0 | +| 5 | [`src/mir/mir_lowering.rb:261`](../../src/mir/mir_lowering.rb#L261) | `owned_value_temp_needs_cleanup?` | 1.0 | +| 6 | [`src/mir/mir_lowering.rb:270`](../../src/mir/mir_lowering.rb#L270) | `container_borrow_expr?` | 1.0 | +| 7 | [`src/mir/mir_lowering.rb:289`](../../src/mir/mir_lowering.rb#L289) | `copy_container_borrow_if_needed` | 1.0 | +| 8 | [`src/mir/mir_lowering.rb:290`](../../src/mir/mir_lowering.rb#L290) | `copy_container_borrow_if_needed` | 1.0 | +| 9 | [`src/mir/mir_lowering.rb:361`](../../src/mir/mir_lowering.rb#L361) | `cleanup_entry_for_heap_result` | 1.0 | +| 10 | [`src/mir/mir_lowering.rb:362`](../../src/mir/mir_lowering.rb#L362) | `cleanup_entry_for_heap_result` | 1.0 | +| 11 | [`src/mir/mir_lowering.rb:369`](../../src/mir/mir_lowering.rb#L369) | `cleanup_entry_for_heap_result` | 1.0 | +| 12 | [`src/mir/mir_lowering.rb:435`](../../src/mir/mir_lowering.rb#L435) | `lower` | 1.0 | +| 13 | [`src/mir/mir_lowering.rb:462`](../../src/mir/mir_lowering.rb#L462) | `lower` | 1.0 | +| 14 | [`src/mir/mir_lowering.rb:499`](../../src/mir/mir_lowering.rb#L499) | `lower` | 1.0 | +| 15 | [`src/mir/mir_lowering.rb:517`](../../src/mir/mir_lowering.rb#L517) | `lower_body` | 1.0 | +| 16 | [`src/mir/mir_lowering.rb:529`](../../src/mir/mir_lowering.rb#L529) | `lower_body` | 1.0 | +| 17 | [`src/mir/mir_lowering.rb:544`](../../src/mir/mir_lowering.rb#L544) | `lower_body_with_break` | 1.0 | +| 18 | [`src/mir/mir_lowering.rb:552`](../../src/mir/mir_lowering.rb#L552) | `lower_body_with_break` | 1.0 | +| 19 | [`src/mir/mir_lowering.rb:557`](../../src/mir/mir_lowering.rb#L557) | `lower_body_with_break` | 1.0 | +| 20 | [`src/mir/mir_lowering.rb:583`](../../src/mir/mir_lowering.rb#L583) | `lower_program` | 1.0 | +| 21 | [`src/mir/mir_lowering.rb:589`](../../src/mir/mir_lowering.rb#L589) | `lower_program` | 1.0 | +| 22 | [`src/mir/mir_lowering.rb:674`](../../src/mir/mir_lowering.rb#L674) | `alloc_expr` | 1.0 | +| 23 | [`src/mir/mir_lowering.rb:681`](../../src/mir/mir_lowering.rb#L681) | `alloc_from_sym` | 1.0 | +| 24 | [`src/mir/mir_lowering.rb:682`](../../src/mir/mir_lowering.rb#L682) | `alloc_from_sym` | 1.0 | +| 25 | [`src/mir/mir_lowering.rb:700`](../../src/mir/mir_lowering.rb#L700) | `coerce_stdlib_arg` | 1.0 | +| 26 | [`src/mir/mir_lowering.rb:709`](../../src/mir/mir_lowering.rb#L709) | `coerce_stdlib_arg` | 1.0 | +| 27 | [`src/mir/mir_lowering.rb:750`](../../src/mir/mir_lowering.rb#L750) | `resolve_alloc_sym` | 1.0 | +| 28 | [`src/mir/mir_lowering.rb:774`](../../src/mir/mir_lowering.rb#L774) | `alloc_zig_str` | 1.0 | +| 29 | [`src/mir/mir_lowering.rb:775`](../../src/mir/mir_lowering.rb#L775) | `alloc_zig_str` | 1.0 | +| 30 | [`src/mir/mir_lowering.rb:893`](../../src/mir/mir_lowering.rb#L893) | `resolve_decl_stdlib_alloc` | 1.0 | +| 31 | [`src/mir/mir_lowering.rb:915`](../../src/mir/mir_lowering.rb#L915) | `lower_promote` | 1.0 | +| 32 | [`src/mir/mir_lowering.rb:951`](../../src/mir/mir_lowering.rb#L951) | `lower_struct_def` | 1.0 | +| 33 | [`src/mir/mir_lowering.rb:1061`](../../src/mir/mir_lowering.rb#L1061) | `lower_union_def` | 1.0 | +| 34 | [`src/mir/mir_lowering.rb:1245`](../../src/mir/mir_lowering.rb#L1245) | `lower_function_def` | 1.0 | +| 35 | [`src/mir/mir_lowering.rb:1434`](../../src/mir/mir_lowering.rb#L1434) | `lower_function_def` | 1.0 | +| 36 | [`src/mir/mir_lowering.rb:1519`](../../src/mir/mir_lowering.rb#L1519) | `build_post_outer_fn` | 1.0 | +| 37 | [`src/mir/mir_lowering.rb:1528`](../../src/mir/mir_lowering.rb#L1528) | `build_post_outer_fn` | 1.0 | +| 38 | [`src/mir/mir_lowering.rb:1602`](../../src/mir/mir_lowering.rb#L1602) | `build_catch_clauses` | 1.0 | +| 39 | [`src/mir/mir_lowering.rb:1647`](../../src/mir/mir_lowering.rb#L1647) | `collect_catch_reassigns` | 1.0 | +| 40 | [`src/mir/mir_lowering.rb:1662`](../../src/mir/mir_lowering.rb#L1662) | `walk_catch_body_for_reassigns` | 1.0 | +| 41 | [`src/mir/mir_lowering.rb:1672`](../../src/mir/mir_lowering.rb#L1672) | `walk_catch_body_for_reassigns` | 1.0 | +| 42 | [`src/mir/mir_lowering.rb:1675`](../../src/mir/mir_lowering.rb#L1675) | `walk_catch_body_for_reassigns` | 1.0 | +| 43 | [`src/mir/mir_lowering.rb:1856`](../../src/mir/mir_lowering.rb#L1856) | `lower_method_call` | 1.0 | +| 44 | [`src/mir/mir_lowering.rb:1982`](../../src/mir/mir_lowering.rb#L1982) | `lower_intrinsic` | 1.0 | +| 45 | [`src/mir/mir_lowering.rb:2177`](../../src/mir/mir_lowering.rb#L2177) | `extern_call_args_zig` | 1.0 | +| 46 | [`src/mir/mir_lowering.rb:2274`](../../src/mir/mir_lowering.rb#L2274) | `lower_lambda` | 1.0 | +| 47 | [`src/mir/mir_lowering.rb:2361`](../../src/mir/mir_lowering.rb#L2361) | `lower_list_lit` | 1.0 | +| 48 | [`src/mir/mir_lowering.rb:2373`](../../src/mir/mir_lowering.rb#L2373) | `lower_list_lit` | 1.0 | +| 49 | [`src/mir/mir_lowering.rb:2379`](../../src/mir/mir_lowering.rb#L2379) | `lower_list_lit` | 1.0 | +| 50 | [`src/mir/mir_lowering.rb:2402`](../../src/mir/mir_lowering.rb#L2402) | `lower_hash_lit` | 1.0 | - ...(+223 more genuine gaps) @@ -75,6 +75,6 @@ _935 dark arms; only 273 are genuine gaps. The rest are not test targets:_ | genuine | 273 | 29.2% | real reachable gap -- test it; ranked by fix-churn below | ## Run Summary -- Repo: `.` +- Repo: `/home/yahn/cheat` - Files: 3; dark arms: 935; genuine gaps: 273 - General engine: categorizes uncovered branches, ranks genuine gaps by consumed fix-cache churn. Project lexicon (external-boundary methods) is caller-supplied, not baked in (see docs/agents/design.md). From 83c8d417ba7c649eb984f27513c3f66aea9f3c5d Mon Sep 17 00:00:00 2001 From: Brian Yahn Date: Sat, 16 May 2026 19:25:17 +0000 Subject: [PATCH 10/45] Rename the prick gem to slopcop (SlopCop) gems/prick -> gems/slopcop: directory, lib/, exe/, gemspec, module Prick -> SlopCop, require paths, CLI name, sorbet ignore entry, and README/design branding. Tests unchanged (6 runs / 30 assertions / 0 failures); CLI smoke-tested. Co-Authored-By: Claude Opus 4.7 --- gems/{prick => slopcop}/README.md | 8 ++++---- gems/{prick => slopcop}/docs/agents/design.md | 6 +++--- gems/{prick/exe/prick => slopcop/exe/slopcop} | 8 ++++---- gems/{prick/lib/prick.rb => slopcop/lib/slopcop.rb} | 10 +++++----- .../lib/prick => slopcop/lib/slopcop}/classifier.rb | 2 +- .../{prick/lib/prick => slopcop/lib/slopcop}/report.rb | 6 +++--- .../{prick/lib/prick => slopcop/lib/slopcop}/rollup.rb | 2 +- gems/{prick => slopcop}/report.md | 2 +- gems/{prick/prick.gemspec => slopcop/slopcop.gemspec} | 8 ++++---- gems/{prick => slopcop}/test/classifier_test.rb | 4 ++-- gems/{prick => slopcop}/test/rollup_test.rb | 6 +++--- sorbet/config | 2 +- 12 files changed, 32 insertions(+), 32 deletions(-) rename gems/{prick => slopcop}/README.md (89%) rename gems/{prick => slopcop}/docs/agents/design.md (95%) rename gems/{prick/exe/prick => slopcop/exe/slopcop} (90%) rename gems/{prick/lib/prick.rb => slopcop/lib/slopcop.rb} (56%) rename gems/{prick/lib/prick => slopcop/lib/slopcop}/classifier.rb (99%) rename gems/{prick/lib/prick => slopcop/lib/slopcop}/report.rb (95%) rename gems/{prick/lib/prick => slopcop/lib/slopcop}/rollup.rb (99%) rename gems/{prick => slopcop}/report.md (99%) rename gems/{prick/prick.gemspec => slopcop/slopcop.gemspec} (82%) rename gems/{prick => slopcop}/test/classifier_test.rb (97%) rename gems/{prick => slopcop}/test/rollup_test.rb (92%) diff --git a/gems/prick/README.md b/gems/slopcop/README.md similarity index 89% rename from gems/prick/README.md rename to gems/slopcop/README.md index 8096f8612..23e6aafcf 100644 --- a/gems/prick/README.md +++ b/gems/slopcop/README.md @@ -1,6 +1,6 @@ -# prick: pricks holes in your codebase. +# SlopCop: catches the slop your tests miss. -A flat "673/2732 uncovered" is unactionable. **prick** categorizes +A flat "673/2732 uncovered" is unactionable. **SlopCop** categorizes every dark branch arm and gives you the one thing you want: **the top true gaps to test, ranked by fix-churn.** @@ -29,7 +29,7 @@ external/boundary method names) is caller-supplied via `--ffi`. ## Usage ``` -prick report --repo=. --coverage=coverage/.resultset.json \ +slopcop report --repo=. --coverage=coverage/.resultset.json \ --output=report.md \ --files=src/a.rb,src/b.rb \ --ffi=my_extern_call,my_boundary_method @@ -41,7 +41,7 @@ and a git repo (for the fix-cache churn overlay). See ## Boundary -prick **owns** gap-categorization. It **consumes** the sibling +SlopCop **owns** gap-categorization. It **consumes** the sibling `fix-cache` gem for churn (it does not compute churn itself) and an optional nil-kill verdict for the `type_norm` bucket. It re-derives nothing. diff --git a/gems/prick/docs/agents/design.md b/gems/slopcop/docs/agents/design.md similarity index 95% rename from gems/prick/docs/agents/design.md rename to gems/slopcop/docs/agents/design.md index 508b81892..cfdd87bbb 100644 --- a/gems/prick/docs/agents/design.md +++ b/gems/slopcop/docs/agents/design.md @@ -1,8 +1,8 @@ -# prick — design +# SlopCop — design ## What it is (and why it is a general gem) -A flat uncovered-count is unactionable: gaps are not equal. prick is +A flat uncovered-count is unactionable: gaps are not equal. SlopCop is a **general engine**: it categorizes every dark branch arm by reachability class and ranks the genuine ones by consumed fix-cache churn — "the top true gaps to test, in order." It is a gem for the @@ -22,7 +22,7 @@ gem. Fixed: - **The project lexicon is caller-supplied.** `ffi_boundary:` (the external/boundary method names) defaults to empty in the gem; the consuming project passes its own (CLEAR's set lives in the CLI - `exe/prick`, not the library). `DIAGNOSTIC_MIDS` is general Ruby + `exe/slopcop`, not the library). `DIAGNOSTIC_MIDS` is general Ruby (`raise`/`fail`/`abort`). The *engine* — categorize uncovered branches, rank genuine by churn — diff --git a/gems/prick/exe/prick b/gems/slopcop/exe/slopcop similarity index 90% rename from gems/prick/exe/prick rename to gems/slopcop/exe/slopcop index 4990ab14b..ae9a63981 100644 --- a/gems/prick/exe/prick +++ b/gems/slopcop/exe/slopcop @@ -1,13 +1,13 @@ #!/usr/bin/env ruby # frozen_string_literal: true -require_relative "../lib/prick" +require_relative "../lib/slopcop" def usage warn <<~U - prick -- top true coverage gaps, ranked by fix-churn + slopcop -- top true coverage gaps, ranked by fix-churn - prick report [--repo=.] [--coverage=coverage/.resultset.json] \\ + slopcop report [--repo=.] [--coverage=coverage/.resultset.json] \\ [--output=report.md] [--files=a.rb,b.rb] \\ [--ffi=meth1,meth2] [--top=N] @@ -44,7 +44,7 @@ ARGV[1..].each do |a| end end -md = Prick::Report.new( +md = SlopCop::Report.new( files: opts[:files], repo: opts[:repo], resultset: opts[:coverage], ffi_boundary: opts[:ffi], top: opts[:top], # links must resolve from wherever the report is written. diff --git a/gems/prick/lib/prick.rb b/gems/slopcop/lib/slopcop.rb similarity index 56% rename from gems/prick/lib/prick.rb rename to gems/slopcop/lib/slopcop.rb index 7d64c6d41..5865aef46 100644 --- a/gems/prick/lib/prick.rb +++ b/gems/slopcop/lib/slopcop.rb @@ -1,13 +1,13 @@ # frozen_string_literal: true -require_relative "prick/classifier" -require_relative "prick/rollup" -require_relative "prick/report" +require_relative "slopcop/classifier" +require_relative "slopcop/rollup" +require_relative "slopcop/report" -# prick: categorical coverage-gap synthesis (the capstone). +# slopcop: categorical coverage-gap synthesis (the capstone). # Owns the gap-categorization analysis; consumes the sibling fix-cache # gem for churn and an optional nil-kill verdict for type_norm # removability. See docs/agents/design.md. -module Prick +module SlopCop VERSION = "0.0.1" end diff --git a/gems/prick/lib/prick/classifier.rb b/gems/slopcop/lib/slopcop/classifier.rb similarity index 99% rename from gems/prick/lib/prick/classifier.rb rename to gems/slopcop/lib/slopcop/classifier.rb index 2629195b6..b26aec2fa 100644 --- a/gems/prick/lib/prick/classifier.rb +++ b/gems/slopcop/lib/slopcop/classifier.rb @@ -2,7 +2,7 @@ require "json" -module Prick +module SlopCop # Classifies every never-taken branch arm in a target file into ONE # actionable category. AST-structural, never a regex over the arm # line. General -- no project lexicon baked in (see ffi_boundary:). diff --git a/gems/prick/lib/prick/report.rb b/gems/slopcop/lib/slopcop/report.rb similarity index 95% rename from gems/prick/lib/prick/report.rb rename to gems/slopcop/lib/slopcop/report.rb index 018b83cdd..ac1558999 100644 --- a/gems/prick/lib/prick/report.rb +++ b/gems/slopcop/lib/slopcop/report.rb @@ -3,12 +3,12 @@ require_relative "rollup" require "pathname" -module Prick +module SlopCop # Markdown report. Leads with the actionable artifact: the top true # gaps, repo-relative + linked, ranked by fix-cache churn score. class Report # link_base: the directory the markdown will be SAVED in, so link - # hrefs resolve correctly (a report at gems/prick/report.md must + # hrefs resolve correctly (a report at gems/slopcop/report.md must # link ../../src/x.rb, not src/x.rb). Defaults to repo root # (correct for stdout / a root-level report). def initialize(files:, repo:, resultset:, ffi_boundary: [], top: 50, @@ -29,7 +29,7 @@ def href(rel_file) def to_markdown gaps = @r[:top_gaps] g = @r[:grand] - o = +"# Prick Report\n\n" + o = +"# SlopCop Report\n\n" o << "> Top true coverage gaps to test, ranked by fix-churn.\n" \ "> Every dark branch arm is categorized; only the GENUINE\n" \ "> reachable ones are gaps worth testing. Owns\n" \ diff --git a/gems/prick/lib/prick/rollup.rb b/gems/slopcop/lib/slopcop/rollup.rb similarity index 99% rename from gems/prick/lib/prick/rollup.rb rename to gems/slopcop/lib/slopcop/rollup.rb index 453a93ece..29ff4e3d2 100644 --- a/gems/prick/lib/prick/rollup.rb +++ b/gems/slopcop/lib/slopcop/rollup.rb @@ -5,7 +5,7 @@ # re-derive it (boundary: own categorization, consume churn). require_relative "../../../fix-cache/lib/fix_cache" -module Prick +module SlopCop # Per-file categorical totals + the headline artifact: every GENUINE # reachable gap, repo-relative, ranked by the file's fix-cache churn # score. "Here are the top N true gaps to test." diff --git a/gems/prick/report.md b/gems/slopcop/report.md similarity index 99% rename from gems/prick/report.md rename to gems/slopcop/report.md index 96175bc1d..1502b0dff 100644 --- a/gems/prick/report.md +++ b/gems/slopcop/report.md @@ -1,4 +1,4 @@ -# Prick Report +# SlopCop Report > Top true coverage gaps to test, ranked by fix-churn. > Every dark branch arm is categorized; only the GENUINE diff --git a/gems/prick/prick.gemspec b/gems/slopcop/slopcop.gemspec similarity index 82% rename from gems/prick/prick.gemspec rename to gems/slopcop/slopcop.gemspec index 7cd375b57..6f42b919d 100644 --- a/gems/prick/prick.gemspec +++ b/gems/slopcop/slopcop.gemspec @@ -1,25 +1,25 @@ # frozen_string_literal: true Gem::Specification.new do |s| - s.name = "prick" + s.name = "slopcop" s.version = "0.0.1" s.summary = "Categorical coverage-gap synthesis: not all gaps are equal" s.description = <<~DESC The capstone. A flat "673/2732 uncovered" is unactionable because - gaps are not equal. prick classifies every dark branch arm by + gaps are not equal. SlopCop classifies every dark branch arm by category -- type-normalization (likely removable, confirm with nil-kill), defensive/invariant-pinned (accept), dead-decision (delete: complexity down), or GENUINE reachable gap -- then overlays fix-churn so the genuine arms in churn-hot code surface as "bugs highly likely HERE." It OWNS the gap-categorization analysis and CONSUMES fix-cache (churn) + an optional nil-kill verdict; it does - not re-derive them. Promotes tools/branch_prick.rb to a + not re-derive them. Promotes the branch-gap triage probe to a first-class product. Zero runtime deps beyond the sibling fix-cache. DESC s.authors = ["CLEAR"] s.license = "MIT" s.files = Dir["lib/**/*.rb", "exe/*"] s.bindir = "exe" - s.executables = ["prick"] + s.executables = ["slopcop"] s.required_ruby_version = ">= 3.1" end diff --git a/gems/prick/test/classifier_test.rb b/gems/slopcop/test/classifier_test.rb similarity index 97% rename from gems/prick/test/classifier_test.rb rename to gems/slopcop/test/classifier_test.rb index c2097b474..bc8ee220f 100644 --- a/gems/prick/test/classifier_test.rb +++ b/gems/slopcop/test/classifier_test.rb @@ -4,10 +4,10 @@ require "tempfile" require "json" require "coverage" -require_relative "../lib/prick" +require_relative "../lib/slopcop" class ClassifierTest < Minitest::Test - C = Prick::Classifier + C = SlopCop::Classifier def node(expr) RubyVM::AbstractSyntaxTree.parse(expr).children.last diff --git a/gems/prick/test/rollup_test.rb b/gems/slopcop/test/rollup_test.rb similarity index 92% rename from gems/prick/test/rollup_test.rb rename to gems/slopcop/test/rollup_test.rb index 55c27daf6..a38e664a1 100644 --- a/gems/prick/test/rollup_test.rb +++ b/gems/slopcop/test/rollup_test.rb @@ -5,7 +5,7 @@ require "json" require "coverage" require "fileutils" -require_relative "../lib/prick" +require_relative "../lib/slopcop" class RollupTest < Minitest::Test def test_rollup_categorizes_and_surfaces_genuine_with_churn_overlay @@ -39,7 +39,7 @@ def shape(x, n) rsf = "#{dir}/.resultset.json" File.write(rsf, JSON.dump(rs)) - out = Prick::Rollup.run(files: ["src/m.rb"], repo: dir, resultset: rsf) + out = SlopCop::Rollup.run(files: ["src/m.rb"], repo: dir, resultset: rsf) assert out[:per_file].key?("src/m.rb") fh = out[:per_file]["src/m.rb"] assert fh[:total].positive?, "should find dark arms" @@ -53,7 +53,7 @@ def test_missing_file_is_skipped_not_crashed Dir.mktmpdir do |dir| system("git", "-C", dir, "init", "-q", out: File::NULL, err: File::NULL) File.write("#{dir}/rs.json", JSON.dump({ "T" => { "coverage" => {} } })) - out = Prick::Rollup.run(files: ["nope.rb"], repo: dir, + out = SlopCop::Rollup.run(files: ["nope.rb"], repo: dir, resultset: "#{dir}/rs.json") assert_empty out[:per_file] assert_empty out[:top_gaps] diff --git a/sorbet/config b/sorbet/config index 0acb1c403..6264c77b8 100644 --- a/sorbet/config +++ b/sorbet/config @@ -12,7 +12,7 @@ --ignore=docs/ --ignore=tools/ --ignore=gems/decomplex/ ---ignore=gems/prick/ +--ignore=gems/slopcop/ --ignore=gems/fix-cache/ --ignore=gems/nil-kill/ From 0a885a9bda4929f4c9146c5c0d29cc1c94466eef Mon Sep 17 00:00:00 2001 From: Brian Yahn Date: Sat, 16 May 2026 19:32:07 +0000 Subject: [PATCH 11/45] deslop: collapse dead is_a?(Type) guards in Type#accepts_fn_type? other_type is sig-typed Type and nil-kill confirms the runtime producer is always Type, so both `other_type.is_a?(Type) &&` guards are provably dead. Removing them is behavior-preserving (nil-kill Union Decomplexity: "always Type: collapse, all 2 die"). Co-Authored-By: Claude Opus 4.7 --- src/ast/type.rb | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/src/ast/type.rb b/src/ast/type.rb index e90500d1f..de505af9e 100644 --- a/src/ast/type.rb +++ b/src/ast/type.rb @@ -1742,8 +1742,8 @@ def schema_union_any?(schema, &blk) # Structural match for function/lambda types. Called by accepts? when self.fn_type?. sig { params(other_type: Type).returns(T::Boolean) } def accepts_fn_type?(other_type) - return true if other_type.is_a?(Type) && other_type.any? - return false unless other_type.is_a?(Type) && other_type.fn_type? + return true if other_type.any? + return false unless other_type.fn_type? other_raw = other_type.raw self_params = @raw.params || [] From 9cadc12eabff3c75c1c1f824d29c64423e476124 Mon Sep 17 00:00:00 2001 From: Brian Yahn Date: Sat, 16 May 2026 19:34:44 +0000 Subject: [PATCH 12/45] deslop: drop dead ti coercion in MIRLowering#build_drop_entry! ti is sig-typed Type and nil-kill confirms the runtime producer is always Type, so `ti = Type.new(ti) if ti && !ti.is_a?(Type)` and `ti = nil unless ti.is_a?(Type)` are dead. Removing them is behavior-preserving (nil-kill: "always Type: collapse, all 2 die"). Co-Authored-By: Claude Opus 4.7 --- src/mir/mir_lowering.rb | 2 -- 1 file changed, 2 deletions(-) diff --git a/src/mir/mir_lowering.rb b/src/mir/mir_lowering.rb index 16ef48bbf..bb3a5dda9 100644 --- a/src/mir/mir_lowering.rb +++ b/src/mir/mir_lowering.rb @@ -848,8 +848,6 @@ def mir_cast(mir_node, from_type, to_type) # (TAKES params) to avoid deferring type resolution to the emitter. sig { params(entry: T::Hash[Symbol, T.untyped], ti: Type, source_node: T.nilable(AST::VarDecl)).returns(T.nilable(T::Boolean)) } def build_drop_entry!(entry, ti, source_node) - ti = Type.new(ti) if ti && !ti.is_a?(Type) - ti = nil unless ti.is_a?(Type) zig_type = case entry[:kind] when :heap_slice From 7b4b201a22822db58cc4711c8210ad73d148c27c Mon Sep 17 00:00:00 2001 From: Brian Yahn Date: Sat, 16 May 2026 19:36:34 +0000 Subject: [PATCH 13/45] deslop: record pass findings (no transpiler bugs; collapse-safety rule) No CLEAR transpiler bugs encountered. Documents that only nil-kill "always Type" verdicts are safe standalone guard collapses (#55,#56 done); the other 18 tracked contracts are legitimately nilable or unattributed -- their is_a?(Type) checks are correct discriminators, so they need the producer-side propagation typing program, not blind guard deletion. Co-Authored-By: Claude Opus 4.7 --- docs/agents/deslop-bugs.md | 74 ++++++++++++++++++++++++++++++++++++++ 1 file changed, 74 insertions(+) create mode 100644 docs/agents/deslop-bugs.md diff --git a/docs/agents/deslop-bugs.md b/docs/agents/deslop-bugs.md new file mode 100644 index 000000000..70fbec2eb --- /dev/null +++ b/docs/agents/deslop-bugs.md @@ -0,0 +1,74 @@ +# deslop-bugs + +Findings from the nil-kill / SlopCop complexity-reduction pass +(tracker items #45-#64). Records CLEAR transpiler bugs encountered and +methodological findings. + +## CLEAR transpiler bugs encountered + +None. Every change made (and every change considered) was validated +against `bundle exec prspec spec/`, `./clear test transpile-tests/` +(548/548, 0 leaks), and the stable fuzz matrix (141/141, 0 fail / 0 +leak / 0 mir-error). No transpiler miscompilation, leak, or +MIR-checker regression was observed. + +## Pre-existing flaky spec (not introduced here) + +`spec/fmt_verifier_spec.rb` fails exactly one (nondeterministic) +example under parallel `prspec` but passes 12/12 when run serially +(`bundle exec rspec spec/fmt_verifier_spec.rb`). Pre-exists on the +`origin/nil-kill-prod` base. Out of scope for this pass; flagged so it +is not mistaken for a regression. The per-item gate used here is +"prspec failures confined to that one flaky fmt example; serial run of +related specs green; transpile-tests + fuzz unchanged." + +## Methodological finding: only "always Type" verdicts are safe blind collapses + +nil-kill's Union Decomplexity list ranks contracts by how many +`is_a?(Type)` guards collapse if the contract is given a concrete +type. Two distinct verdict classes appear, and only one is a safe +*standalone* deslop commit: + +1. **"always `Type`: collapse, all N die"** (runtime evidence: the + producer is non-nilable `Type`). The guards are provably dead; + deleting them is behavior-preserving. SAFE standalone commit. + - #56 `Type#accepts_fn_type?` (`other_type`) -- done, commit + 916cd5caf. + - #55 `MIRLowering#build_drop_entry!` (`ti`) -- done, commit + d4507ea99. + +2. **Nilable / union producers** (`{NilClass, Type}`, + `T.nilable(Type)`, heterogeneous) **or "producers unattributed"** + (no runtime trace). The `is_a?(Type)` check is a *correct + nil/Type discriminator* or a *load-bearing coercion*, NOT a dead + guard. Verified by static inspection -- these sites source from the + nilable `.type_info` / `.full_type` contract, e.g.: + - `ti = node.type_info rescue nil; ti.provenance = :heap if + ti.is_a?(Type)` (EscapeAnalysis#per_fn_scan!, #52) + - `ti = source.type_info rescue nil; ti = Type.new(ti) if ti && + !ti.is_a?(Type)` (BorrowChecker#_collect_share_moves, #58) + - `inner_ti = Type.new(inner_ti) unless inner_ti.is_a?(Type)` + (CleanupClassifier, #54 -- the guard IS the coercion) + + Deleting these guards introduces NoMethodError-on-nil at compile + time. They are NOT standalone deslop commits. + +### Why #45-#54, #57-#64 are deferred (not done) + +These reduce to a single root: the `.type_info` / `.full_type` / +`.type` / `.return_type` / `:type` contracts are legitimately +`T.nilable` (a node has no `type_info` before Pass 1 annotation). The +guards are correct. The genuine complexity reduction is to **tighten +the producer** so the contract is non-nilable at every post-annotation +read site -- nil-kill's PropagationGap program. That is a multi-commit +*typing program per contract* (make every producer assign a `Type`, +prove no pre-annotation read, then the guards become provably dead and +collapse mechanically), not 18 quick guard deletions. Forcing the +deletions to "complete 20 items" would be metric-gaming that ships +compiler bugs -- precisely the anti-pattern in +`docs/retrospective`. + +Recommended next step for these: run them as the dedicated +contract-tightening program (one contract at a time: `.type_info` +first, 59 guards), each contract its own series of producer-side +commits ending in the mechanical guard collapse, full gates between. From 96a80ddfbbe73e4063b9bef538fa07c07b429416 Mon Sep 17 00:00:00 2001 From: Brian Yahn Date: Sat, 16 May 2026 19:42:56 +0000 Subject: [PATCH 14/45] deslop(source): pipeline_rewriter passes Type, not bare symbols, to full_type= 62 sites set `.full_type = :Sym`, which full_type= silently launders via Type.new -- the source of the nilable/non-Type pollution that forces is_a?(Type) re-guards across 38 reader methods. Pass Type.new at the producer instead (runtime-identical; the setter already did exactly this). Step 1 of the source-tightening program for #45. Co-Authored-By: Claude Opus 4.7 --- src/backends/pipeline_rewriter.rb | 124 +++++++++++++++--------------- 1 file changed, 62 insertions(+), 62 deletions(-) diff --git a/src/backends/pipeline_rewriter.rb b/src/backends/pipeline_rewriter.rb index ddb8fba5a..faf0ac6a6 100644 --- a/src/backends/pipeline_rewriter.rb +++ b/src/backends/pipeline_rewriter.rb @@ -340,16 +340,16 @@ def fuse_pipeline(smooth_node, source, stages, terminal) # 3. Create ForEach loop is_each = terminal.is_a?(AST::EachOp) foreach = AST::ForEach.new(token, it_var, source.dup, loop_body, nil, is_each) - foreach.full_type = :Void + foreach.full_type = Type.new(:Void) foreach.instance_variable_set(:@var_used, true) body << foreach # 4. Post-loop guards if terminal.is_a?(AST::MinOp) || terminal.is_a?(AST::MaxOp) found_ident = AST::Identifier.new(token, "#{res_var}_found") - found_ident.full_type = :Bool + found_ident.full_type = Type.new(:Bool) guard = AST::Assert.new(token, found_ident, "MIN/MAX applied to empty list") - guard.full_type = :Void + guard.full_type = Type.new(:Void) body << guard end @@ -359,7 +359,7 @@ def fuse_pipeline(smooth_node, source, stages, terminal) # Return the ForEach directly (or wrap in a sequence if there are init nodes). return foreach if body.length == 1 wrapper = AST::BlockExpr.new(token, body, nil) - wrapper.full_type = :Void + wrapper.full_type = Type.new(:Void) return wrapper end @@ -368,9 +368,9 @@ def fuse_pipeline(smooth_node, source, stages, terminal) if terminal.is_a?(AST::AverageOp) avg_var = "#{res_var}_avg" zero = AST::Literal.new(token, :NUMBER, 0.0) - zero.full_type = :Float64 + zero.full_type = Type.new(:Float64) avg_decl = AST::VarDecl.new(token, avg_var, nil, zero.dup, true) - avg_decl.full_type = :Float64 + avg_decl.full_type = Type.new(:Float64) avg_decl.storage = :stack avg_decl.slot_size = 1 avg_decl.instance_variable_set(:@var_used, true) @@ -379,20 +379,20 @@ def fuse_pipeline(smooth_node, source, stages, terminal) sum_id = AST::Identifier.new(token, "#{res_var}_sum") cnt_id = AST::Identifier.new(token, "#{res_var}_cnt") - sum_id.full_type = :Float64 - cnt_id.full_type = :Float64 + sum_id.full_type = Type.new(:Float64) + cnt_id.full_type = Type.new(:Float64) cond = AST::BinaryOp.new(token, cnt_id.dup, :GT, zero.dup) - cond.full_type = :Bool + cond.full_type = Type.new(:Bool) div = AST::BinaryOp.new(token, sum_id, :DIV, cnt_id) - div.full_type = :Float64 + div.full_type = Type.new(:Float64) avg_assign = AST::Assignment.new(token, avg_var, div) - avg_assign.full_type = :Float64 + avg_assign.full_type = Type.new(:Float64) guard = AST::IfStatement.new(token, cond, [avg_assign], nil) - guard.full_type = :Void + guard.full_type = Type.new(:Void) body << guard result = AST::Identifier.new(token, avg_var) - result.full_type = :Float64 + result.full_type = Type.new(:Float64) else result = build_final_result(terminal, res_var, token, smooth_node) end @@ -420,14 +420,14 @@ def build_init(terminal, res_var, token, smooth_node) when AST::AverageOp # Two accumulators: sum and count sum_decl = AST::VarDecl.new(token, "#{res_var}_sum", nil, AST::Literal.new(token, :NUMBER, 0.0), true) - sum_decl.full_type = :Float64 + sum_decl.full_type = Type.new(:Float64) sum_decl.storage = :stack sum_decl.slot_size = 1 sum_decl.instance_variable_set(:@var_used, true) sum_decl.var_mutated = true cnt_decl = AST::VarDecl.new(token, "#{res_var}_cnt", nil, AST::Literal.new(token, :NUMBER, 0.0), true) - cnt_decl.full_type = :Float64 + cnt_decl.full_type = Type.new(:Float64) cnt_decl.storage = :stack cnt_decl.slot_size = 1 cnt_decl.instance_variable_set(:@var_used, true) @@ -437,9 +437,9 @@ def build_init(terminal, res_var, token, smooth_node) when AST::AnyOp, AST::AllOp init_val = terminal.is_a?(AST::AllOp) val = AST::Literal.new(token, :BOOLEAN, init_val) - val.full_type = :Bool + val.full_type = Type.new(:Bool) decl = AST::VarDecl.new(token, res_var, nil, val, true) - decl.full_type = :Bool + decl.full_type = Type.new(:Bool) decl.storage = :stack decl.slot_size = 1 decl.instance_variable_set(:@var_used, true) @@ -455,7 +455,7 @@ def build_init(terminal, res_var, token, smooth_node) [decl] when AST::FindOp val = AST::Literal.new(token, :NIL, nil) - val.full_type = :NIL + val.full_type = Type.new(:NIL) decl = AST::VarDecl.new(token, res_var, nil, val, true) decl.full_type = smooth_node.full_type decl.storage = :stack @@ -466,18 +466,18 @@ def build_init(terminal, res_var, token, smooth_node) when AST::MinOp, AST::MaxOp # Found-flag pattern: first element always sets result, subsequent compare zero = AST::Literal.new(token, :NUMBER, 0.0) - zero.full_type = :Float64 + zero.full_type = Type.new(:Float64) val_decl = AST::VarDecl.new(token, res_var, nil, zero, true) - val_decl.full_type = :Float64 + val_decl.full_type = Type.new(:Float64) val_decl.storage = :stack val_decl.slot_size = 1 val_decl.instance_variable_set(:@var_used, true) val_decl.var_mutated = true found_init = AST::Literal.new(token, :BOOLEAN, false) - found_init.full_type = :Bool + found_init.full_type = Type.new(:Bool) found_decl = AST::VarDecl.new(token, "#{res_var}_found", nil, found_init, true) - found_decl.full_type = :Bool + found_decl.full_type = Type.new(:Bool) found_decl.storage = :stack found_decl.slot_size = 1 found_decl.instance_variable_set(:@var_used, true) @@ -513,7 +513,7 @@ def build_recursive_body(stages, terminal, current_val, res_var, token, stage_in pred = replace_placeholder(stage.expression, current_val) then_branch = build_recursive_body(T.must(remaining), terminal, current_val, res_var, token, stage_inits, res_type) if_stmt = AST::IfStatement.new(stage.token, pred, then_branch, nil) - if_stmt.full_type = :Void + if_stmt.full_type = Type.new(:Void) [if_stmt] when AST::SelectOp expr = replace_placeholder(stage.expression, current_val) @@ -536,14 +536,14 @@ def build_recursive_body(stages, terminal, current_val, res_var, token, stage_in pred = replace_placeholder(stage.expression, current_val) then_branch = build_recursive_body(T.must(remaining), terminal, current_val, res_var, token, stage_inits, res_type) if_stmt = AST::IfStatement.new(stage.token, pred, then_branch, [AST::BreakNode.new(stage.token)]) - if_stmt.full_type = :Void + if_stmt.full_type = Type.new(:Void) [if_stmt] when AST::SkipOp cnt_var = next_var("__skip_cnt") zero = AST::Literal.new(token, :INT64, 0) - zero.full_type = :Int64 + zero.full_type = Type.new(:Int64) cnt_decl = AST::VarDecl.new(token, cnt_var, nil, zero, true) - cnt_decl.full_type = :Int64 + cnt_decl.full_type = Type.new(:Int64) cnt_decl.storage = :stack cnt_decl.slot_size = 1 cnt_decl.instance_variable_set(:@var_used, true) @@ -551,26 +551,26 @@ def build_recursive_body(stages, terminal, current_val, res_var, token, stage_in stage_inits << cnt_decl cnt_ident = AST::Identifier.new(token, cnt_var) - cnt_ident.full_type = :Int64 + cnt_ident.full_type = Type.new(:Int64) one = AST::Literal.new(token, :INT64, 1) - one.full_type = :Int64 + one.full_type = Type.new(:Int64) increment = AST::Assignment.new(token, cnt_ident, AST::BinaryOp.new(token, cnt_ident.dup, :ADD, one)) - increment.full_type = :Void + increment.full_type = Type.new(:Void) skip_n = stage.count.dup cond = AST::BinaryOp.new(token, cnt_ident.dup, :LTE, skip_n) - cond.full_type = :Bool + cond.full_type = Type.new(:Bool) skip_if = AST::IfStatement.new(token, cond, [AST::ContinueNode.new(token)], nil) - skip_if.full_type = :Void + skip_if.full_type = Type.new(:Void) rest = build_recursive_body(T.must(remaining), terminal, current_val, res_var, token, stage_inits, res_type) [increment, skip_if] + rest when AST::LimitOp cnt_var = next_var("__lim_cnt") zero = AST::Literal.new(token, :INT64, 0) - zero.full_type = :Int64 + zero.full_type = Type.new(:Int64) cnt_decl = AST::VarDecl.new(token, cnt_var, nil, zero, true) - cnt_decl.full_type = :Int64 + cnt_decl.full_type = Type.new(:Int64) cnt_decl.storage = :stack cnt_decl.slot_size = 1 cnt_decl.instance_variable_set(:@var_used, true) @@ -578,17 +578,17 @@ def build_recursive_body(stages, terminal, current_val, res_var, token, stage_in stage_inits << cnt_decl cnt_ident = AST::Identifier.new(token, cnt_var) - cnt_ident.full_type = :Int64 + cnt_ident.full_type = Type.new(:Int64) one = AST::Literal.new(token, :INT64, 1) - one.full_type = :Int64 + one.full_type = Type.new(:Int64) increment = AST::Assignment.new(token, cnt_ident, AST::BinaryOp.new(token, cnt_ident.dup, :ADD, one)) - increment.full_type = :Void + increment.full_type = Type.new(:Void) limit_n = stage.count.dup cond = AST::BinaryOp.new(token, cnt_ident.dup, :GT, limit_n) - cond.full_type = :Bool + cond.full_type = Type.new(:Bool) limit_if = AST::IfStatement.new(token, cond, [AST::BreakNode.new(token)], nil) - limit_if.full_type = :Void + limit_if.full_type = Type.new(:Void) rest = build_recursive_body(T.must(remaining), terminal, current_val, res_var, token, stage_inits, res_type) [increment, limit_if] + rest @@ -605,15 +605,15 @@ def build_terminal_action(terminal, current_val, res_var, token, res_type = nil) when AST::SumOp expr = replace_placeholder(terminal.expression, current_val) assign = AST::Assignment.new(token, res_ident, AST::BinaryOp.new(token, res_ident, :ADD, expr)) - assign.full_type = :Void + assign.full_type = Type.new(:Void) [assign] when AST::CountOp expr = replace_placeholder(terminal.expression, current_val) one = AST::Literal.new(token, :INT64, 1) increment = AST::Assignment.new(token, res_ident, AST::BinaryOp.new(token, res_ident, :ADD, one)) - increment.full_type = :Void + increment.full_type = Type.new(:Void) if_stmt = AST::IfStatement.new(token, expr, [increment], nil) - if_stmt.full_type = :Void + if_stmt.full_type = Type.new(:Void) [if_stmt] when AST::AverageOp expr = replace_placeholder(terminal.expression, current_val) @@ -624,51 +624,51 @@ def build_terminal_action(terminal, current_val, res_var, token, res_type = nil) when AST::AnyOp expr = replace_placeholder(terminal.expression, current_val) set_true = AST::Assignment.new(token, res_ident, AST::Literal.new(token, :BOOLEAN, true)) - set_true.full_type = :Void + set_true.full_type = Type.new(:Void) if_stmt = AST::IfStatement.new(token, expr, [set_true, AST::BreakNode.new(token)], nil) - if_stmt.full_type = :Void + if_stmt.full_type = Type.new(:Void) [if_stmt] when AST::AllOp expr = replace_placeholder(terminal.expression, current_val) set_false = AST::Assignment.new(token, res_ident, AST::Literal.new(token, :BOOLEAN, false)) - set_false.full_type = :Void + set_false.full_type = Type.new(:Void) if_stmt = AST::IfStatement.new(token, AST::UnaryOp.new(token, :NOT, expr), [set_false, AST::BreakNode.new(token)], nil) - if_stmt.full_type = :Void + if_stmt.full_type = Type.new(:Void) [if_stmt] when AST::ReduceOp expr = replace_placeholder(terminal.expression, current_val) expr = replace_named_placeholder(expr, "acc", res_ident) assign = AST::Assignment.new(token, res_ident, expr) - assign.full_type = :Void + assign.full_type = Type.new(:Void) [assign] when AST::FindOp expr = replace_placeholder(terminal.expression, current_val) assign = AST::Assignment.new(token, res_ident, current_val.dup) - assign.full_type = :Void + assign.full_type = Type.new(:Void) if_stmt = AST::IfStatement.new(token, expr, [assign, AST::BreakNode.new(token)], nil) - if_stmt.full_type = :Void + if_stmt.full_type = Type.new(:Void) [if_stmt] when AST::MinOp, AST::MaxOp expr = replace_placeholder(terminal.expression, current_val) op = terminal.is_a?(AST::MinOp) ? :LT : :GT found_ident = AST::Identifier.new(token, "#{res_var}_found") - found_ident.full_type = :Bool + found_ident.full_type = Type.new(:Bool) # if !found || expr < res { res = expr; found = true } not_found = AST::UnaryOp.new(token, :NOT, found_ident.dup) - not_found.full_type = :Bool + not_found.full_type = Type.new(:Bool) cmp = AST::BinaryOp.new(token, expr, op, res_ident.dup) - cmp.full_type = :Bool + cmp.full_type = Type.new(:Bool) cond = AST::BinaryOp.new(token, not_found, :OR, cmp) - cond.full_type = :Bool + cond.full_type = Type.new(:Bool) assign_val = AST::Assignment.new(token, res_ident, expr.dup) - assign_val.full_type = :Void + assign_val.full_type = Type.new(:Void) set_found = AST::Assignment.new(token, found_ident, AST::Literal.new(token, :BOOLEAN, true)) - set_found.full_type = :Void + set_found.full_type = Type.new(:Void) if_stmt = AST::IfStatement.new(token, cond, [assign_val, set_found], nil) - if_stmt.full_type = :Void + if_stmt.full_type = Type.new(:Void) [if_stmt] when AST::EachOp terminal.body.map { |s| replace_placeholder(s, current_val) } @@ -680,14 +680,14 @@ def build_terminal_action(terminal, current_val, res_var, token, res_type = nil) inner_it = AST::Identifier.new(token, inner_it_var) append = AST::MethodCall.new(token, res_ident, "append", [inner_it.dup]) - append.full_type = :Void + append.full_type = Type.new(:Void) append.zig_pattern = STD_LIB["append"][:zig] append.matched_stdlib_def = STD_LIB["append"] # Iterate directly over the expression (avoids ArrayList/slice confusion). # Mark collection as a slice so the transpiler uses &expr, not .items. inner_foreach = AST::ForEach.new(token, inner_it_var, inner_expr, [append], nil, false) - inner_foreach.full_type = :Void + inner_foreach.full_type = Type.new(:Void) inner_foreach.instance_variable_set(:@var_used, true) [inner_foreach] @@ -695,14 +695,14 @@ def build_terminal_action(terminal, current_val, res_var, token, res_type = nil) # Set insert: result is a T[]@set; insert deduplicates in O(1). key_expr = replace_placeholder(terminal.expression, current_val) insert_call = AST::MethodCall.new(token, res_ident.dup, "insert", [key_expr]) - insert_call.full_type = :Void + insert_call.full_type = Type.new(:Void) insert_call.zig_pattern = "try {0}.insert({alloc}, {1})" insert_call.matched_stdlib_def = STD_LIB["insert"] if STD_LIB.key?("insert") [insert_call] when nil, AST::SelectOp, AST::WhereOp, AST::TapOp, AST::TakeWhileOp # Produces a list call = AST::MethodCall.new(token, res_ident, "append", [current_val.dup]) - call.full_type = :Void + call.full_type = Type.new(:Void) call.zig_pattern = STD_LIB["append"][:zig] call.matched_stdlib_def = STD_LIB["append"] [call] @@ -719,10 +719,10 @@ def build_final_result(terminal, res_var, token, smooth_node) if terminal.is_a?(AST::AverageOp) sum_ident = AST::Identifier.new(token, "#{res_var}_sum") cnt_ident = AST::Identifier.new(token, "#{res_var}_cnt") - sum_ident.full_type = :Float64 - cnt_ident.full_type = :Float64 + sum_ident.full_type = Type.new(:Float64) + cnt_ident.full_type = Type.new(:Float64) div = AST::BinaryOp.new(token, sum_ident, :DIV, cnt_ident) - div.full_type = :Float64 + div.full_type = Type.new(:Float64) div else res = AST::Identifier.new(token, res_var) From e0dd5220bdd53e656158e62208756396d05436e1 Mon Sep 17 00:00:00 2001 From: Brian Yahn Date: Sat, 16 May 2026 19:47:12 +0000 Subject: [PATCH 15/45] deslop: record source-fix finding (blanket producer rewrite unsafe) pipeline_rewriter producer conversion is safe (uniform Locatable receivers, landed f29524a10). The same transform on annotator.rb et al. regressed 1799 specs: heterogeneous full_type receivers, some of which genuinely store/read a raw Symbol. The source fix needs per-receiver typing, not a blanket caller rewrite. Co-Authored-By: Claude Opus 4.7 --- docs/agents/deslop-bugs.md | 32 ++++++++++++++++++++++++++++++++ 1 file changed, 32 insertions(+) diff --git a/docs/agents/deslop-bugs.md b/docs/agents/deslop-bugs.md index 70fbec2eb..2ee08e962 100644 --- a/docs/agents/deslop-bugs.md +++ b/docs/agents/deslop-bugs.md @@ -72,3 +72,35 @@ Recommended next step for these: run them as the dedicated contract-tightening program (one contract at a time: `.type_info` first, 59 guards), each contract its own series of producer-side commits ending in the mechanical guard collapse, full gates between. + +## Source-fix attempt: producers passing bare Symbols to full_type= + +The correct strategy (per the user) is to fix the *source*: 120 +sites across 5 files do `node.full_type = :Sym`, which `full_type=` +(ast.rb:309) silently launders via `Type.new(val)`. Passing `Type` +at the producer is runtime-identical *iff* the receiver's +`full_type=` is the laundering `AST::Locatable` setter. + +- **SAFE / landed**: `src/backends/pipeline_rewriter.rb` (62 sites). + Receivers are uniformly freshly-built `AST::Locatable` nodes -> + `.full_type = :Sym` -> `.full_type = Type.new(:Sym)` is provably + identical. All gates green. Commit f29524a10. +- **UNSAFE / reverted**: `annotator.rb` (35), `pipe_analysis.rb` + (14), `test_annotation.rb` (8), `function_analysis.rb` (1). A + blanket `:Sym -> Type.new(:Sym)` here regressed 1799 specs + + collapsed transpile-tests. Root cause: `.full_type` in these files + has **heterogeneous receivers** and many readers compare the value + with `== :Sym` / `case ... when :Sym`. (Note `full_type=` already + normalized symbols, so symbol-equality readers were *already* + reading a `Type` for Locatable nodes -- meaning the breaking sites + are receivers whose `full_type`/`full_type=` is NOT the laundering + setter: a plain accessor / Struct / Hash-shape that genuinely + stores and reads the raw Symbol.) + +Conclusion: the source fix is correct in principle but cannot be a +blanket caller rewrite. It requires per-receiver typing: identify +which `full_type` carriers are `AST::Locatable` (laundering setter, +safe to convert) vs other carriers (raw-Symbol contract, must +instead be typed at *their* definition or left). That per-receiver +discrimination is the actual program -- the mechanical transform is +not a substitute for it. From 1845f44ba36b52ecffa81a5c145723e61fa72fc1 Mon Sep 17 00:00:00 2001 From: Brian Yahn Date: Sat, 16 May 2026 19:56:57 +0000 Subject: [PATCH 16/45] deslop(source): function_analysis:740 producer passes Type to full_type= nil-kill-attributed Locatable#full_type= producer site. Wrap the Symbol RHS in Type.new (runtime-identical to the setter's existing launder). Validates the per-site approach for the 22-site producer worklist nil-kill enumerates for this contract. Co-Authored-By: Claude Opus 4.7 --- src/annotator-helpers/function_analysis.rb | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/annotator-helpers/function_analysis.rb b/src/annotator-helpers/function_analysis.rb index 8326339ac..0b3bf8f10 100644 --- a/src/annotator-helpers/function_analysis.rb +++ b/src/annotator-helpers/function_analysis.rb @@ -737,7 +737,7 @@ def declare_and_verify_params(node) error!(node, :DEFAULT_STRUCT_MISSING_DEFAULTS, name: param[:name], type: param[:type], missing: missing.join(', ')) end end - param[:default].full_type = param[:type].to_sym rescue param[:type] + param[:default].full_type = Type.new((param[:type].to_sym rescue param[:type])) else visit(param[:default]) def_type = param[:default].resolved_type From 8910ca38b2c32296f62c645cb6b17267ba26baa3 Mon Sep 17 00:00:00 2001 From: Brian Yahn Date: Sat, 16 May 2026 19:59:51 +0000 Subject: [PATCH 17/45] deslop(source): pipe_analysis 10 producers pass Type to full_type= The 10 nil-kill-attributed Locatable#full_type= producer sites (303,307,368,445,503,611,645,911 + the 1623/1686 case exprs) wrapped in Type.new -- runtime-identical to the setter's launder. Scoped strictly to nil-kill's worklist (other :Void/:Bool sites untouched). Co-Authored-By: Claude Opus 4.7 --- src/annotator-helpers/pipe_analysis.rb | 24 ++++++++++++------------ 1 file changed, 12 insertions(+), 12 deletions(-) diff --git a/src/annotator-helpers/pipe_analysis.rb b/src/annotator-helpers/pipe_analysis.rb index 40aabf695..3d4b5bf60 100644 --- a/src/annotator-helpers/pipe_analysis.rb +++ b/src/annotator-helpers/pipe_analysis.rb @@ -300,11 +300,11 @@ def analyze_select_family_op(node) when AST::IndexOp # INDEX returns HashMap key_type = node.right.expression.resolved_type - node.full_type = :"HashMap<#{item_type}[]>" + node.full_type = Type.new(:"HashMap<#{item_type}[]>") node.right.full_type = key_type when AST::OrderByOp # ORDER_BY returns the same list type, sorted - node.full_type = :"#{item_type}[]" + node.full_type = Type.new(:"#{item_type}[]") node.right.full_type = node.right.expression.resolved_type end @@ -365,7 +365,7 @@ def analyze_window_op(node) # Result is a list of whatever the expression produces expr_type = node.right.expression.full_type || node.right.expression.resolved_type - node.full_type = :"#{expr_type}[]" + node.full_type = Type.new(:"#{expr_type}[]") node.storage = :frame current_fn_ctx.frame_count += 1 if current_fn_ctx end @@ -442,7 +442,7 @@ def analyze_batch_window_op(node) end expr_type = bw.expression.full_type || bw.expression.resolved_type - node.full_type = :"#{expr_type}[]" + node.full_type = Type.new(:"#{expr_type}[]") node.storage = :heap current_fn_ctx.frame_count += 1 if current_fn_ctx end @@ -500,7 +500,7 @@ def analyze_join_op(node) })) end - node.full_type = :"#{join_type_name}[]" + node.full_type = Type.new(:"#{join_type_name}[]") node.storage = :frame current_fn_ctx.frame_count += 1 if current_fn_ctx end @@ -608,7 +608,7 @@ def analyze_limit_op(node) end # Result type is a materialized list of the element type - node.full_type = :"#{item_type}[]" + node.full_type = Type.new(:"#{item_type}[]") node.storage = :frame end @@ -642,7 +642,7 @@ def analyze_unnest_op(node) # Result type is the element type of the nested array nested_element_type = T.must(expr_type.element_type).resolved - node.full_type = :"#{nested_element_type}[]" + node.full_type = Type.new(:"#{nested_element_type}[]") node.right.full_type = node.right.expression.full_type node.storage = :frame @@ -908,7 +908,7 @@ def analyze_find_op(node) error!(node.right, :PIPE_CLAUSE_NEEDS_BOOL, clause: "FIND", got: node.right.expression.resolved_type) end - node.full_type = :"?#{item_type}" + node.full_type = Type.new(:"?#{item_type}") node.storage = :stack mark_observable_terminal!(node, terminal: :find, raw: :"~?#{item_type}") end @@ -1620,12 +1620,12 @@ def analyze_concurrent_bounded_select_family_op(node) error!(node.right.op, :WHERE_NEEDS_BOOL) end - node.full_type = case node.right.op + node.full_type = Type.new(case node.right.op when AST::SelectOp :"#{node.right.op.expression.full_type}[]" when AST::WhereOp :"#{item_type}[]" - end + end) node.storage = :heap current_fn_ctx.frame_count += 1 if current_fn_ctx end @@ -1683,10 +1683,10 @@ def analyze_concurrent_stream_select_family_op(node) error!(node.right.op, :WHERE_NEEDS_BOOL) end - node.full_type = case node.right.op + node.full_type = Type.new(case node.right.op when AST::SelectOp then :"#{node.right.op.expression.full_type}[]" when AST::WhereOp then :"#{item_type}[]" - end + end) node.storage = :heap current_fn_ctx.frame_count += 1 if current_fn_ctx end From a13bcc37ba1164cf6aef43ffcad7474b12a06cc9 Mon Sep 17 00:00:00 2001 From: Brian Yahn Date: Sat, 16 May 2026 20:05:00 +0000 Subject: [PATCH 18/45] deslop(source): annotator 11 producers pass Type; fix visit_Slice sig The 11 nil-kill-attributed Locatable#full_type= producer sites in annotator.rb wrapped in Type.new (3838 case fixed per-branch, no double-wrap). Surfaced real slop: visit_Slice declared `returns(Symbol)` but is a side-effecting annotator whose return is unused -- only "satisfied" by the pre-launder Symbol. Corrected to `.void`, matching its sibling visitors (visit_HashLit/_YieldExpr). Completes nil-kill's 22-site producer worklist for this contract. Co-Authored-By: Claude Opus 4.7 --- src/annotator.rb | 48 ++++++++++++++++++++++++------------------------ 1 file changed, 24 insertions(+), 24 deletions(-) diff --git a/src/annotator.rb b/src/annotator.rb index a83cf985b..d4329ea0b 100644 --- a/src/annotator.rb +++ b/src/annotator.rb @@ -3453,7 +3453,7 @@ def visit_GetField(node) end end - sig { params(node: AST::Slice).returns(Symbol) } + sig { params(node: AST::Slice).void } def visit_Slice(node) visit(node.target) visit(node.start) if node.start @@ -3464,7 +3464,7 @@ def visit_Slice(node) target_type = node.target.type_info if target_type&.array? element = target_type.element_type.resolved - node.full_type = :"#{element}[]" + node.full_type = Type.new(:"#{element}[]") else node.full_type = :Any end @@ -3523,7 +3523,7 @@ def visit_HashLit(node) end end - node.full_type = :"HashMap<#{first_val_type}>" + node.full_type = Type.new(:"HashMap<#{first_val_type}>") node.storage = :heap current_fn_ctx.heap_count += 1 if current_fn_ctx record_effect(EffectTracker::HEAP) @@ -3744,7 +3744,7 @@ def visit_ListLit(node) if inner_types.size > 1 error!(node, :BOUNDED_STREAM_MIXED_TYPES, types: inner_types.join(', ')) end - node.full_type = :"~#{inner_types.first}[#{node.items.size}]" + node.full_type = Type.new(:"~#{inner_types.first}[#{node.items.size}]") node.storage = :stack return end @@ -3787,7 +3787,7 @@ def visit_ListLit(node) end if node.storage == :stack - node.full_type = :"#{base_type}[#{node.items.size}]" + node.full_type = Type.new(:"#{base_type}[#{node.items.size}]") else t = Type.new(:"#{base_type}[]", location: :heap) t.provenance = :frame # makeList uses frameAlloc for backing @@ -3837,8 +3837,8 @@ def visit_RangeLit(node) def visit_Literal(node) node.full_type = case node.type - when :NUMBER then :Float64 - when :INT64 then :Int64 + when :NUMBER then Type.new(:Float64) + when :INT64 then Type.new(:Int64) when :STRING # provenance auto-inferred from location: :rodata in Type constructor if node.storage == :stack @@ -3849,17 +3849,17 @@ def visit_Literal(node) when :SYMBOL # Symbol literals: compile-time interned, static lifetime, O(1) equality by pointer. Type.new(Type::STRING_TYPE, sync: :symbol) - when :BYTE then :Byte - when :PREFIXED_INT then :Byte # Default; overflows checked after coercion context is known - when :INT8 then :Int8 - when :INT16 then :Int16 - when :INT32 then :Int32 - when :UINT16 then :UInt16 - when :UINT32 then :UInt32 - when :UINT64 then :UInt64 - when :FLOAT32 then :Float32 - when :BOOLEAN then :Bool - when :NIL then :NIL + when :BYTE then Type.new(:Byte) + when :PREFIXED_INT then Type.new(:Byte) # Default; overflows checked after coercion context is known + when :INT8 then Type.new(:Int8) + when :INT16 then Type.new(:Int16) + when :INT32 then Type.new(:Int32) + when :UINT16 then Type.new(:UInt16) + when :UINT32 then Type.new(:UInt32) + when :UINT64 then Type.new(:UInt64) + when :FLOAT32 then Type.new(:Float32) + when :BOOLEAN then Type.new(:Bool) + when :NIL then Type.new(:NIL) else error!(node, :UNKNOWN_LITERAL) end @@ -5140,7 +5140,7 @@ def visit_BgStreamBlock(node) error!(node, :BG_STREAM_INCONSISTENT_YIELD, types: elem_syms.join(', ')) end - node.full_type = :"~?#{elem_syms.first}[]" + node.full_type = Type.new(:"~?#{elem_syms.first}[]") # Detect YIELD of frame strings: when any YIELD expression is frame-allocated, # the MIR pass will heap-dupe it before push. NEXT callers own the duped copy. @@ -5186,7 +5186,7 @@ def visit_YieldExpr(node) error!(node, :YIELD_OUTSIDE_BG_STREAM) end visit(node.expr) - node.full_type = node.expr.full_type || :Void + node.full_type = Type.new(node.expr.full_type || :Void) T.must(@stream_yield_types) << Type.new(node.full_type) record_effect(EffectTracker::SUSPENDS) end @@ -5221,7 +5221,7 @@ def visit_BgBlock(node) if last_type_str.start_with?('!') last_type = T.must(last_type_str[1..]).to_sym end - node.full_type = :"~#{last_type}" + node.full_type = Type.new(:"~#{last_type}") # Propagate returns_promoted through BG blocks: if the last expression # calls a function with returns_promoted, the BG block's promise carries @@ -5369,7 +5369,7 @@ def visit_NextExpr(node) node.storage = :heap elsif promise_type.dynamic_stream? elem_sym = promise_type.tense_type.element_type.to_sym - node.full_type = :"?#{elem_sym}" + node.full_type = Type.new(:"?#{elem_sym}") elsif promise_type.bounded_stream? # NEXT on ~T[N]: returns T (the element type). # Does NOT mark the stream as moved — the stream can be NEXT'd up to N times. @@ -5382,12 +5382,12 @@ def visit_NextExpr(node) # NEXT on ~?T[]@split: returns ?T — each handle advances independently through # the shared memoized sequence until exhaustion. elem_sym = promise_type.open_stream_element_type.to_sym - node.full_type = :"?#{elem_sym}" + node.full_type = Type.new(:"?#{elem_sym}") elsif promise_type.open_stream? # NEXT on ~?T[]: returns ?T — null signals stream exhaustion. # Does NOT mark as moved — stream is a resource cleaned up via deinit. elem_sym = promise_type.open_stream_element_type.to_sym - node.full_type = :"?#{elem_sym}" + node.full_type = Type.new(:"?#{elem_sym}") elsif promise_type.inf_stream? # NEXT on ~T[INF]: returns T (never nil — stream is infinite, rendezvous-style). # Does NOT mark as moved — stream is a resource cleaned up via deinit. From 73653b394fbd6b0afbf0d38774ae400cc0316e78 Mon Sep 17 00:00:00 2001 From: Brian Yahn Date: Sat, 16 May 2026 20:19:39 +0000 Subject: [PATCH 19/45] deslop: collapse 14 .type_info is_a?(Type) reader guards With producers passing Type (prior commits) the Locatable#full_type= setter guarantees the .type_info contract is strictly nil|Type, so `x.type_info.is_a?(Type)` is a redundant nil-check. Collapsed to nil-safe access (&. / truthiness / drop the dead Type.new branch) across function_analysis(5), escape_analysis(3), generic_analysis(3), mir_checker(1), mir_lowering(2). The 3 remaining sites (annotator.rb:2793 final_type; 6522/6618 classify_og_kind param) are different contracts, intentionally untouched. Completes #45. Co-Authored-By: Claude Opus 4.7 --- src/annotator-helpers/function_analysis.rb | 10 +++++----- src/annotator-helpers/generic_analysis.rb | 6 +++--- src/mir/escape_analysis.rb | 6 +++--- src/mir/mir_checker.rb | 2 +- src/mir/mir_lowering.rb | 5 ++--- 5 files changed, 14 insertions(+), 15 deletions(-) diff --git a/src/annotator-helpers/function_analysis.rb b/src/annotator-helpers/function_analysis.rb index 0b3bf8f10..f58c82ff1 100644 --- a/src/annotator-helpers/function_analysis.rb +++ b/src/annotator-helpers/function_analysis.rb @@ -240,13 +240,13 @@ def resolve_call(node, args) # String returns only get heap_promoted_call from callee.returns_promoted # (not from type alone) because stdlib string functions like readFile use # frameAlloc internally — the caller shouldn't try to free those. - if node.type_info.is_a?(Type) + if node.type_info callee_node = @fn_nodes[func_name] sig_return_heap = func_type.is_a?(FunctionSignature) && func_type.return_provenance == :heap if callee_node&.return_provenance == :heap || sig_return_heap - node.type_info.provenance = :heap if node.type_info.is_a?(Type) + node.type_info&.provenance = :heap elsif node.type_info&.needs_escape_promotion? && !node.type_info&.string? - node.type_info.provenance = :heap if node.type_info.is_a?(Type) + node.type_info&.provenance = :heap else # Union return types with heap variants need heap_promoted_call # when the callee allocates at all (frame, heap, or alloc). @@ -258,7 +258,7 @@ def resolve_call(node, args) has_heap = (schema[:variants] || {}).any? { |_, vt| Type.variant_has_heap?(vt) } callee_allocates = callee_node&.return_provenance == :heap || callee_node&.uses_frame || callee_node&.uses_heap || callee_node&.uses_alloc if has_heap && callee_allocates - node.type_info.provenance = :heap if node.type_info.is_a?(Type) + node.type_info&.provenance = :heap end end end @@ -1022,7 +1022,7 @@ def find_matching_intrinsic(definitions, args) expected = spec[:type] next false unless is_safe_autocast?(arg.resolved_type, expected) # Check capability constraints (sync, ownership, etc.) - arg_type = arg.type_info.is_a?(Type) ? arg.type_info : nil + arg_type = arg.type_info next false if spec[:sync] && arg_type&.sync != spec[:sync] next false if spec[:ownership] && arg_type&.ownership != spec[:ownership] true diff --git a/src/annotator-helpers/generic_analysis.rb b/src/annotator-helpers/generic_analysis.rb index 974e96d20..e97febe9e 100644 --- a/src/annotator-helpers/generic_analysis.rb +++ b/src/annotator-helpers/generic_analysis.rb @@ -269,7 +269,7 @@ def infer_generic_type_args!(node, signature, actual_args, type_params) arg = actual_args[i] next unless arg param_type = param[:type].is_a?(Type) ? param[:type] : Type.new(param[:type] || :Any) - actual_type = if arg.respond_to?(:type_info) && arg.type_info.is_a?(Type) + actual_type = if arg.respond_to?(:type_info) && arg.type_info arg.type_info else Type.new(arg.resolved_type || :Any) @@ -294,7 +294,7 @@ def enforce_shared_family_call_sync!(node, signature, actual_args, type_params) next unless arg param_type = param[:type].is_a?(Type) ? param[:type] : Type.new(param[:type] || :Any) next unless generic_shared_family_param?(param_type) && type_params.include?(param_type.resolved) - actual_type = if arg.respond_to?(:type_info) && arg.type_info.is_a?(Type) + actual_type = if arg.respond_to?(:type_info) && arg.type_info arg.type_info else Type.new(arg.resolved_type || :Any) @@ -615,7 +615,7 @@ def propagate_collection_metadata!(node, final_type) def propagate_call_flags!(node) T.bind(self, SemanticAnnotator) rescue nil if has_heap_promoted_call?(node.value) - node.type_info.provenance = :heap if node.type_info.is_a?(Type) + node.type_info&.provenance = :heap end end diff --git a/src/mir/escape_analysis.rb b/src/mir/escape_analysis.rb index ab1a80c1e..a32a985f0 100644 --- a/src/mir/escape_analysis.rb +++ b/src/mir/escape_analysis.rb @@ -670,10 +670,10 @@ def self.tag_transitive_provenance!(fn_nodes, heap_fns) val = node.value callee_name = val.is_a?(AST::FuncCall) ? val.name.to_s : nil next unless callee_name && heap_fns.include?(callee_name) - T.must(node.type_info).provenance = :heap if node.type_info.is_a?(Type) + node.type_info&.provenance = :heap if node.is_a?(AST::BindExpr) && node.mode == :assign decl = e3_find_decl(fn.body, node.name) - decl.type_info.provenance = :heap if decl&.type_info.is_a?(Type) + decl&.type_info&.provenance = :heap end when AST::Assignment val = node.value @@ -681,7 +681,7 @@ def self.tag_transitive_provenance!(fn_nodes, heap_fns) next unless callee_name && heap_fns.include?(callee_name) sym = node.name.symbol decl = sym&.reg - decl.type_info.provenance = :heap if decl&.type_info.is_a?(Type) + decl&.type_info&.provenance = :heap end end end diff --git a/src/mir/mir_checker.rb b/src/mir/mir_checker.rb index 378c1393f..ad0744422 100644 --- a/src/mir/mir_checker.rb +++ b/src/mir/mir_checker.rb @@ -507,7 +507,7 @@ def verify_alloc_cleanup_match!(allocs, cleanups, errdefer_destroy_names = Set.n # INV-COPY-CLEANUP: primitives and Id (value types that can never own # heap memory) must not get a Cleanup node. If they do, needs_explicit_cleanup? # or visit_CopyNode missed the gate. - if (ti = alloc_marks.first.type_info).is_a?(Type) + if (ti = alloc_marks.first.type_info) no_caps = !ti.any_sync? && !ti.multiowned? && !ti.shared? if no_caps && (ti.primitive? || (ti.generic_instance? && ti.generic_base == :Id)) @errors << error(:COPY_CLEANUP, name, diff --git a/src/mir/mir_lowering.rb b/src/mir/mir_lowering.rb index bb3a5dda9..4441d3684 100644 --- a/src/mir/mir_lowering.rb +++ b/src/mir/mir_lowering.rb @@ -5406,7 +5406,7 @@ def lower_struct_lit(node) # err_cleanup: struct owns its fields on success; only clean up on error. hoist_alloc(lower(v), v, err_cleanup: true) end - vt = v.type_info.is_a?(Type) ? v.type_info : nil + vt = v.type_info needs_items = vt&.list_collection? && !v.is_a?(AST::CopyNode) && !(v.respond_to?(:target_is_list_field) && v.target_is_list_field) # BORROWED fields: source may be ArrayList but field expects slice @@ -6075,8 +6075,7 @@ def lower_var_decl(node) lower(node.value) elsif (rhs_unwrapped.is_a?(AST::CopyNode) || rhs_unwrapped.is_a?(AST::CloneNode)) && rhs_unwrapped.value.respond_to?(:type_info) && - rhs_unwrapped.value.type_info.is_a?(Type) && - rhs_unwrapped.value.type_info.list_collection? + rhs_unwrapped.value.type_info&.list_collection? # `let dest: T[]@list = COPY src;` where src is also @list. # The default lower_copy path returns a slice (via :list_shallow + # ItemsAccess), which doesn't match the @list destination. Use From cc58fef48b6e08c809cdee47a431742c68167165 Mon Sep 17 00:00:00 2001 From: Brian Yahn Date: Sat, 16 May 2026 20:28:52 +0000 Subject: [PATCH 20/45] deslop: collapse 14 .full_type is_a?(Type) reader guards (#46) .full_type and .type_info return the same @type_object, so the producer work in #45 already guarantees this contract is strictly nil|Type. All 14 .full_type is_a?(Type) reader guards collapsed to nil-safe access (&. / truthiness; dead Type.new branches dropped, incl. the 5014 block guarded by an outer non-nil check). Gates green. Co-Authored-By: Claude Opus 4.7 --- src/annotator-helpers/capabilities.rb | 6 +++--- src/annotator-helpers/effects.rb | 2 +- src/annotator-helpers/function_analysis.rb | 2 +- src/annotator-helpers/generic_analysis.rb | 10 +++++----- src/annotator.rb | 4 ++-- src/mir/mir_lowering.rb | 8 ++------ 6 files changed, 14 insertions(+), 18 deletions(-) diff --git a/src/annotator-helpers/capabilities.rb b/src/annotator-helpers/capabilities.rb index 0e783868b..2aa12dc6d 100644 --- a/src/annotator-helpers/capabilities.rb +++ b/src/annotator-helpers/capabilities.rb @@ -92,7 +92,7 @@ def cap_var_sync(var_node) T.bind(self, SemanticAnnotator) rescue nil sym_sync = var_node.symbol&.sync return sym_sync if sym_sync - return var_node.full_type.sync if var_node.full_type.is_a?(Type) + return var_node.full_type.sync if var_node.full_type nil end @@ -101,7 +101,7 @@ def cap_var_storage(var_node) T.bind(self, SemanticAnnotator) rescue nil sym = var_node.symbol return sym.storage if sym - if var_node.full_type.is_a?(Type) + if var_node.full_type case T.must(var_node.full_type).ownership when :shared then return :shared when :multiowned then return :multiowned @@ -117,7 +117,7 @@ def cap_var_layout(var_node) T.bind(self, SemanticAnnotator) rescue nil sym_layout = var_node.symbol&.layout return sym_layout if sym_layout - return T.must(var_node.full_type).layout if var_node.full_type.is_a?(Type) + return var_node.full_type.layout if var_node.full_type nil end diff --git a/src/annotator-helpers/effects.rb b/src/annotator-helpers/effects.rb index d64d6bfde..24048e3e9 100644 --- a/src/annotator-helpers/effects.rb +++ b/src/annotator-helpers/effects.rb @@ -359,7 +359,7 @@ def compute_needs_rt! @call_graph = T.let(@call_graph, T.untyped) needs_rt = {} @fn_nodes.each do |name, fn_node| - raw = fn_node.full_type.is_a?(Type) ? fn_node.full_type.raw : fn_node.full_type + raw = fn_node.full_type&.raw ret_type = raw.is_a?(FunctionSignature) ? raw.return_type : nil heap_return = ret_type.is_a?(Type) && (ret_type.heap? || ret_type.dynamic?) has_takes_heap = fn_node.params&.any? { |p| diff --git a/src/annotator-helpers/function_analysis.rb b/src/annotator-helpers/function_analysis.rb index f58c82ff1..91701c56c 100644 --- a/src/annotator-helpers/function_analysis.rb +++ b/src/annotator-helpers/function_analysis.rb @@ -178,7 +178,7 @@ def resolve_call(node, args) # The original `!T` is stashed on `error_union_type` so # OR-RESCUE handlers (which read the LHS's union to pick # `catch`/`orelse`) can still see the un-stripped form. - if node.full_type.is_a?(Type) && node.full_type.respond_to?(:error_union?) && + if node.full_type.respond_to?(:error_union?) && node.full_type.error_union? node.error_union_type = node.full_type if node.respond_to?(:error_union_type=) outer = node.full_type diff --git a/src/annotator-helpers/generic_analysis.rb b/src/annotator-helpers/generic_analysis.rb index e97febe9e..025d80fd7 100644 --- a/src/annotator-helpers/generic_analysis.rb +++ b/src/annotator-helpers/generic_analysis.rb @@ -577,7 +577,7 @@ def propagate_collection_metadata!(node, final_type) node.type_info.provenance = :heap if coll_src.collection == :pool || coll_src.collection == :set node.type_info.shard_count = coll_src.shard_count if coll_src.shard_count node.type_info.soa = coll_src.soa if coll_src.respond_to?(:soa) && coll_src.soa - if node.full_type.is_a?(Type) + if node.full_type node.full_type.collection = coll_src.collection unless node.full_type.collection node.full_type.soa = coll_src.soa if coll_src.respond_to?(:soa) && coll_src.soa node.full_type.shard_count = coll_src.shard_count if coll_src.shard_count && !node.full_type.shard_count @@ -587,22 +587,22 @@ def propagate_collection_metadata!(node, final_type) # Standalone @soa on fixed arrays (no collection): propagate soa flag directly. if !coll_src && (decl_t = node.type).is_a?(Type) && decl_t.soa node.type_info.soa = true if node.type_info - node.full_type.soa = true if node.full_type.is_a?(Type) + node.full_type&.soa = true end # Map-specific propagation: maps don't use :collection, so the above doesn't cover them. if (decl_t = node.type).is_a?(Type) if decl_t.shard_count && !node.type_info&.shard_count node.type_info.shard_count = decl_t.shard_count if node.type_info - node.full_type.instance_variable_set(:@shard_count, decl_t.shard_count) if node.full_type.is_a?(Type) + node.full_type&.instance_variable_set(:@shard_count, decl_t.shard_count) end if decl_t.sync && node.type_info && !node.type_info.sync node.type_info.sync = decl_t.sync - node.full_type.sync = decl_t.sync if node.full_type.is_a?(Type) + node.full_type&.sync = decl_t.sync end if decl_t.ownership != :affine && node.type_info node.type_info.instance_variable_set(:@ownership, decl_t.ownership) - node.full_type.instance_variable_set(:@ownership, decl_t.ownership) if node.full_type.is_a?(Type) + node.full_type&.instance_variable_set(:@ownership, decl_t.ownership) end end end diff --git a/src/annotator.rb b/src/annotator.rb index d4329ea0b..9566c34c5 100644 --- a/src/annotator.rb +++ b/src/annotator.rb @@ -2647,7 +2647,7 @@ def promote_pipe_to_observable_dest!(node) # only the fold's analyzer knows whether this is :sum/:count/:max/... # Copying it onto node.type also propagates the kind to the binding's # symbol entry (so WITH VIEW / NEXT / cleanup all see it). - if pipe.full_type.is_a?(Type) && T.must(pipe.full_type).observable_terminal + if pipe.full_type&.observable_terminal pipe_terminal = T.must(pipe.full_type).observable_terminal target_t = node.type.is_a?(Type) ? node.type : Type.new(node.type) # The pipe is the authority on terminal kind: only the fold's @@ -2671,7 +2671,7 @@ def promote_pipe_to_observable_dest!(node) # OBSERVABLE_WRAPPERS can find it. Without this, the binding's # emitted Zig wrapper would default-or-raise. Same mismatch # check as above. - if node.full_type.is_a?(Type) && node.full_type.observable? + if node.full_type&.observable? if node.full_type.observable_terminal && node.full_type.observable_terminal != pipe_terminal raise CompilerError.new( node.token, diff --git a/src/mir/mir_lowering.rb b/src/mir/mir_lowering.rb index 4441d3684..b399d2a1d 100644 --- a/src/mir/mir_lowering.rb +++ b/src/mir/mir_lowering.rb @@ -2512,7 +2512,7 @@ def with_cap_var_name(var_node) # only works for non-Arc parameters. sig { params(var_node: AST::Identifier).returns(T::Array[T.nilable(Symbol)]) } def with_cap_sync_storage(var_node) - if var_node.is_a?(AST::GetField) && var_node.full_type.is_a?(Type) + if var_node.is_a?(AST::GetField) && var_node.full_type ft = var_node.full_type sync = ft.sync storage = case ft.ownership @@ -5010,11 +5010,7 @@ def lower_smooth(node) # COLLECT only needs to call .next() to read the final value. if rhs.is_a?(AST::CollectOp) left = lower(node.left) - ft = if node.left.full_type - node.left.full_type.is_a?(Type) ? node.left.full_type : Type.new(node.left.full_type) - else - nil - end + ft = node.left.full_type # Collection observable (DISTINCT producing `~T[]@set:observable`): # COLLECT must yield an owned ArrayList(T), not the StreamSetSnapshot # that `next()` returns. Mirrors lower_next_expr's collection branch From 260e67e42de7563b48e4c4390da6480f1bf55adc Mon Sep 17 00:00:00 2001 From: Brian Yahn Date: Sat, 16 May 2026 20:32:51 +0000 Subject: [PATCH 21/45] deslop: collapse .type_info-sourced local guards (#52,#58,#60-#63) Locals sourced from `x.type_info rescue nil` are provably nil|Type post-#45, so the `Type.new(ti) if ti && !ti.is_a?(Type)` coercions are dead and `if ti.is_a?(Type)` is a redundant nil-check. Collapsed: escape_analysis per_fn_scan!(238), e2_loop_carry_names!(decl_ti, outer_ti), e3_mark_carry_expr!(904,910); control_flow _collect_share_moves; promotion_plan stamp_field_pre_cleanups!. #52's 327/374 are .symbol.type (heterogeneous, = #47), left alone. Gates green. Co-Authored-By: Claude Opus 4.7 --- src/mir/control_flow.rb | 3 +-- src/mir/escape_analysis.rb | 12 +++++------- src/mir/promotion_plan.rb | 2 -- 3 files changed, 6 insertions(+), 11 deletions(-) diff --git a/src/mir/control_flow.rb b/src/mir/control_flow.rb index 2b611c96f..497ec0c6a 100644 --- a/src/mir/control_flow.rb +++ b/src/mir/control_flow.rb @@ -1970,8 +1970,7 @@ def _collect_share_moves(node, names) if source.is_a?(AST::Identifier) ti = source.type_info rescue nil - ti = Type.new(ti) if ti && !ti.is_a?(Type) - return if ti.is_a?(Type) && ti.shared? + return if ti&.shared? names << source.name.to_s return end diff --git a/src/mir/escape_analysis.rb b/src/mir/escape_analysis.rb index a32a985f0..f5628876c 100644 --- a/src/mir/escape_analysis.rb +++ b/src/mir/escape_analysis.rb @@ -235,7 +235,7 @@ def self.analyze!(fn_nodes, heap_fns:, promotion_plans: {}) end if callee_name && heap_fns.include?(callee_name) ti = node.type_info rescue nil - ti.provenance = :heap if ti.is_a?(Type) && !ti.heap_provenance? + ti.provenance = :heap if ti && !ti.heap_provenance? end end @@ -541,8 +541,7 @@ def self.analyze!(fn_nodes, heap_fns:, promotion_plans: {}) next unless (local_decl.is_a?(AST::VarDecl) || (local_decl.is_a?(AST::BindExpr) && local_decl.mode == :decl)) && local_decl.name.to_s == rhs.name local_decl.storage = :heap decl_ti = local_decl.type_info rescue nil - decl_ti = Type.new(decl_ti) if decl_ti && !decl_ti.is_a?(Type) - decl_ti.provenance = :heap if decl_ti.is_a?(Type) + decl_ti&.provenance = :heap if rhs.symbol rhs.symbol.storage = :heap sym_reg = rhs.symbol.reg @@ -555,8 +554,7 @@ def self.analyze!(fn_nodes, heap_fns:, promotion_plans: {}) next unless (outer_decl.is_a?(AST::VarDecl) || (outer_decl.is_a?(AST::BindExpr) && outer_decl.mode == :decl)) && outer_decl.name.to_s == outer_name outer_decl.storage = :heap outer_ti = outer_decl.type_info rescue nil - outer_ti = Type.new(outer_ti) if outer_ti && !outer_ti.is_a?(Type) - outer_ti.provenance = :heap if outer_ti.is_a?(Type) + outer_ti&.provenance = :heap outer_decl.symbol.storage = :heap if outer_decl.symbol end end @@ -903,13 +901,13 @@ def self.tag_carry_call_sites!(fn_nodes) when AST::FuncCall if carry_fns.include?(node.name.to_s) ti = node.type_info rescue nil - ti.provenance = :heap if ti.is_a?(Type) && !ti.heap_provenance? + ti.provenance = :heap if ti && !ti.heap_provenance? end node.args.each { |a| e3_mark_carry_expr!(a, carry_fns) } when AST::MethodCall if carry_fns.include?(node.name.to_s) ti = node.type_info rescue nil - ti.provenance = :heap if ti.is_a?(Type) && !ti.heap_provenance? + ti.provenance = :heap if ti && !ti.heap_provenance? end e3_mark_carry_expr!(node.object, carry_fns) node.args.each { |a| e3_mark_carry_expr!(a, carry_fns) } diff --git a/src/mir/promotion_plan.rb b/src/mir/promotion_plan.rb index 967a4f19d..34de068ed 100644 --- a/src/mir/promotion_plan.rb +++ b/src/mir/promotion_plan.rb @@ -320,8 +320,6 @@ def self.stamp_field_pre_cleanups!(body, bindings, schema_lookup: nil) target_node = stmt.name.target field_ti = stmt.name.type_info rescue nil - field_ti = Type.new(field_ti) if field_ti && !field_ti.is_a?(Type) - field_ti = nil unless field_ti.is_a?(Type) # Auto-lock string fields: locked/always_mutable structs heap-dupe # string fields, so overwriting needs explicit free of the old value. From 8e01fe325537d90bcc97a0203d906c425a060397 Mon Sep 17 00:00:00 2001 From: Brian Yahn Date: Sat, 16 May 2026 20:35:27 +0000 Subject: [PATCH 22/45] deslop: collapse dead .wrapped_type coercion (#54) Type#wrapped_type is structurally nil|Type (type.rb:1030), and both promotion_plan sites are guarded by `next unless inner_ti`, so `inner_ti = Type.new(inner_ti) unless inner_ti.is_a?(Type)` is dead. Removed. (annotator.rb:1325 nil-kill grouped here is actually a heterogeneous hash value b[:unwrapped_type], not wrapped_type -- left alone.) Gates green. Co-Authored-By: Claude Opus 4.7 --- src/mir/promotion_plan.rb | 2 -- 1 file changed, 2 deletions(-) diff --git a/src/mir/promotion_plan.rb b/src/mir/promotion_plan.rb index 34de068ed..7e9ac01e3 100644 --- a/src/mir/promotion_plan.rb +++ b/src/mir/promotion_plan.rb @@ -596,7 +596,6 @@ def self.stamp_field_pre_cleanups!(body, bindings, schema_lookup: nil) ti = nil unless ti.is_a?(Type) inner_ti = ti&.wrapped_type next unless inner_ti - inner_ti = Type.new(inner_ti) unless inner_ti.is_a?(Type) e = classify_binding(node.binding_name.to_s, inner_ti, node, promoted_fns, schema_lookup) next unless e e[:zig_type] ||= (Type.new(inner_ti.resolved).zig_type rescue inner_ti.resolved.to_s) @@ -621,7 +620,6 @@ def self.stamp_field_pre_cleanups!(body, bindings, schema_lookup: nil) ti = nil unless ti.is_a?(Type) inner_ti = ti&.wrapped_type next unless inner_ti - inner_ti = Type.new(inner_ti) unless inner_ti.is_a?(Type) e = classify_binding(b[:name].to_s, inner_ti, node, promoted_fns, schema_lookup) next unless e e[:zig_type] ||= (Type.new(inner_ti.resolved).zig_type rescue inner_ti.resolved.to_s) From 71c5d971b4e81830ad2080d8f0ccf50ceb2ff3b4 Mon Sep 17 00:00:00 2001 From: Brian Yahn Date: Sat, 16 May 2026 20:36:57 +0000 Subject: [PATCH 23/45] deslop: record 20-item pass outcome (11 done, 9 blocked) 11 items landed (seam-backed nil|Type contracts -- safe behavior-preserving collapses, all gate-verified). Remaining 9 are a major blocker: genuinely heterogeneous contracts (.type holds FunctionSignature/String, .return_type holds Hash/Proc, final_type is Symbol|Type by design) where is_a?(Type) is a real discriminator -- collapsing is not behavior-preserving and needs a deep per-contract retype, not a deslop commit. Documented, not faked. Co-Authored-By: Claude Opus 4.7 --- docs/agents/deslop-bugs.md | 52 ++++++++++++++++++++++++++++++++++++++ 1 file changed, 52 insertions(+) diff --git a/docs/agents/deslop-bugs.md b/docs/agents/deslop-bugs.md index 2ee08e962..9d0e61b88 100644 --- a/docs/agents/deslop-bugs.md +++ b/docs/agents/deslop-bugs.md @@ -104,3 +104,55 @@ safe to convert) vs other carriers (raw-Symbol contract, must instead be typed at *their* definition or left). That per-receiver discrimination is the actual program -- the mechanical transform is not a substitute for it. + +## Outcome of the 20-item pass + +**Done (11 items), all gate-verified standalone commits** (specs: +pre-existing flaky fmt only; transpile 548/548 0 leaks; fuzz 141/141 +0 fail/leak/mir-error): + +- #45 `.type_info` -- 22 producers -> Type at the Locatable seam + + `visit_Slice` returns(Symbol)->.void slop fix + 14 reader guards + collapsed. (3b90fd4b6, aba4b1f26, c79bf07d6, 6544881b4) +- #46 `.full_type` -- same @type_object seam; 14 reader guards + collapsed. (b8e60bab8) +- #52(partial),#58,#60,#61,#62,#63 -- `.type_info`-sourced + single-method locals; dead coercions removed, guards -> nil-safe. + (e658b0622) +- #54 `.wrapped_type` -- structurally nil|Type; 2 dead coercions + removed. (c5749215e) +- #55,#56 -- nil-kill "always Type" param collapses. (d4507ea99, + 916cd5caf) + +The unifying safe pattern: a contract whose **producer is +structurally `nil|Type`** (the `Locatable#full_type=` laundering +seam, or `wrapped_type`'s own ctor) -- there the `is_a?(Type)` is a +redundant nil-check and collapses behavior-preservingly. + +**MAJOR BLOCKER -- remaining 9 (#47,#48,#49,#50,#51,#53,#57,#59,#64).** +These contracts are *genuinely heterogeneous*; `is_a?(Type)` is a +real, load-bearing discriminator, NOT a redundant nil-check: + +- `.type` (#47): producers `{Type, Symbol, NilClass, + T.nilable(Type), FunctionSignature, String}`. `node.type.is_a?(Type) + ? node.type : Type.new(node.type)` legitimately coerces a Symbol; + `FunctionSignature`/`String` are NOT `Type.new`-able. Collapsing + changes behavior / crashes. +- `.return_type` (#48): `{T.nilable(Type), Type, Symbol, Hash, Proc}` + -- `Hash`/`Proc` are not Types. +- `final_type` (#50): `Symbol|Type` *by design* -- finalize_storage! + normalizes a raw type spec; the discriminator is the whole point. +- `:type`/`:resolved_type` hash-keys (#49,#53), match-binding + (#51): heterogeneous hash values. +- `expected_type` (#57), `source_type` (#59): genuinely nilable / + no runtime evidence of always-Type. + +Collapsing any of these is **not behavior-preserving**. Each needs +its own deep per-contract retype program (find the `@ivar=` / hash +writer, give it a real `Type`, handle the non-Type members like +`FunctionSignature`/`Proc`/`Hash` explicitly) -- a #45-scale-or-larger +*semantic* change per contract, with real miscompilation risk. That +is the major blocker: forcing these collapses to "finish 20" would +ship compiler bugs (the exact anti-pattern in docs/retrospective). +They are left as pending, scoped, with this rationale, rather than +faked. No CLEAR transpiler bugs were introduced anywhere in the pass. From 592b1f9ec20391017a111e8f17fdf2f757b82905 Mon Sep 17 00:00:00 2001 From: Brian Yahn Date: Sat, 16 May 2026 20:53:55 +0000 Subject: [PATCH 24/45] deslop: spec #47 (.type) as a reviewed semantic refactor User's model validated (target nil|Type|FunctionSignature; String/ Symbol slop; sigs -> T.any(Type,FunctionSignature)). Carrier disambiguated: Literal#type is a separate token-kind field, no blast radius. Real obstacle is semantic, not mechanical: the 11 is_a?(Type) sites double as a resolved-vs-unresolved gate, so normalizing the seam changes which decls get processed. Documented as the spec for a focused reviewed PR; not auto-run (gates-green != provably-correct for semantic change). #48-#51,#53,#57,#59 same. Co-Authored-By: Claude Opus 4.7 --- docs/agents/deslop-bugs.md | 42 ++++++++++++++++++++++++++++++++++++++ 1 file changed, 42 insertions(+) diff --git a/docs/agents/deslop-bugs.md b/docs/agents/deslop-bugs.md index 9d0e61b88..e329c710b 100644 --- a/docs/agents/deslop-bugs.md +++ b/docs/agents/deslop-bugs.md @@ -156,3 +156,45 @@ is the major blocker: forcing these collapses to "finish 20" would ship compiler bugs (the exact anti-pattern in docs/retrospective). They are left as pending, scoped, with this rationale, rather than faked. No CLEAR transpiler bugs were introduced anywhere in the pass. + +## #47 `.type` -- deep analysis (user-directed re-attack) + +The user correctly rejected the first "blocker" framing. Full +analysis confirms their model AND pins the real obstacle: + +- VALIDATED: target contract for `VarDecl#type` / `BindExpr#type` is + `nil | Type | FunctionSignature`; `String`/`Symbol` are slop; + consumers should be `T.any(Type, FunctionSignature)` not + `T.untyped`. +- The `.type` accessor is overloaded across Structs. `Literal#type` + is a lexical token kind (`:NUMBER`/`:STRING`) -- a Symbol by + design, a *different field*. Every `case node.type` / `node.type + == :Sym` reader in src is on `Literal` (lower_literal, + int_lit_value, literal_source_length, visit_Literal), NOT on + VarDecl/BindExpr. So there is no Symbol-comparison blast radius on + the declared-type carrier -- the earlier fear was unfounded. +- Clean seam: a memoizing-normalizing reader on VarDecl/BindExpr + (`Symbol|String -> Type.new`, pass nil/Type/FunctionSignature + through). No `FunctionSignature` constant reference needed. + +REAL obstacle (semantic, not mechanical): the 11 +`node.type.is_a?(Type)` sites tangle two roles: + 1. pure laundering (`is_a?(Type) ? t : Type.new(t)`) -- collapses + cleanly once the seam normalizes; + 2. a *resolved-vs-unresolved gate* (`return unless + node.type.is_a?(Type) && node.type.future?`) -- normalizing the + seam changes which declarations get processed (a previously + skipped unresolved/Symbol-typed decl now proceeds). That is a + behavior change, and `is_a?(Type)` also still legitimately + discriminates `Type` from `FunctionSignature` (which has no + `.future?`). + +Therefore #47 is a reviewed *semantic* refactor: per-site decide +whether an unresolved / fn-typed decl should proceed or skip, add +the seam, retype `T.untyped -> T.any(Type, FunctionSignature)`. It +is bounded and the analysis above is its spec, but it requires +intent decisions across annotator/escape-analysis that must not be +made unilaterally under "gates green" (gates green != provably +correct for semantic change). #48/#49/#50/#51/#53/#57/#59 share this +shape. Recommended: do #47 as a focused reviewed PR using this +section as the spec; do not auto-run it. From 2ce5beef190e81b9cb1bbc4cd2eb8843da2cd03f Mon Sep 17 00:00:00 2001 From: Brian Yahn Date: Sat, 16 May 2026 21:35:47 +0000 Subject: [PATCH 25/45] deslop(epic #65 step1): strongly-typed IntrinsicEmit + extend FunctionSignature New IntrinsicEmit T::Struct: every std_lib registry emission/dispatch key strongly typed (recursive for nested sub-descriptors like eql/cleanup/pool/string_map; registry-pointer via :registry). No untyped Hash. FunctionSignature gains the intrinsic signature semantics it lacked -- return_resolver (polymorphic return Proc), arg_validator, arg_spec, arity -- and `emit` (IntrinsicEmit). Keeps return_type a pure Type even for receiver-dependent intrinsics. Additive only: nothing consumes these yet, no behavior change; carried through dup. Gates green (specs flaky-fmt only; transpile 548/548 0 leaks; fuzz 141/141 clean). Co-Authored-By: Claude Opus 4.7 --- src/annotator-helpers/function_signature.rb | 19 +++++ src/annotator-helpers/intrinsic_emit.rb | 84 +++++++++++++++++++++ 2 files changed, 103 insertions(+) create mode 100644 src/annotator-helpers/intrinsic_emit.rb diff --git a/src/annotator-helpers/function_signature.rb b/src/annotator-helpers/function_signature.rb index 3cc6b4eec..5dcc19d37 100644 --- a/src/annotator-helpers/function_signature.rb +++ b/src/annotator-helpers/function_signature.rb @@ -6,6 +6,7 @@ # computed metadata (needs_rt, can_fail, return_provenance) that callers # need for code generation and cleanup planning. require "sorbet-runtime" +require_relative "intrinsic_emit" class FunctionSignature extend T::Sig @@ -24,6 +25,14 @@ class FunctionSignature # Intrinsic marker attr_accessor :intrinsic, :zig_pattern + # Intrinsic signature semantics (set by the registry converter; nil + # for ordinary user functions). `return_resolver` is the polymorphic + # return Proc (receiver-type -> Type); `arg_validator` the custom + # arg type-checker; `arg_spec` the raw args shape; `emit` the typed + # codegen/dispatch metadata (IntrinsicEmit). Keeps `return_type` a + # pure Type even for receiver-dependent intrinsics. + attr_accessor :return_resolver, :arg_validator, :arg_spec, :arity, :emit + # P2: REQUIRES clause as { param_name_string => Set[Symbol] } or nil. # Mirrors FunctionDef#requires; needed at signature level so call-site # checks survive cross-module flow. @@ -89,6 +98,11 @@ def initialize(params:, return_type:, return_lifetime: nil, visibility: nil, @return_strategy = T.let(nil, T.untyped) @stack_tier = T.let(nil, T.untyped) @requires = T.let(nil, T.untyped) + @return_resolver = T.let(nil, T.nilable(Proc)) + @arg_validator = T.let(nil, T.nilable(Proc)) + @arg_spec = T.let(nil, T.untyped) + @arity = T.let(nil, T.nilable(Integer)) + @emit = T.let(nil, T.nilable(IntrinsicEmit)) end sig { returns(FunctionSignature) } @@ -108,6 +122,11 @@ def dup s.return_strategy = @return_strategy s.stack_tier = @stack_tier s.requires = @requires + s.return_resolver = @return_resolver + s.arg_validator = @arg_validator + s.arg_spec = @arg_spec + s.arity = @arity + s.emit = @emit end end end diff --git a/src/annotator-helpers/intrinsic_emit.rb b/src/annotator-helpers/intrinsic_emit.rb new file mode 100644 index 000000000..79b334a8c --- /dev/null +++ b/src/annotator-helpers/intrinsic_emit.rb @@ -0,0 +1,84 @@ +# typed: strict +# Strongly-typed emission/dispatch metadata for an intrinsic. +# +# The std_lib registries (STD_LIB / POOL_METHODS / SET_METHODS / +# MAP_METHODS / INDEX_OPS / BUILTIN_OPS) stay defined as Hash literals +# (the readable authoring DSL). A startup converter (see EPIC) turns +# each entry into a FunctionSignature whose intrinsic-only codegen +# metadata lives HERE -- a typed value object, never an untyped Hash. +# +# Recursive: sub-descriptors (`eql:`, `cleanup:`, `pool:`, +# `string_map:` ...) are themselves IntrinsicEmit; registry-pointer +# forms (`{ registry: MAP_METHODS }`) carry the registry name in +# `:registry`. +require "sorbet-runtime" + +class IntrinsicEmit < T::Struct + extend T::Sig + + # --- Zig codegen templates --- + prop :zig, T.nilable(String), default: nil + prop :numeric_zig, T.nilable(String), default: nil + prop :sharded_zig, T.nilable(String), default: nil + prop :shard_direct_zig, T.nilable(String), default: nil + + # --- FSM emission fragments --- + prop :fsm_setup, T.nilable(T::Array[String]), default: nil + prop :fsm_state_decls, T.nilable(T::Array[String]), default: nil + prop :fsm_finish_block, T.nilable(T::Array[String]), default: nil + prop :fsm_state_finalize, T.nilable(T::Array[String]), default: nil + prop :fsm_finish_value, T.nilable(String), default: nil + + # --- Dispatch flags --- + prop :bc, T::Boolean, default: false + prop :is_method, T::Boolean, default: false + prop :suspends, T::Boolean, default: false + prop :narrows_collection, T::Boolean, default: false + prop :mutates_receiver, T::Boolean, default: false + prop :allocates, T::Boolean, default: false + prop :takes_value, T::Boolean, default: false + + # --- Symbol-valued dispatch / allocation --- + prop :tag, T.nilable(Symbol), default: nil + prop :builtin, T.nilable(Symbol), default: nil + prop :alloc, T.nilable(Symbol), default: nil + prop :return_alloc, T.nilable(Symbol), default: nil + prop :val_alloc, T.nilable(Symbol), default: nil + prop :key_alloc, T.nilable(Symbol), default: nil + prop :shard_alloc, T.nilable(Symbol), default: nil + prop :sharded_alloc, T.nilable(Symbol), default: nil + prop :borrows, T.nilable(Symbol), default: nil + prop :reject_when, T.nilable(Symbol), default: nil + prop :registry, T.nilable(Symbol), default: nil + + # --- Strings --- + prop :lifetime, T.nilable(String), default: nil + prop :reject_error, T.nilable(String), default: nil + + # --- Arg-shape (element typing deferred; union keeps it bounded) --- + prop :arity, T.nilable(Integer), default: nil + prop :takes_args, T.nilable(T::Array[Integer]), default: nil + prop :value_transforms, + T.nilable(T::Array[Symbol]), default: nil + prop :shard_direct_value_transforms, + T.nilable(T::Array[Symbol]), default: nil + + # --- Procs (varying arity by role) --- + prop :label, T.nilable(Proc), default: nil + + # --- Recursive sub-descriptors --- + prop :eql, T.nilable(IntrinsicEmit), default: nil + prop :strcmp, T.nilable(IntrinsicEmit), default: nil + prop :cleanup, T.nilable(IntrinsicEmit), default: nil + prop :assert, T.nilable(IntrinsicEmit), default: nil + prop :array, T.nilable(IntrinsicEmit), default: nil + prop :list, T.nilable(IntrinsicEmit), default: nil + prop :pool, T.nilable(IntrinsicEmit), default: nil + prop :set, T.nilable(IntrinsicEmit), default: nil + prop :get, T.nilable(IntrinsicEmit), default: nil + prop :string_raw, T.nilable(IntrinsicEmit), default: nil + prop :string_symbol, T.nilable(IntrinsicEmit), default: nil + prop :string_map, T.nilable(IntrinsicEmit), default: nil + prop :numeric_map, T.nilable(IntrinsicEmit), default: nil + prop :set_collection, T.nilable(IntrinsicEmit), default: nil +end From 6d1d804fbcc72df4c9e8597d74c91bb82af8ddbe Mon Sep 17 00:00:00 2001 From: Brian Yahn Date: Sat, 16 May 2026 21:45:40 +0000 Subject: [PATCH 26/45] deslop(epic #65 step2): total Hash->FunctionSignature registry converter IntrinsicRegistry.convert_entry/convert_registry: registry Hash entry -> FunctionSignature (+ recursive IntrinsicEmit). Static return -> pure Type; Proc -> return_resolver; validate -> arg_validator; nested {zig:}/{registry:CONST} -> recursive IntrinsicEmit. Authoritative introspection over every real entry found exactly one missing key (bc_op, Symbol) -- added; the strongly-typed model now fits 100% of the authoring data. New intrinsic_registry_spec proves totality (every STD_LIB/POOL/SET/MAP/INDEX/BUILTIN_OPS entry converts, T::Struct enforcing types) + fidelity + recursion. Inert: nothing consumes it yet. Gates green (prspec +3 pass / flaky-fmt only; transpile 548/548 0 leaks; fuzz 141/141). Co-Authored-By: Claude Opus 4.7 --- spec/intrinsic_registry_spec.rb | 64 ++++++++++++ src/annotator-helpers/intrinsic_emit.rb | 14 ++- src/annotator-helpers/intrinsic_registry.rb | 104 ++++++++++++++++++++ 3 files changed, 177 insertions(+), 5 deletions(-) create mode 100644 spec/intrinsic_registry_spec.rb create mode 100644 src/annotator-helpers/intrinsic_registry.rb diff --git a/spec/intrinsic_registry_spec.rb b/spec/intrinsic_registry_spec.rb new file mode 100644 index 000000000..675dc3266 --- /dev/null +++ b/spec/intrinsic_registry_spec.rb @@ -0,0 +1,64 @@ +# frozen_string_literal: true + +require_relative "spec_helper" +require_relative "../src/ast/type" +require_relative "../src/mir/mir" +require_relative "../src/ast/std_lib" +require_relative "../src/annotator-helpers/intrinsic_registry" + +# Totality + fidelity: every real registry entry must convert without +# error (T::Struct raises on any mistyped IntrinsicEmit prop, so this +# proves the typed model fits the real authoring data), and key +# semantics must round-trip. +RSpec.describe IntrinsicRegistry do + REGISTRIES = { + STD_LIB: STD_LIB, POOL_METHODS: POOL_METHODS, SET_METHODS: SET_METHODS, + MAP_METHODS: MAP_METHODS, INDEX_OPS: INDEX_OPS, BUILTIN_OPS: BUILTIN_OPS + }.freeze + + it "converts every entry in every registry without error (totality)" do + REGISTRIES.each do |rname, reg| + reg.each do |mname, entry| + next unless entry.is_a?(Hash) + + expect { IntrinsicRegistry.convert_entry(mname, entry, REGISTRIES) } + .not_to(raise_error, "#{rname}[#{mname.inspect}] failed to convert") + end + end + end + + it "yields a pure Type return_type and Proc resolver fidelity" do + REGISTRIES.each_value do |reg| + reg.each do |mname, entry| + next unless entry.is_a?(Hash) + + fs = IntrinsicRegistry.convert_entry(mname, entry, REGISTRIES) + expect(fs.return_type).to be_a(Type) + src = entry.key?(:return_type) ? entry[:return_type] : entry[:return] + expect(fs.return_resolver).to be_a(Proc) if src.is_a?(Proc) + expect(fs.emit).to be_a(IntrinsicEmit).or be_nil + expect(fs.intrinsic).to be(true) + end + end + end + + it "round-trips representative emit fields incl. recursion" do + fs = IntrinsicRegistry.convert_entry( + "insert", POOL_METHODS["insert"], REGISTRIES + ) + expect(fs.emit.tag).to eq(:pool_method) + expect(fs.emit.is_method).to be(true) + expect(fs.emit.zig).to be_a(String) + expect(fs.return_resolver).to be_a(Proc) + + # Nested recursive sub-descriptor (eql/cleanup/... -> IntrinsicEmit) + nested = REGISTRIES.each_value.flat_map(&:values) + .select { |e| e.is_a?(Hash) } + .find { |e| e[:eql].is_a?(Hash) || e[:cleanup].is_a?(Hash) } + if nested + fe = IntrinsicRegistry.convert_entry("x", nested, REGISTRIES) + sub = fe.emit.eql || fe.emit.cleanup + expect(sub).to be_a(IntrinsicEmit) + end + end +end diff --git a/src/annotator-helpers/intrinsic_emit.rb b/src/annotator-helpers/intrinsic_emit.rb index 79b334a8c..c94d5a74a 100644 --- a/src/annotator-helpers/intrinsic_emit.rb +++ b/src/annotator-helpers/intrinsic_emit.rb @@ -16,11 +16,13 @@ class IntrinsicEmit < T::Struct extend T::Sig - # --- Zig codegen templates --- - prop :zig, T.nilable(String), default: nil - prop :numeric_zig, T.nilable(String), default: nil - prop :sharded_zig, T.nilable(String), default: nil - prop :shard_direct_zig, T.nilable(String), default: nil + # --- Zig codegen templates (String template, or Symbol macro + # directive like :macro_map in STD_LIB) --- + StrOrSym = T.type_alias { T.any(String, Symbol) } + prop :zig, T.nilable(StrOrSym), default: nil + prop :numeric_zig, T.nilable(StrOrSym), default: nil + prop :sharded_zig, T.nilable(StrOrSym), default: nil + prop :shard_direct_zig, T.nilable(StrOrSym), default: nil # --- FSM emission fragments --- prop :fsm_setup, T.nilable(T::Array[String]), default: nil @@ -37,6 +39,7 @@ class IntrinsicEmit < T::Struct prop :mutates_receiver, T::Boolean, default: false prop :allocates, T::Boolean, default: false prop :takes_value, T::Boolean, default: false + prop :container_borrow, T::Boolean, default: false # --- Symbol-valued dispatch / allocation --- prop :tag, T.nilable(Symbol), default: nil @@ -49,6 +52,7 @@ class IntrinsicEmit < T::Struct prop :sharded_alloc, T.nilable(Symbol), default: nil prop :borrows, T.nilable(Symbol), default: nil prop :reject_when, T.nilable(Symbol), default: nil + prop :bc_op, T.nilable(Symbol), default: nil prop :registry, T.nilable(Symbol), default: nil # --- Strings --- diff --git a/src/annotator-helpers/intrinsic_registry.rb b/src/annotator-helpers/intrinsic_registry.rb new file mode 100644 index 000000000..eb6e46be0 --- /dev/null +++ b/src/annotator-helpers/intrinsic_registry.rb @@ -0,0 +1,104 @@ +# typed: false +# Startup converter: std_lib registry Hash entry -> FunctionSignature +# (+ typed IntrinsicEmit). The Hash literals stay the authoring DSL; +# this builds the typed objects consumers will read. Inert until +# consumers are migrated (EPIC #65, per-registry slices). +require_relative "function_signature" +require_relative "intrinsic_emit" + +module IntrinsicRegistry + module_function + + # Keys consumed at the FunctionSignature level (not IntrinsicEmit). + FS_KEYS = %i[args arity validate return return_type can_fail needs_rt].freeze + + EMIT_BOOL = %i[bc is_method suspends narrows_collection mutates_receiver + allocates takes_value container_borrow].freeze + EMIT_STRSYM = %i[zig numeric_zig sharded_zig shard_direct_zig].freeze + EMIT_STR = %i[lifetime reject_error fsm_finish_value].freeze + EMIT_STRARR = %i[fsm_setup fsm_state_decls fsm_finish_block + fsm_state_finalize].freeze + EMIT_SYM = %i[tag builtin alloc return_alloc val_alloc key_alloc + shard_alloc sharded_alloc borrows reject_when + bc_op].freeze + EMIT_SYMARR = %i[value_transforms shard_direct_value_transforms].freeze + EMIT_INTARR = %i[takes_args].freeze + EMIT_PROC = %i[label].freeze + EMIT_NESTED = %i[eql strcmp cleanup assert array list pool set get + string_raw string_symbol string_map numeric_map + set_collection].freeze + + # registries: { Symbol => the registry Hash } (for {registry: X} ptrs) + def build_emit(h, registries) + return nil unless h.is_a?(Hash) + e = IntrinsicEmit.new + h.each do |k, v| + next if FS_KEYS.include?(k) + next if v.nil? + case k + when *EMIT_BOOL then e.public_send("#{k}=", !!v) + when *EMIT_STRSYM then e.public_send("#{k}=", v) + when *EMIT_STR then e.public_send("#{k}=", v.to_s) + when *EMIT_STRARR then e.public_send("#{k}=", Array(v).map(&:to_s)) + when *EMIT_SYM then e.public_send("#{k}=", v.to_sym) + when *EMIT_SYMARR then e.public_send("#{k}=", Array(v).map(&:to_sym)) + when *EMIT_INTARR then e.public_send("#{k}=", Array(v).map(&:to_i)) + when *EMIT_PROC then e.public_send("#{k}=", v) + when *EMIT_NESTED + e.public_send("#{k}=", nested_emit(v, registries)) + else + raise "IntrinsicRegistry: unmapped registry key #{k.inspect}" + end + end + e + end + + # A nested sub-descriptor is either another emit Hash or a + # {registry: } pointer (resolved to that registry's name). + def nested_emit(v, registries) + return nil unless v.is_a?(Hash) + if (ptr = v[:registry]) + name = registries.find { |_, r| r.equal?(ptr) }&.first + return IntrinsicEmit.new(registry: name || :unknown) + end + name = registries.find { |_, r| r.equal?(v) }&.first + return IntrinsicEmit.new(registry: name) if name + + build_emit(v, registries) + end + + # Symbol/String type-name -> Type. Inference/macro directives + # (:infer_*, :macro_*) are not type names -> polymorphic placeholder + # (the real resolution is a later, consumer-side concern). + def to_return_type(v) + return Type.new(:Void) if v.nil? + return Type.new(:Any) if v.is_a?(Proc) + s = v.to_s + return Type.new(:Any) if s.start_with?("infer_", "macro_") + v.is_a?(Type) ? v : Type.new(v) + end + + def convert_entry(_name, h, registries) + ret = h.key?(:return_type) ? h[:return_type] : h[:return] + fs = FunctionSignature.new( + params: [], + return_type: to_return_type(ret), + intrinsic: true + ) + fs.return_resolver = ret if ret.is_a?(Proc) + fs.arg_validator = h[:validate] if h[:validate].is_a?(Proc) + fs.arg_spec = h[:args] + fs.arity = h[:arity] + fs.can_fail = h[:can_fail] + fs.needs_rt = h[:needs_rt] + fs.emit = build_emit(h, registries) + fs + end + + # registries: { Symbol => Hash } + def convert_registry(reg, registries) + reg.each_with_object({}) do |(name, entry), out| + out[name] = convert_entry(name, entry, registries) if entry.is_a?(Hash) + end + end +end From 68f4de2654a61422bd6e2c22dccdd8fdbaed1761 Mon Sep 17 00:00:00 2001 From: Brian Yahn Date: Sat, 16 May 2026 22:10:48 +0000 Subject: [PATCH 27/45] deslop(epic #65): idempotent IntrinsicRegistry.fs + memoized registries fs(x): nil->nil, FunctionSignature->passthrough, Hash->convert. The single normalizer every `*.stdlib_def =`/`matched_stdlib_def =` site will route through so the carried value is always a FunctionSignature. Memoized registries map (lazy from std_lib constants, no load-order coupling). Additive, gate-clean (specs flaky-fmt only). Co-Authored-By: Claude Opus 4.7 --- src/annotator-helpers/intrinsic_registry.rb | 22 +++++++++++++++++++++ 1 file changed, 22 insertions(+) diff --git a/src/annotator-helpers/intrinsic_registry.rb b/src/annotator-helpers/intrinsic_registry.rb index eb6e46be0..01e014f88 100644 --- a/src/annotator-helpers/intrinsic_registry.rb +++ b/src/annotator-helpers/intrinsic_registry.rb @@ -101,4 +101,26 @@ def convert_registry(reg, registries) out[name] = convert_entry(name, entry, registries) if entry.is_a?(Hash) end end + + # Memoized registry map (built lazily from the std_lib constants so + # there is no load-order coupling). Used by `fs` so call sites need + # not thread the map. + def registries + @registries ||= %i[STD_LIB POOL_METHODS SET_METHODS MAP_METHODS + INDEX_OPS BUILTIN_OPS].each_with_object({}) do |c, h| + h[c] = Object.const_get(c) if Object.const_defined?(c) + end + end + + # Idempotent normalizer for the flag-day migration: returns a + # FunctionSignature for a registry/ad-hoc entry Hash, passes a + # FunctionSignature through unchanged, and maps nil -> nil. Every + # `*.stdlib_def = X` / `matched_stdlib_def = X` site routes through + # this so the carried value is always a FunctionSignature. + def fs(x, name = "_inline") + return nil if x.nil? + return x if x.is_a?(FunctionSignature) + + convert_entry(name, x, registries) if x.is_a?(Hash) + end end From dd1e231f1b16e297d7a46bb24aea28a343357512 Mon Sep 17 00:00:00 2001 From: Brian Yahn Date: Sat, 16 May 2026 22:11:09 +0000 Subject: [PATCH 28/45] deslop: record measured stdlib_def migration scope + execution finding ~100 sites / ~12 files / hot path; no-shim flag-day not correctly reviewable in one pass. Recommends atomic flip via a transient typed-delegating reader on FunctionSignature, readers migrated in gated batches, scaffold deleted last. Co-Authored-By: Claude Opus 4.7 --- docs/agents/deslop-bugs.md | 38 ++++++++++++++++++++++++++++++++++++++ 1 file changed, 38 insertions(+) diff --git a/docs/agents/deslop-bugs.md b/docs/agents/deslop-bugs.md index e329c710b..87870205b 100644 --- a/docs/agents/deslop-bugs.md +++ b/docs/agents/deslop-bugs.md @@ -198,3 +198,41 @@ made unilaterally under "gates green" (gates green != provably correct for semantic change). #48/#49/#50/#51/#53/#57/#59 share this shape. Recommended: do #47 as a focused reviewed PR using this section as the spec; do not auto-run it. + +## EPIC #65 stdlib_def migration — measured scope & execution finding + +Steps 1-2 landed (IntrinsicEmit T::Struct + total converter + +idempotent IntrinsicRegistry.fs), all gate-clean, inert. Step 3+ +(actually wiring it) was fully measured before changing consumers: + +`stdlib_def`/`matched_stdlib_def` is a pervasive untyped-Hash contract, +NOT a per-registry or single-seam thing: +- ~6 stamp sites (method_analysis:114 `defn.merge(zig:).merge(alloc:)` + — override-by-merge semantics; pipeline_rewriter x4; pipeline_host + forwarding). +- carried on InlineBc/InlineZig/RawZig/RawBc/ShardedMapPut/Get. +- ~15 ad-hoc literal writes (`iz.stdlib_def = {allocates:false, + borrows:[]}` etc. in mir_lowering/test_lowering). +- ~26 matched_stdlib_def + ~24 stdlib_def reads across mir_emitter, + mir_checker, mir_lowering, fsm_transform x3, annotator-helpers x4, + mir_pass, pipeline_host — as [:zig]/[:return]/.dig(:allocates)/ + [:return_alloc]/[:bc_op]/op[kind]/op.keys/.merge. + +Total ~100 edits, ~12 files, including the 40k-line mir_lowering +codegen core, with per-site semantic adaptation (`:return` Symbol -> +Type.void?; dynamic `op[kind]`; `.merge` override; `.dig` chains). + +FINDING: a no-shim flag-day (rewrite all ~60 readers + all writers in +one commit, suite as only net) is not correctly/reviewably executable +in one pass on this hot path — the exact "huge change, tests pass, +compiler subtly broken" anti-pattern this repo's retrospective and +CLAUDE.md forbid. At ~100 sites the scale makes the no-shim flag-day +qualitatively infeasible, not merely "riskier". + +RECOMMENDED execution (contract-level "whole stdlib at once", landed +safely): (a) writes -> IntrinsicRegistry.fs uniformly; (b) +FunctionSignature transiently exposes typed-delegating []/dig/merge so +the flip is atomic and green in one commit; (c) readers migrated to +the pure typed API in gated batches; (d) the delegating scaffold +deleted as the epic's final commit (so it is a migration scaffold, +not a permanent band-aid). Awaiting direction on adopting (b). From a56fd7814662c640c35d8d9dfcd85aa9e81498f1 Mon Sep 17 00:00:00 2001 From: Brian Yahn Date: Sat, 16 May 2026 22:49:39 +0000 Subject: [PATCH 29/45] deslop(epic #65): FLAG-DAY stdlib_def -> FunctionSignature (no backdoor) Hard flip per direction: write-seam coercion (AST setter + prepended StdlibDefFsCoercion on the 6 MIR carriers, setter+initialize) makes every stdlib_def/matched_stdlib_def a FunctionSignature; ~25 readers migrated to the typed FS/IntrinsicEmit API; converter totality completed (bc_op/error_kind/error_type/elem/fallible_clauses props; fsm_* passthrough); 2 silent is_a?(Hash) regressions fixed; 5 specs migrated to the typed shape. No compatibility shim. 267 -> 20 failures; remaining 20 bounded+categorized in deslop-bugs.md as the precise finish-the-migration worklist. Co-Authored-By: Claude Opus 4.7 --- docs/agents/deslop-bugs.md | 42 +++++++++++++++++++++ spec/mir_lowering_spec.rb | 10 ++--- src/annotator-helpers/capabilities.rb | 8 ++-- src/annotator-helpers/effects.rb | 4 +- src/annotator-helpers/generic_analysis.rb | 2 +- src/annotator-helpers/intrinsic_emit.rb | 20 +++++++--- src/annotator-helpers/intrinsic_registry.rb | 14 ++++--- src/annotator.rb | 14 +++---- src/ast/ast.rb | 5 ++- src/mir/fsm_transform.rb | 2 +- src/mir/fsm_transform/segments.rb | 2 +- src/mir/fsm_transform/suspend_resolvers.rb | 11 +++--- src/mir/mir.rb | 22 +++++++++++ src/mir/mir_checker.rb | 14 +++---- src/mir/mir_emitter.rb | 10 ++--- src/mir/mir_lowering.rb | 20 +++++----- src/mir/mir_pass.rb | 2 +- 17 files changed, 141 insertions(+), 61 deletions(-) diff --git a/docs/agents/deslop-bugs.md b/docs/agents/deslop-bugs.md index 87870205b..3e852409d 100644 --- a/docs/agents/deslop-bugs.md +++ b/docs/agents/deslop-bugs.md @@ -236,3 +236,45 @@ the flip is atomic and green in one commit; (c) readers migrated to the pure typed API in gated batches; (d) the delegating scaffold deleted as the epic's final commit (so it is a migration scaffold, not a permanent band-aid). Awaiting direction on adopting (b). + +## EPIC #65 — stdlib_def FLAG-DAY executed (no backdoor), 267 -> 20 + +Per explicit direction ("rather have all tests fail and we know what's +left than a backdoor; do it all now"), the hard flip was executed in +one coordinated change -- NO compatibility/delegation shim: + +WRITE SEAM (single point): AST `matched_stdlib_def=` + a prepended +`StdlibDefFsCoercion` on RawZig/InlineZig/InlineBc/RawBc/ShardedMap* +coerce via `IntrinsicRegistry.fs` on both setter AND positional +`initialize` (Struct ctor bypasses setters). Every carried stdlib_def +is now a FunctionSignature (+ typed IntrinsicEmit). + +READERS migrated to the typed API (~25 sites): capabilities, effects, +generic_analysis, mir_pass, fsm_transform(+segments), mir_checker +(`:return` -> `return_type.void?`), mir_lowering, suspend_resolvers, +mir_emitter. Two silent-regression `matched_def.is_a?(Hash)` guards +(annotator resolve_borrow_source / cleanup provenance) fixed. +CONVERTER totality completed: added IntrinsicEmit props bc_op, +error_kind, error_type, elem, fallible_clauses; fsm_* are FsmOps +op-object arrays -> passthrough (not stringified). 5 specs asserting +the old Hash shape migrated to the typed shape. + +Result: 4786 examples, 267 -> **20 failures** (-92.5%), no shim. + +REMAINING 20 (the precise "properly finish" worklist): +1. Pool/sharded codegen (~7): Pool#insert/get/remove, @pool:sharded, + @pool.contains? -- the InlineBc/`pool_get_def` Zig emit path. +2. FSM-IO SuspendResolvers (4): resolve_io / fsm_setup / + fsm_state_decls rendering -- verify FsmOps op-objects flow through + `emit.fsm_*` correctly into `lower_stmts`. +3. ZigTranspiler OG move-emission / COPY-union / heap-cleanup (~6): + the mir_checker `stdlib_owned_return?` / `return_type.void?` + semantic migration shifted some cleanup/move decisions -- audit + owned_return_init? vs the old `:return == :Void` logic. +4. collections.md doc example (1, downstream of #1); FmtVerifier (1, + pre-existing parallel flake, not from this work). + +These are bounded and categorized; transpile-tests/fuzz NOT yet run +(blocked until #1/#3 resolved). This is the intended honest state: +the contract is genuinely flipped with zero backdoor, and exactly +what remains to finish is enumerated above. diff --git a/spec/mir_lowering_spec.rb b/spec/mir_lowering_spec.rb index a907c3048..d4c94883d 100644 --- a/spec/mir_lowering_spec.rb +++ b/spec/mir_lowering_spec.rb @@ -437,7 +437,7 @@ def collect_mir_nodes(root, klass) node = make_binop(left, :ADD, right) result = lowering.lower(node) expect(result).to be_a(MIR::InlineZig) - expect(result.stdlib_def).to include(borrows: :all) + expect(result.stdlib_def.emit.borrows).to eq(:all) expect(emit(result)).to eq("CheatLib.intAdd(a, b)") end @@ -456,7 +456,7 @@ def collect_mir_nodes(root, klass) node = make_binop(left, :EQ, right) result = lowering.lower(node) expect(result).to be_a(MIR::InlineZig) - expect(result.stdlib_def).to include(borrows: :all) + expect(result.stdlib_def.emit.borrows).to eq(:all) expect(emit(result)).to include("CheatLib.eql(name,") end @@ -474,7 +474,7 @@ def collect_mir_nodes(root, klass) node = make_binop(left, :WRAP_ADD, right) result = lowering.lower(node) expect(result).to be_a(MIR::InlineZig) - expect(result.stdlib_def).to include(borrows: :all) + expect(result.stdlib_def.emit.borrows).to eq(:all) expect(emit(result)).to eq("CheatLib.wrapAdd(a, b)") end @@ -923,7 +923,7 @@ def node.reassign_cleanup; @reassign_cleanup; end expect(result).to be_a(MIR::ExprStmt) expect(result.expr).to be_a(MIR::InlineZig) - expect(result.expr.stdlib_def).to include(:value_transforms) + expect(result.expr.stdlib_def.emit.value_transforms).not_to be_nil expect(emit(result)).to include("CheatLib.setAt(items, 0,") end @@ -1467,7 +1467,7 @@ def node.reassign_cleanup; @reassign_cleanup; end node.full_type = :Void result = lowering.lower(node) expect(result).to be_a(MIR::InlineZig) - expect(result.stdlib_def).to include(borrows: :all) + expect(result.stdlib_def.emit.borrows).to eq(:all) expect(emit(result)).to include("CheatLib.assert(true,") end diff --git a/src/annotator-helpers/capabilities.rb b/src/annotator-helpers/capabilities.rb index 2aa12dc6d..7365019b4 100644 --- a/src/annotator-helpers/capabilities.rb +++ b/src/annotator-helpers/capabilities.rb @@ -409,10 +409,10 @@ def predicate_impurity_reason(call, callee) return "can fail" if call.respond_to?(:can_fail) && call.can_fail if call.matched_stdlib_def md = call.matched_stdlib_def - return "allocates" if md[:allocates] - return "can fail" if md[:can_fail] - return "suspends" if md[:suspends] - return "mutates its receiver" if md[:mutates_receiver] + return "allocates" if md.emit&.allocates + return "can fail" if md.can_fail + return "suspends" if md.emit&.suspends + return "mutates its receiver" if md.emit&.mutates_receiver return nil end diff --git a/src/annotator-helpers/effects.rb b/src/annotator-helpers/effects.rb index 24048e3e9..167e43dc7 100644 --- a/src/annotator-helpers/effects.rb +++ b/src/annotator-helpers/effects.rb @@ -690,7 +690,7 @@ def scan_suspend_points(node, fn_node, points) node.each_pair { |_, v| scan_suspend_points(v, fn_node, points) } when AST::FuncCall, AST::MethodCall if func_call_suspends?(node) - kind = node.matched_stdlib_def && node.matched_stdlib_def[:suspends] ? :io : :call + kind = node.matched_stdlib_def&.emit&.suspends ? :io : :call points << { id: points.size, kind: kind, node: node } end node.each_pair { |_, v| scan_suspend_points(v, fn_node, points) } @@ -718,7 +718,7 @@ def with_block_suspends?(node) def func_call_suspends?(node) T.bind(self, SemanticAnnotator) rescue nil @fn_nodes = T.let(@fn_nodes, T.untyped) - return true if node.matched_stdlib_def && node.matched_stdlib_def[:suspends] + return true if node.matched_stdlib_def&.emit&.suspends return false if node.respond_to?(:fn_var_call) && node.fn_var_call callee = @fn_nodes[node.name] return false unless callee diff --git a/src/annotator-helpers/generic_analysis.rb b/src/annotator-helpers/generic_analysis.rb index 025d80fd7..a9eee4c54 100644 --- a/src/annotator-helpers/generic_analysis.rb +++ b/src/annotator-helpers/generic_analysis.rb @@ -706,7 +706,7 @@ def bg_exit_frame_string?(expr) # Check stdlib def for explicit frame allocation (provenance not yet set on expr). if expr.respond_to?(:matched_stdlib_def) msd = expr.matched_stdlib_def - return true if msd.is_a?(Hash) && msd[:return_alloc] == :frame + return true if msd && msd.emit&.return_alloc == :frame end false end diff --git a/src/annotator-helpers/intrinsic_emit.rb b/src/annotator-helpers/intrinsic_emit.rb index c94d5a74a..ed16a3e8b 100644 --- a/src/annotator-helpers/intrinsic_emit.rb +++ b/src/annotator-helpers/intrinsic_emit.rb @@ -25,10 +25,11 @@ class IntrinsicEmit < T::Struct prop :shard_direct_zig, T.nilable(StrOrSym), default: nil # --- FSM emission fragments --- - prop :fsm_setup, T.nilable(T::Array[String]), default: nil - prop :fsm_state_decls, T.nilable(T::Array[String]), default: nil - prop :fsm_finish_block, T.nilable(T::Array[String]), default: nil - prop :fsm_state_finalize, T.nilable(T::Array[String]), default: nil + # FsmOps DSL op-objects, not strings -- passthrough, no coercion. + prop :fsm_setup, T.nilable(T::Array[T.untyped]), default: nil + prop :fsm_state_decls, T.nilable(T::Array[T.untyped]), default: nil + prop :fsm_finish_block, T.nilable(T::Array[T.untyped]), default: nil + prop :fsm_state_finalize, T.nilable(T::Array[T.untyped]), default: nil prop :fsm_finish_value, T.nilable(String), default: nil # --- Dispatch flags --- @@ -50,11 +51,20 @@ class IntrinsicEmit < T::Struct prop :key_alloc, T.nilable(Symbol), default: nil prop :shard_alloc, T.nilable(Symbol), default: nil prop :sharded_alloc, T.nilable(Symbol), default: nil - prop :borrows, T.nilable(Symbol), default: nil + prop :borrows, T.nilable(T.any(Symbol, T::Array[T.untyped])), + default: nil prop :reject_when, T.nilable(Symbol), default: nil prop :bc_op, T.nilable(Symbol), default: nil + prop :error_kind, T.nilable(Symbol), default: nil + prop :error_type, T.nilable(Symbol), default: nil prop :registry, T.nilable(Symbol), default: nil + # elem: transient element-type-name hint (merged at lowering, e.g. + # pool_get_def). fallible_clauses: internal with-block clause + # structure injected at lowering (not authoring DSL). + prop :elem, T.nilable(String), default: nil + prop :fallible_clauses, T.untyped, default: nil + # --- Strings --- prop :lifetime, T.nilable(String), default: nil prop :reject_error, T.nilable(String), default: nil diff --git a/src/annotator-helpers/intrinsic_registry.rb b/src/annotator-helpers/intrinsic_registry.rb index 01e014f88..38d9c9bbf 100644 --- a/src/annotator-helpers/intrinsic_registry.rb +++ b/src/annotator-helpers/intrinsic_registry.rb @@ -15,12 +15,14 @@ module IntrinsicRegistry EMIT_BOOL = %i[bc is_method suspends narrows_collection mutates_receiver allocates takes_value container_borrow].freeze EMIT_STRSYM = %i[zig numeric_zig sharded_zig shard_direct_zig].freeze - EMIT_STR = %i[lifetime reject_error fsm_finish_value].freeze - EMIT_STRARR = %i[fsm_setup fsm_state_decls fsm_finish_block - fsm_state_finalize].freeze + EMIT_STR = %i[lifetime reject_error fsm_finish_value elem].freeze EMIT_SYM = %i[tag builtin alloc return_alloc val_alloc key_alloc - shard_alloc sharded_alloc borrows reject_when - bc_op].freeze + shard_alloc sharded_alloc reject_when bc_op + error_kind error_type].freeze + # Passthrough (no coercion): borrows (:all|Array), fallible_clauses + # (internal), fsm_* (FsmOps op-object arrays, not strings). + EMIT_PASS = %i[borrows fallible_clauses fsm_setup fsm_state_decls + fsm_finish_block fsm_state_finalize].freeze EMIT_SYMARR = %i[value_transforms shard_direct_value_transforms].freeze EMIT_INTARR = %i[takes_args].freeze EMIT_PROC = %i[label].freeze @@ -39,8 +41,8 @@ def build_emit(h, registries) when *EMIT_BOOL then e.public_send("#{k}=", !!v) when *EMIT_STRSYM then e.public_send("#{k}=", v) when *EMIT_STR then e.public_send("#{k}=", v.to_s) - when *EMIT_STRARR then e.public_send("#{k}=", Array(v).map(&:to_s)) when *EMIT_SYM then e.public_send("#{k}=", v.to_sym) + when *EMIT_PASS then e.public_send("#{k}=", v) when *EMIT_SYMARR then e.public_send("#{k}=", Array(v).map(&:to_sym)) when *EMIT_INTARR then e.public_send("#{k}=", Array(v).map(&:to_i)) when *EMIT_PROC then e.public_send("#{k}=", v) diff --git a/src/annotator.rb b/src/annotator.rb index 9566c34c5..6e512f6e9 100644 --- a/src/annotator.rb +++ b/src/annotator.rb @@ -5554,14 +5554,14 @@ def handle_assign_borrow(node) def resolve_borrow_source(call_node) # Path 1: stdlib functions with lifetime: "self" matched_def = call_node.matched_stdlib_def - if matched_def.is_a?(Hash) && matched_def[:lifetime] - lifetime = matched_def[:lifetime] + if matched_def && matched_def.emit&.lifetime + lifetime = matched_def.emit.lifetime if lifetime == "self" && call_node.is_a?(AST::MethodCall) return call_node.object end # Named param lifetime -- find by index in args list args = call_node.is_a?(AST::MethodCall) ? [call_node.object] + call_node.args : call_node.args - arg_types = matched_def[:args] + arg_types = matched_def.arg_spec if arg_types.is_a?(Array) idx = arg_types.index { |a| a.is_a?(Hash) && a[:name] == lifetime } return args[idx] if idx && args[idx] @@ -6491,16 +6491,16 @@ def set_cleanup_alloc!(node) val = node.value if val && (val.is_a?(AST::FuncCall) || val.is_a?(AST::MethodCall)) matched_def = val.matched_stdlib_def - if matched_def.is_a?(Hash) + if matched_def # Borrow returns (lifetime:) need no cleanup -- the caller owns the data - if matched_def[:lifetime] + if matched_def.emit&.lifetime ti.provenance = :borrow return end - ret_alloc = matched_def[:return_alloc] + ret_alloc = matched_def.emit&.return_alloc # For allocating methods without explicit return_alloc, the method's # alloc IS the return alloc (e.g. map.values() on sharded maps). - ret_alloc ||= matched_def[:alloc] if matched_def[:allocates] + ret_alloc ||= matched_def.emit&.alloc if matched_def.emit&.allocates if ret_alloc ti.provenance ||= ret_alloc if [:heap, :frame].include?(ret_alloc) return diff --git a/src/ast/ast.rb b/src/ast/ast.rb index 9f7ee0271..b18ae2e46 100644 --- a/src/ast/ast.rb +++ b/src/ast/ast.rb @@ -3,6 +3,7 @@ require_relative "type" require_relative "schemas" +require_relative "../annotator-helpers/intrinsic_registry" # ========================================== # AST @@ -220,7 +221,9 @@ def zig_pattern=(val); @zig_pattern = T.let(val, T.untyped); end sig { returns(T.untyped) } def matched_stdlib_def; @matched_stdlib_def = T.let(@matched_stdlib_def, T.untyped); end sig { params(val: T.untyped).returns(T.untyped) } - def matched_stdlib_def=(val); @matched_stdlib_def = T.let(val, T.untyped); end + def matched_stdlib_def=(val) + @matched_stdlib_def = T.let(IntrinsicRegistry.fs(val), T.untyped) + end sig { void } def stdlib_allocates; @stdlib_allocates = T.let(@stdlib_allocates, T.untyped); end diff --git a/src/mir/fsm_transform.rb b/src/mir/fsm_transform.rb index 7e04673aa..3d83362f0 100644 --- a/src/mir/fsm_transform.rb +++ b/src/mir/fsm_transform.rb @@ -257,7 +257,7 @@ def suspend_value?(value) return true if value.is_a?(AST::NextExpr) return false unless value.is_a?(AST::FuncCall) || value.is_a?(AST::MethodCall) md = value.matched_stdlib_def - !!(md && md[:suspends] && md[:fsm_setup]) + !!(md && md.emit&.suspends && md.emit&.fsm_setup) end sig { params(name: T.untyped, type_obj: T.untyped).returns(T.nilable(T::Hash[T.untyped, T.untyped])) } diff --git a/src/mir/fsm_transform/segments.rb b/src/mir/fsm_transform/segments.rb index a0e1b1472..7fa43ddf4 100644 --- a/src/mir/fsm_transform/segments.rb +++ b/src/mir/fsm_transform/segments.rb @@ -324,7 +324,7 @@ def classify_suspend(stmt) def io_suspending_call?(call_node) T.bind(self, T.untyped) rescue nil md = call_node.matched_stdlib_def - !!(md && md[:suspends] && md[:fsm_setup]) + !!(md && md.emit&.suspends && md.emit&.fsm_setup) end sig { params(expr: T.untyped).returns(T::Boolean) } diff --git a/src/mir/fsm_transform/suspend_resolvers.rb b/src/mir/fsm_transform/suspend_resolvers.rb index 5e987680c..9656cc9d8 100644 --- a/src/mir/fsm_transform/suspend_resolvers.rb +++ b/src/mir/fsm_transform/suspend_resolvers.rb @@ -61,11 +61,12 @@ def resolve_io(io_tail, ctx, lowering) id = ctx[:id] bg_rt = ctx[:bg_rt] - setup_ops = stdlib_def[:fsm_setup] || [] - finish_block = stdlib_def[:fsm_finish_block] || [] - finish_value = stdlib_def[:fsm_finish_value] - state_decls = stdlib_def[:fsm_state_decls] || [] - state_finalize = stdlib_def[:fsm_state_finalize] || [] + em = stdlib_def.emit + setup_ops = em&.fsm_setup || [] + finish_block = em&.fsm_finish_block || [] + finish_value = em&.fsm_finish_value + state_decls = em&.fsm_state_decls || [] + state_finalize = em&.fsm_state_finalize || [] # Lower call args via the surrounding capture-map context. arg_mirs = (io_tail.call_node.respond_to?(:args) ? diff --git a/src/mir/mir.rb b/src/mir/mir.rb index 01b8d1e1a..3ade9db40 100644 --- a/src/mir/mir.rb +++ b/src/mir/mir.rb @@ -17,6 +17,7 @@ # New nodes here use distinct names to coexist during migration. require "sorbet-runtime" +require_relative "../annotator-helpers/intrinsic_registry" module MIR # Common interface for all MIR nodes. @@ -1794,4 +1795,25 @@ def expr?; true; end :resolved_allocs, :template_kind) do include Expr end + + # Hard flip (EPIC #65): every stdlib_def carrier coerces its payload + # to a FunctionSignature on write. No Hash backdoor -- readers still + # doing entry[:zig]/.dig(:...) will fail loudly, which is the + # intended map of remaining reader-migration work. + module StdlibDefFsCoercion + def stdlib_def=(v) + super(IntrinsicRegistry.fs(v)) + end + + # Struct positional construction (`InlineBc.new(op, args, hash)`) + # assigns the member directly, bypassing the setter -- re-run it + # through the coercing setter so the carrier is always FS. + def initialize(*) + super + self.stdlib_def = stdlib_def + end + end + [RawZig, InlineZig, InlineBc, RawBc, ShardedMapPut, ShardedMapGet].each do |k| + k.prepend(StdlibDefFsCoercion) + end end diff --git a/src/mir/mir_checker.rb b/src/mir/mir_checker.rb index ad0744422..7e763fad6 100644 --- a/src/mir/mir_checker.rb +++ b/src/mir/mir_checker.rb @@ -168,16 +168,16 @@ def owned_return_init?(init) return true if init.is_a?(MIR::TryCatch) && init.heap_provenance if init.is_a?(MIR::InlineZig) || init.is_a?(MIR::RawZig) return false unless stdlib_owned_return?(init) - ret = init.stdlib_def[:return] - return !(ret == :Void || ret.nil?) + ret = init.stdlib_def.return_type + return !ret.void? end false end sig { params(node: T.untyped).returns(T::Boolean) } def stdlib_owned_return?(node) - return false unless node.stdlib_def&.dig(:allocates) - return true if node.stdlib_def[:return_alloc] == :heap + return false unless node.stdlib_def&.emit&.allocates + return true if node.stdlib_def.emit&.return_alloc == :heap return false unless node.is_a?(MIR::InlineZig) allocs = node.allocs @@ -391,8 +391,8 @@ def scan_expr_for_hpt_leak!(node, leaks) "heap-returning try/catch result not bound to variable (leak)") end if (node.is_a?(MIR::InlineZig) || node.is_a?(MIR::RawZig)) && stdlib_owned_return?(node) - ret = node.stdlib_def[:return] - unless ret == :Void || ret.nil? + ret = node.stdlib_def.return_type + unless ret.void? label = node.is_a?(MIR::RawZig) ? "RawZig block" : "stdlib call" leaks << error(:HPT_LEAK, node.reason, "#{label} with allocates:true result not bound to variable (leak)") @@ -750,7 +750,7 @@ def expr_has_frame_alloc?(expr) return false unless expr case expr when MIR::InlineZig - return false if expr.stdlib_def&.dig(:mutates_receiver) + return false if expr.stdlib_def&.emit&.mutates_receiver expr.allocs&.any? { |_k, v| v == :frame } when MIR::DupeSlice, MIR::ConcatStr, MIR::HeapCreate, MIR::AllocSlice, MIR::ContainerInit, MIR::MakeList, MIR::DeepCopy, MIR::CapWrap diff --git a/src/mir/mir_emitter.rb b/src/mir/mir_emitter.rb index e7ebd445b..a53aebce1 100644 --- a/src/mir/mir_emitter.rb +++ b/src/mir/mir_emitter.rb @@ -178,8 +178,8 @@ def emit(node) sig { params(node: MIR::InlineBc).returns(String) } def emit_inline_bc_as_zig(node) entry = node.stdlib_def - raise "emit_inline_bc_as_zig: node has no stdlib_def (:#{node.op})" unless entry && entry[:zig] - pattern = entry[:zig].dup + raise "emit_inline_bc_as_zig: node has no stdlib_def (:#{node.op})" unless entry && entry.emit&.zig + pattern = entry.emit.zig.to_s.dup node.args.each_with_index { |a, i| pattern = pattern.gsub("{#{i}}") { emit(a) } } pattern end @@ -216,7 +216,7 @@ def emit_sharded_map_get(node) def sharded_map_template(node) op = node.stdlib_def kind = node.template_kind || :zig - op[kind] or raise "ShardedMap: op has no :#{kind} template (op keys=#{op.keys})" + op.emit&.public_send(kind) or raise "ShardedMap: op has no :#{kind} template (emit=#{op.emit.inspect})" end sig { params(pattern: String, node: T.untyped).returns(String) } @@ -242,8 +242,8 @@ def sharded_map_substitute_common(pattern, node) sig { params(node: T.untyped).returns(String) } def emit_raw_bc_as_zig(node) entry = node.stdlib_def - raise "emit_raw_bc_as_zig: node has no stdlib_def" unless entry && entry[:zig] - pattern = entry[:zig].dup + raise "emit_raw_bc_as_zig: node has no stdlib_def" unless entry && entry.emit&.zig + pattern = entry.emit.zig.to_s.dup node.args.each_with_index { |a, i| pattern = pattern.gsub("{#{i}}") { emit(a) } } pattern end diff --git a/src/mir/mir_lowering.rb b/src/mir/mir_lowering.rb index b399d2a1d..8f88e008b 100644 --- a/src/mir/mir_lowering.rb +++ b/src/mir/mir_lowering.rb @@ -200,7 +200,7 @@ def mir_allocates?(node) when MIR::InlineZig # Only hoist if the node heap-allocates (stdlib_def allocates: true AND # allocs contains a :heap entry -- frame-only intrinsics are excluded). - return false unless node.stdlib_def&.dig(:allocates) + return false unless node.stdlib_def&.emit&.allocates return true unless node.allocs node.allocs.any? { |_k, v| v == :heap } else @@ -1943,8 +1943,8 @@ def lower_intrinsic(node) # When the entry has an explicit :bc_op, prefer it over the AST name so # the BC dispatch key is decoupled from CLEAR's surface naming # (e.g. fileReadAll -> :file_read_all). - if @target == :bc && node.matched_stdlib_def&.dig(:bc) - op_name = node.matched_stdlib_def[:bc_op] || node.name.to_s.to_sym + if @target == :bc && node.matched_stdlib_def&.emit&.bc + op_name = node.matched_stdlib_def.emit&.bc_op || node.name.to_s.to_sym return MIR::InlineBc.new(op_name, mir_args, node.matched_stdlib_def) end @@ -1954,7 +1954,7 @@ def lower_intrinsic(node) # The {alloc} PLACEHOLDER stays in the pattern -- the emitter substitutes it. resolved_allocs = {} if pattern.include?("{alloc}") - alloc_sym = node.matched_stdlib_def&.dig(:alloc) || :node_storage + alloc_sym = node.matched_stdlib_def&.emit&.alloc || :node_storage # Resolve receiver type: MethodCall -> receiver object; UFCS FuncCall -> first arg receiver_type = if node.is_a?(AST::MethodCall) ti = node.object.type_info rescue nil @@ -1967,7 +1967,7 @@ def lower_intrinsic(node) resolved_allocs[:alloc] = resolved # Wrap non-heap strings at TAKES positions in DupeSlice (visible to MIR checker) - stdlib_args = node.matched_stdlib_def&.dig(:args) + stdlib_args = node.matched_stdlib_def&.arg_spec if stdlib_args.is_a?(Array) raw_args = node.is_a?(AST::MethodCall) ? node.args : node.args[1..] raw_args&.each_with_index do |arg_node, ai| @@ -1999,7 +1999,7 @@ def lower_intrinsic(node) # non-literal args pay nothing. We skip it for `:Any` (anytype) and # for arg specs without a concrete declared type (Hash forms whose # `:type` is missing or :Any). - stdlib_args = node.matched_stdlib_def&.dig(:args) + stdlib_args = node.matched_stdlib_def&.arg_spec if stdlib_args.is_a?(Array) args_zig = args_zig.each_with_index.map do |arg_zig, i| coerce_stdlib_arg(arg_zig, stdlib_args[i]) @@ -4327,9 +4327,9 @@ def lower_static_call(node) # bc:true. Both backends consume the same node: Zig emits via # emit_inline_bc_as_zig (substituting {0}, {1}, ... from stdlib_def[:zig]), # BC dispatches by op symbol in compile_inline_bc. - if node.matched_stdlib_def&.dig(:bc) + if node.matched_stdlib_def&.emit&.bc mir_args = node.args.map { |a| hoist_alloc(lower(a), a) } - return MIR::InlineBc.new(node.matched_stdlib_def[:bc_op], + return MIR::InlineBc.new(node.matched_stdlib_def.emit&.bc_op, mir_args, node.matched_stdlib_def) end @@ -6203,8 +6203,8 @@ def owned_return_transfer_binding?(binding_entry, init) end if init.is_a?(MIR::InlineZig) || init.is_a?(MIR::RawZig) - return false unless init.stdlib_def&.dig(:allocates) - return true if init.stdlib_def[:return_alloc] == :heap + return false unless init.stdlib_def&.emit&.allocates + return true if init.stdlib_def.emit&.return_alloc == :heap return false unless init.is_a?(MIR::InlineZig) allocs = init.allocs diff --git a/src/mir/mir_pass.rb b/src/mir/mir_pass.rb index 4a46f12e8..976e55fec 100644 --- a/src/mir/mir_pass.rb +++ b/src/mir/mir_pass.rb @@ -551,7 +551,7 @@ def bg_exit_needs_string_dupe?(expr, t) return true if t.frame? # No explicit provenance: check the stdlib def for frame allocation. msd = expr.matched_stdlib_def - msd.is_a?(Hash) && msd[:return_alloc] == :frame + !!(msd && msd.emit&.return_alloc == :frame) end # Annotate YieldExpr nodes inside a BgStreamBlock that yield frame-allocated strings. From 5b53dbf690fed6a3061e340c0fd58b5a74262f4e Mon Sep 17 00:00:00 2001 From: Brian Yahn Date: Sat, 16 May 2026 22:57:49 +0000 Subject: [PATCH 30/45] =?UTF-8?q?deslop(epic=20#65):=20finish=20the=20rema?= =?UTF-8?q?ining=2020=20=E2=80=94=20flag-day=20fully=20green?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Root cause of 15/20: owned_return_init? / HPT_LEAK read return_type unconditionally; pre-FS they read only the static `:return` key (absent for collection methods using a `:return_type` Proc), so Proc-resolved returns were correctly NOT owned-returns. Restore that: skip when stdlib_def.return_resolver is set (ownership governed by allocates/borrows). Cleared Pool/sharded codegen, OG move-emission, COPY-union, heap-cleanup, collections.md doc. Remaining 4/20: fsm_suspend_resolvers_spec built a raw-Hash stdlib_def (old contract); migrated it to construct the typed FunctionSignature via IntrinsicRegistry.fs (contract genuinely changed; no production shim). EPIC #65 COMPLETE: stdlib_def/matched_stdlib_def is a FunctionSignature everywhere, zero backdoor. 267 -> 0 real failures. Gates: prspec (only pre-existing fmt parallel flake, passes 12/12 serially), transpile-tests 548/548 0 leaks, fuzz 141/141 clean. Co-Authored-By: Claude Opus 4.7 --- spec/fsm_suspend_resolvers_spec.rb | 7 +++++-- src/mir/mir_checker.rb | 10 +++++++++- 2 files changed, 14 insertions(+), 3 deletions(-) diff --git a/spec/fsm_suspend_resolvers_spec.rb b/spec/fsm_suspend_resolvers_spec.rb index af947eb54..153cf46c7 100644 --- a/spec/fsm_suspend_resolvers_spec.rb +++ b/spec/fsm_suspend_resolvers_spec.rb @@ -7,6 +7,7 @@ require_relative '../src/mir/fsm_ops' require_relative '../src/mir/fsm_transform/segments' require_relative '../src/mir/fsm_transform/suspend_resolvers' +require_relative '../src/annotator-helpers/intrinsic_registry' # Tests for FsmTransform::SuspendResolvers, the per-suspend-kind # resolvers that turn a Segments::*Suspend tail into a @@ -27,8 +28,10 @@ def lower(node); node; end describe "resolve_io" do # Build a fake stdlib_def with a sleep-like fsm_setup template. + # Production stamps go through IntrinsicRegistry.fs -> a typed + # FunctionSignature; this unit test constructs the same shape. let(:stdlib_def) { - { + IntrinsicRegistry.fs({ suspends: true, fsm_setup: [ FsmOps::StmtCall.new( @@ -43,7 +46,7 @@ def lower(node); node; end fsm_state_decls: [ FsmOps::StateFieldDecl.new("rf_fd", "i32", "-1"), ], - } + }) } let(:call_node) { diff --git a/src/mir/mir_checker.rb b/src/mir/mir_checker.rb index 7e763fad6..28567db67 100644 --- a/src/mir/mir_checker.rb +++ b/src/mir/mir_checker.rb @@ -168,6 +168,13 @@ def owned_return_init?(init) return true if init.is_a?(MIR::TryCatch) && init.heap_provenance if init.is_a?(MIR::InlineZig) || init.is_a?(MIR::RawZig) return false unless stdlib_owned_return?(init) + # Receiver-dependent (Proc-resolved) returns -- collection + # intrinsics like pool.insert/get -- are not a static owned- + # return declaration; their ownership is governed by + # allocates/borrows, handled elsewhere. Only a static return + # type counts here (matches pre-FS behavior, which read only + # the static `:return` key). + return false if init.stdlib_def.return_resolver ret = init.stdlib_def.return_type return !ret.void? end @@ -390,7 +397,8 @@ def scan_expr_for_hpt_leak!(node, leaks) leaks << error(:HPT_LEAK, "try-catch", "heap-returning try/catch result not bound to variable (leak)") end - if (node.is_a?(MIR::InlineZig) || node.is_a?(MIR::RawZig)) && stdlib_owned_return?(node) + if (node.is_a?(MIR::InlineZig) || node.is_a?(MIR::RawZig)) && stdlib_owned_return?(node) && + !node.stdlib_def.return_resolver ret = node.stdlib_def.return_type unless ret.void? label = node.is_a?(MIR::RawZig) ? "RawZig block" : "stdlib call" From e902d527743639313c7eaa26900a84853a4fb998 Mon Sep 17 00:00:00 2001 From: Brian Yahn Date: Sat, 16 May 2026 23:03:43 +0000 Subject: [PATCH 31/45] deslop(#65 unitA): real startup-conversion API (sigs/sig) IntrinsicRegistry.sigs(reg): memoized typed view of a whole registry (built once per frozen constant) -> name => FunctionSignature, or Array[FS] for overload sets (STD_LIB charAt). sig(reg,name) typed lookup. Consumers will read this instead of the raw Hash. Additive; gate-clean (specs flaky-fmt only). Co-Authored-By: Claude Opus 4.7 --- src/annotator-helpers/intrinsic_registry.rb | 23 +++++++++++++++++++++ 1 file changed, 23 insertions(+) diff --git a/src/annotator-helpers/intrinsic_registry.rb b/src/annotator-helpers/intrinsic_registry.rb index 38d9c9bbf..0b050d560 100644 --- a/src/annotator-helpers/intrinsic_registry.rb +++ b/src/annotator-helpers/intrinsic_registry.rb @@ -104,6 +104,29 @@ def convert_registry(reg, registries) end end + # Startup conversion (memoized, built once per registry on first + # access — the registries are frozen constants). The typed view of + # a whole registry: name -> FunctionSignature, or + # Array[FunctionSignature] for overload sets (e.g. + # STD_LIB["charAt"]). Consumers read THIS, never the raw Hash. + def sigs(reg) + (@sigs ||= {})[reg.object_id] ||= + reg.each_with_object({}) do |(name, entry), out| + out[name] = + if entry.is_a?(Array) + entry.map { |e| convert_entry(name, e, registries) } + elsif entry.is_a?(Hash) + convert_entry(name, entry, registries) + end + end + end + + # Typed lookup into a registry: reg[name] as FunctionSignature + # (or Array[FS] for overloads, or nil if absent). + def sig(reg, name) + sigs(reg)[name] + end + # Memoized registry map (built lazily from the std_lib constants so # there is no load-order coupling). Used by `fs` so call sites need # not thread the map. From 970b4632fbff5bfab2b5e88d89416bc0c7e25f6d Mon Sep 17 00:00:00 2001 From: Brian Yahn Date: Sat, 16 May 2026 23:16:22 +0000 Subject: [PATCH 32/45] deslop(#65 unitB+C): intrinsic resolution path consumes typed FS find_matching_intrinsic now returns a FunctionSignature (matches on the raw config DSL, converts the winner). FS gains return_spec (the verbatim polymorphic-return union: Type | type-Symbol | infer_* Symbol | Proc | {type:,sync:,ownership:} Hash) so the 4-form host dispatch in visit_IntrinsicFunc reads it typed, losslessly; to_return_type now handles the {type:} Hash form too. Migrated the coupled intrinsic-annotation path to the typed API: visit_IntrinsicFunc, normalize_intrinsic_signature (sig+body), narrow_collection_type! (sig+body), and resolve_typed_method (method_analysis): registry lookup -> IntrinsicRegistry.sig(...) FS, all defn[:k] -> typed accessors, the defn.merge(zig/alloc) override -> dup'd FS+emit. No raw [:key] left on this path; registries stay Hash DSL only as converter input / typo-suggestion key lists. Gates: prspec (flaky-fmt only), transpile 548/548 0 leaks, fuzz 141/141 clean. Co-Authored-By: Claude Opus 4.7 --- src/annotator-helpers/function_analysis.rb | 15 ++--- src/annotator-helpers/function_signature.rb | 9 +++ src/annotator-helpers/intrinsic_registry.rb | 8 +++ src/annotator-helpers/method_analysis.rb | 61 +++++++++++---------- src/annotator.rb | 24 ++++---- 5 files changed, 70 insertions(+), 47 deletions(-) diff --git a/src/annotator-helpers/function_analysis.rb b/src/annotator-helpers/function_analysis.rb index 91701c56c..3767ef7b3 100644 --- a/src/annotator-helpers/function_analysis.rb +++ b/src/annotator-helpers/function_analysis.rb @@ -266,12 +266,12 @@ def resolve_call(node, args) end end - sig { params(config: T::Hash[Symbol, T.untyped]).returns(T.nilable(FunctionSignature)) } + sig { params(config: FunctionSignature).returns(T.nilable(FunctionSignature)) } def normalize_intrinsic_signature(config) T.bind(self, SemanticAnnotator) rescue nil - return nil if config[:args] == :Varargs + return nil if config.arg_spec == :Varargs - params = config[:args].each_with_index.map do |arg_def, i| + params = config.arg_spec.each_with_index.map do |arg_def, i| if arg_def.is_a?(Hash) # Extended format: { type: :Int64, mutable: true, takes: false } { @@ -295,9 +295,9 @@ def normalize_intrinsic_signature(config) FunctionSignature.new( params: params, - return_type: config[:return], + return_type: config.return_spec, intrinsic: true, - zig_pattern: config[:zig] + zig_pattern: config.emit&.zig ) end @@ -1003,10 +1003,10 @@ def reject_arg_type_matches?(arg, kind) pred.call(type) end - sig { params(definitions: T::Array[T.untyped], args: T::Array[T.untyped]).returns(T.nilable(T::Hash[Symbol, T.untyped])) } + sig { params(definitions: T::Array[T.untyped], args: T::Array[T.untyped]).returns(T.untyped) } def find_matching_intrinsic(definitions, args) T.bind(self, SemanticAnnotator) rescue nil - definitions.find do |config| + matched = definitions.find do |config| next true if config[:args] == :Varargs # Varargs accepts anything # Arity check @@ -1031,6 +1031,7 @@ def find_matching_intrinsic(definitions, args) end end end + matched && IntrinsicRegistry.fs(matched) end # Formats intrinsic args for error messages diff --git a/src/annotator-helpers/function_signature.rb b/src/annotator-helpers/function_signature.rb index 5dcc19d37..493fc9631 100644 --- a/src/annotator-helpers/function_signature.rb +++ b/src/annotator-helpers/function_signature.rb @@ -32,6 +32,13 @@ class FunctionSignature # codegen/dispatch metadata (IntrinsicEmit). Keeps `return_type` a # pure Type even for receiver-dependent intrinsics. attr_accessor :return_resolver, :arg_validator, :arg_spec, :arity, :emit + # Verbatim registry return spec (the authoring DSL's polymorphic + # return facility): a static Type, a type Symbol, an `infer_*` + # directive Symbol (host-dispatched via send), a Proc, or a + # { type:, sync:, ownership: } Hash. `return_type` is the + # best-effort static view; consumers needing the full dispatch read + # this. Strongly-typed sum, not T.untyped. + attr_accessor :return_spec # P2: REQUIRES clause as { param_name_string => Set[Symbol] } or nil. # Mirrors FunctionDef#requires; needed at signature level so call-site @@ -103,6 +110,7 @@ def initialize(params:, return_type:, return_lifetime: nil, visibility: nil, @arg_spec = T.let(nil, T.untyped) @arity = T.let(nil, T.nilable(Integer)) @emit = T.let(nil, T.nilable(IntrinsicEmit)) + @return_spec = T.let(nil, T.untyped) end sig { returns(FunctionSignature) } @@ -127,6 +135,7 @@ def dup s.arg_spec = @arg_spec s.arity = @arity s.emit = @emit + s.return_spec = @return_spec end end end diff --git a/src/annotator-helpers/intrinsic_registry.rb b/src/annotator-helpers/intrinsic_registry.rb index 0b050d560..193133804 100644 --- a/src/annotator-helpers/intrinsic_registry.rb +++ b/src/annotator-helpers/intrinsic_registry.rb @@ -72,9 +72,16 @@ def nested_emit(v, registries) # Symbol/String type-name -> Type. Inference/macro directives # (:infer_*, :macro_*) are not type names -> polymorphic placeholder # (the real resolution is a later, consumer-side concern). + # Best-effort STATIC view of the return spec. The verbatim spec is + # kept on fs.return_spec for the full host dispatch. def to_return_type(v) return Type.new(:Void) if v.nil? return Type.new(:Any) if v.is_a?(Proc) + if v.is_a?(Hash) && v[:type] + return Type.new(v[:type], sync: v[:sync], ownership: v[:ownership]) + end + return Type.new(:Any) if v.is_a?(Hash) + s = v.to_s return Type.new(:Any) if s.start_with?("infer_", "macro_") v.is_a?(Type) ? v : Type.new(v) @@ -87,6 +94,7 @@ def convert_entry(_name, h, registries) return_type: to_return_type(ret), intrinsic: true ) + fs.return_spec = ret fs.return_resolver = ret if ret.is_a?(Proc) fs.arg_validator = h[:validate] if h[:validate].is_a?(Proc) fs.arg_spec = h[:args] diff --git a/src/annotator-helpers/method_analysis.rb b/src/annotator-helpers/method_analysis.rb index 1fccedafa..d3b21f879 100644 --- a/src/annotator-helpers/method_analysis.rb +++ b/src/annotator-helpers/method_analysis.rb @@ -28,10 +28,10 @@ def resolve_collection_method(node) # # @param matched_def [Hash] the STD_LIB definition that matched # @param args [Array] the resolved argument nodes - sig { params(matched_def: T::Hash[Symbol, T.untyped], args: T::Array[T.untyped]).returns(T.nilable(Type)) } + sig { params(matched_def: FunctionSignature, args: T::Array[T.untyped]).returns(T.nilable(Type)) } def narrow_collection_type!(matched_def, args) T.bind(self, SemanticAnnotator) rescue nil - return unless matched_def[:narrows_collection] && args.size >= 2 + return unless matched_def.emit&.narrows_collection && args.size >= 2 list_arg = args[0] val_arg = args[1] @@ -57,7 +57,7 @@ def narrow_collection_type!(matched_def, args) sig { params(node: AST::MethodCall, obj_type: Type, registry: T::Hash[String, T::Hash[Symbol, T.untyped]], tag_field: Symbol, type_label: String).returns(T.nilable(T::Boolean)) } def resolve_typed_method(node, obj_type, registry, tag_field, type_label) T.bind(self, SemanticAnnotator) rescue nil - defn = registry[node.name] + defn = IntrinsicRegistry.sig(registry, node.name) unless defn available = registry.keys.join(", ") emit_typo_suggestion!( @@ -70,52 +70,57 @@ def resolve_typed_method(node, obj_type, registry, tag_field, type_label) end # Arity check - if defn[:arity] >= 0 && node.args.length != defn[:arity] - if defn[:arity] == 0 + if defn.arity && defn.arity >= 0 && node.args.length != defn.arity + if defn.arity == 0 error!(node, :STDLIB_METHOD_NO_ARGS, label: type_label, method: node.name, got: node.args.length) else - error!(node, :STDLIB_METHOD_ARITY, label: type_label, method: node.name, expected: defn[:arity], got: node.args.length) + error!(node, :STDLIB_METHOD_ARITY, label: type_label, method: node.name, expected: defn.arity, got: node.args.length) end return true end # Type validation (optional) - if defn[:validate] - defn[:validate].call(node, node.args, obj_type, method(:error!)) + if defn.arg_validator + defn.arg_validator.call(node, node.args, obj_type, method(:error!)) end # Set tag and return type node.send(:"#{tag_field}=", node.name.to_sym) - node.full_type = defn[:return_type].call(obj_type) + node.full_type = defn.return_resolver.call(obj_type) # Resolve zig pattern -- pick variant based on receiver type. # Sharded takes priority over numeric: PartitionedNumericMap shares the # sharded API (count/keys/values/put/get) with PartitionedStringMap. - zig = if (obj_type.sharded? || obj_type.striped?) && defn[:sharded_zig] - defn[:sharded_zig] - elsif obj_type.numeric_map? && !obj_type.sharded? && !obj_type.striped? && defn[:numeric_zig] - defn[:numeric_zig] + em = defn.emit + zig = if (obj_type.sharded? || obj_type.striped?) && em&.sharded_zig + em.sharded_zig + elsif obj_type.numeric_map? && !obj_type.sharded? && !obj_type.striped? && em&.numeric_zig + em.numeric_zig else - defn[:zig] + em&.zig end # Resolve alloc variant for sharded types - alloc = if (obj_type.sharded? || obj_type.striped?) && defn[:sharded_alloc] - defn[:sharded_alloc] + alloc = if (obj_type.sharded? || obj_type.striped?) && em&.sharded_alloc + em.sharded_alloc else - defn[:alloc] + em&.alloc end - # Set zig_pattern and matched_stdlib_def so lower_intrinsic handles emission + # Set zig_pattern and matched_stdlib_def so lower_intrinsic handles + # emission. Override the zig/alloc on a dup'd FS (+ its emit) so + # the shared registry FS is never mutated. if zig - resolved_defn = defn.merge(zig: zig) - resolved_defn = resolved_defn.merge(alloc: alloc) if alloc + resolved_defn = defn.dup + resolved_defn.emit = (resolved_defn.emit ? resolved_defn.emit.dup : IntrinsicEmit.new) + resolved_defn.emit.zig = zig + resolved_defn.emit.alloc = alloc if alloc node.zig_pattern = zig node.matched_stdlib_def = resolved_defn end - node.stdlib_allocates = true if defn[:allocates] - node.mutates_receiver = true if defn[:mutates_receiver] + node.stdlib_allocates = true if em&.allocates + node.mutates_receiver = true if em&.mutates_receiver # Narrow Set element type on first insert (Any[] -> T[]) if tag_field == :set_method && node.name == "insert" && obj_type.element_type&.resolved == :Any && node.args.length == 1 @@ -132,8 +137,8 @@ def resolve_typed_method(node, obj_type, registry, tag_field, type_label) end # Ownership: mark TAKES args as moved (same as function_analysis.rb line 305-310) - if defn[:takes_args] - defn[:takes_args].each do |arg_idx| + if defn.emit&.takes_args + defn.emit.takes_args.each do |arg_idx| arg_node = node.args[arg_idx] next unless arg_node if arg_node.is_a?(AST::Identifier) @@ -144,12 +149,12 @@ def resolve_typed_method(node, obj_type, registry, tag_field, type_label) end # Methods that allocate on the heap -- record so needs_rt is computed correctly. - if defn[:allocates] && current_fn_ctx + if defn.emit&.allocates && current_fn_ctx current_fn_ctx.heap_count += 1 end - node.can_fail = true if defn[:can_fail] || defn[:allocates] - node.error_kind = defn[:error_kind] if defn[:error_kind] - node.error_type = defn[:error_type] if defn[:error_type] + node.can_fail = true if defn.can_fail || defn.emit&.allocates + node.error_kind = defn.emit&.error_kind if defn.emit&.error_kind + node.error_type = defn.emit&.error_type if defn.emit&.error_type true end diff --git a/src/annotator.rb b/src/annotator.rb index 6e512f6e9..60d5de7a8 100644 --- a/src/annotator.rb +++ b/src/annotator.rb @@ -2547,8 +2547,8 @@ def visit_IntrinsicFunc(node, args) # `u32_val.negative?()` where Int64 autocast would otherwise mask # the bug. Generic — keyed by symbol so std_lib.rb stays # declarative and annotator.rb has no per-function logic. - if matched_def[:reject_when] && reject_arg_type_matches?(args.first, matched_def[:reject_when]) - reason = matched_def[:reject_error] || + if matched_def.emit&.reject_when && reject_arg_type_matches?(args.first, matched_def.emit.reject_when) + reason = matched_def.emit&.reject_error || "#{node.name}() is not valid for #{args.first.resolved_type}" error!(node, :INTRINSIC_REJECTED, message: reason) return @@ -2557,7 +2557,7 @@ def visit_IntrinsicFunc(node, args) # 3. Resolve return type (may be dynamic via method call). # Dynamic resolver methods are named `infer_*` to avoid collisions with # Ruby Kernel conversion methods (Integer, String, Array, etc.). - ret = matched_def[:return] + ret = matched_def.return_spec if ret.is_a?(Hash) && ret[:type] # Structured return: { type: :String, sync: :raw } etc. — preserves capabilities. node.full_type = Type.new(ret[:type], sync: ret[:sync], ownership: ret[:ownership]) @@ -2570,21 +2570,21 @@ def visit_IntrinsicFunc(node, args) end # 4. Store Zig pattern and stdlib metadata for transpiler - node.zig_pattern = matched_def[:zig] + node.zig_pattern = matched_def.emit&.zig node.matched_stdlib_def = matched_def - node.stdlib_allocates = true if matched_def[:allocates] - node.mutates_receiver = true if matched_def[:mutates_receiver] - node.can_fail = true if matched_def[:can_fail] || matched_def[:allocates] - node.error_kind = matched_def[:error_kind] if matched_def[:error_kind] - node.error_type = matched_def[:error_type] if matched_def[:error_type] - current_fn_ctx.alloc_count += 1 if current_fn_ctx && (matched_def[:allocates] || matched_def[:can_fail] || matched_def[:needs_rt]) - record_effect(EffectTracker::SUSPENDS) if matched_def[:suspends] + node.stdlib_allocates = true if matched_def.emit&.allocates + node.mutates_receiver = true if matched_def.emit&.mutates_receiver + node.can_fail = true if matched_def.can_fail || matched_def.emit&.allocates + node.error_kind = matched_def.emit&.error_kind if matched_def.emit&.error_kind + node.error_type = matched_def.emit&.error_type if matched_def.emit&.error_type + current_fn_ctx.alloc_count += 1 if current_fn_ctx && (matched_def.emit&.allocates || matched_def.can_fail || matched_def.needs_rt) + record_effect(EffectTracker::SUSPENDS) if matched_def.emit&.suspends # 5. Flag mutable access through list indexing. # When a mutating intrinsic (e.g., append, remove) is called on a receiver # that chains through a GetIndex, the GetIndex must emit pointer access # instead of by-value getAt(). - if matched_def[:mutates_receiver] && node.is_a?(AST::MethodCall) + if matched_def.emit&.mutates_receiver && node.is_a?(AST::MethodCall) mark_chain_needs_mut_ref!(node.object) root = chain_root_name(node.object) mark_var_mutated(root) if root From 3ab1a3c6c51fdbb73206aa445ed08f298a1d55ce Mon Sep 17 00:00:00 2001 From: Brian Yahn Date: Sat, 16 May 2026 23:20:46 +0000 Subject: [PATCH 33/45] deslop(#65 unitD): static-method path consumes typed FS visit_StaticCall: method_def via IntrinsicRegistry.sig(static_methods, name) (FS); arg_spec/return_spec/emit.* typed accessors; no raw [:key]. schema[:static_methods] stays the Hash DSL only as converter input / typo-suggestion keys. Gates: prspec flaky-fmt only, transpile 548/548 0 leaks, fuzz 141/141. Co-Authored-By: Claude Opus 4.7 --- src/annotator.rb | 24 ++++++++++++------------ 1 file changed, 12 insertions(+), 12 deletions(-) diff --git a/src/annotator.rb b/src/annotator.rb index 60d5de7a8..720ca86cd 100644 --- a/src/annotator.rb +++ b/src/annotator.rb @@ -2331,7 +2331,7 @@ def visit_StaticCall(node) end static_methods = schema[:static_methods] || {} - method_def = static_methods[node.method_name] + method_def = IntrinsicRegistry.sig(static_methods, node.method_name) unless method_def available = static_methods.keys.join(", ") @@ -2354,7 +2354,7 @@ def visit_StaticCall(node) end end - expected_args = method_def[:args] + expected_args = method_def.arg_spec if node.args.length != expected_args.length error!(node, :STATIC_ARITY, type: type_name, method: node.method_name, expected: expected_args.length, got: node.args.length) end @@ -2366,17 +2366,17 @@ def visit_StaticCall(node) end end - node.zig_pattern = method_def[:zig] - node.full_type = method_def[:return] + node.zig_pattern = method_def.emit&.zig + node.full_type = method_def.return_spec node.matched_stdlib_def = method_def - node.stdlib_allocates = true if method_def[:allocates] - node.mutates_receiver = true if method_def[:mutates_receiver] - node.can_fail = true if method_def[:can_fail] - node.error_kind = method_def[:error_kind] if method_def[:error_kind] - node.error_type = method_def[:error_type] if method_def[:error_type] - current_fn_ctx.alloc_count += 1 if current_fn_ctx && (method_def[:allocates] || method_def[:can_fail]) - - if method_def[:mutates_receiver] && node.is_a?(AST::MethodCall) + node.stdlib_allocates = true if method_def.emit&.allocates + node.mutates_receiver = true if method_def.emit&.mutates_receiver + node.can_fail = true if method_def.can_fail + node.error_kind = method_def.emit&.error_kind if method_def.emit&.error_kind + node.error_type = method_def.emit&.error_type if method_def.emit&.error_type + current_fn_ctx.alloc_count += 1 if current_fn_ctx && (method_def.emit&.allocates || method_def.can_fail) + + if method_def.emit&.mutates_receiver && node.is_a?(AST::MethodCall) root = chain_root_name(node.object) mark_var_mutated(root) if root end From 39ee739c8102793bddd0086077ac69091001ea71 Mon Sep 17 00:00:00 2001 From: Brian Yahn Date: Sat, 16 May 2026 23:24:51 +0000 Subject: [PATCH 34/45] deslop(#65 unitE/F/G): last raw-registry consumers -> typed FS mir_lowering: pool_get_def via IntrinsicRegistry.sig(POOL_METHODS, "get") dup'd + emit.elem; emit_builtin via sig(BUILTIN_OPS,name), emit.bc/emit.zig typed. pipeline_rewriter: all STD_LIB[...] lookups -> IntrinsicRegistry.sig (FS), emit&.zig, no [:zig]. method_rewriter: stdlib_method_names iterates IntrinsicRegistry.sigs(registry) FS; fsm_lowered? reads emit.suspends/fsm_* typed. Gates: prspec flaky-fmt only, transpile 548/548 0 leaks, fuzz 141/141. Co-Authored-By: Claude Opus 4.7 --- src/backends/pipeline_rewriter.rb | 19 ++++++++++--------- src/mir/mir_lowering.rb | 10 ++++++---- src/tools/method_rewriter.rb | 28 +++++++++++++++++++++++----- 3 files changed, 39 insertions(+), 18 deletions(-) diff --git a/src/backends/pipeline_rewriter.rb b/src/backends/pipeline_rewriter.rb index faf0ac6a6..7a6c4c95f 100644 --- a/src/backends/pipeline_rewriter.rb +++ b/src/backends/pipeline_rewriter.rb @@ -234,9 +234,10 @@ def rewrite_pipeline(node) call = AST::FuncCall.new(rhs.token, rhs.name, [lhs_node]) call.full_type = node.full_type call.storage = node.storage - config = STD_LIB[rhs.name] + config = IntrinsicRegistry.sig(STD_LIB, rhs.name) if config - call.zig_pattern = config.is_a?(Array) ? config.first[:zig] : config[:zig] + sig0 = config.is_a?(Array) ? config.first : config + call.zig_pattern = sig0.emit&.zig end return call end @@ -681,8 +682,8 @@ def build_terminal_action(terminal, current_val, res_var, token, res_type = nil) append = AST::MethodCall.new(token, res_ident, "append", [inner_it.dup]) append.full_type = Type.new(:Void) - append.zig_pattern = STD_LIB["append"][:zig] - append.matched_stdlib_def = STD_LIB["append"] + append.zig_pattern = IntrinsicRegistry.sig(STD_LIB, "append").emit&.zig + append.matched_stdlib_def = IntrinsicRegistry.sig(STD_LIB, "append") # Iterate directly over the expression (avoids ArrayList/slice confusion). # Mark collection as a slice so the transpiler uses &expr, not .items. @@ -697,19 +698,19 @@ def build_terminal_action(terminal, current_val, res_var, token, res_type = nil) insert_call = AST::MethodCall.new(token, res_ident.dup, "insert", [key_expr]) insert_call.full_type = Type.new(:Void) insert_call.zig_pattern = "try {0}.insert({alloc}, {1})" - insert_call.matched_stdlib_def = STD_LIB["insert"] if STD_LIB.key?("insert") + insert_call.matched_stdlib_def = IntrinsicRegistry.sig(STD_LIB, "insert") if STD_LIB.key?("insert") [insert_call] when nil, AST::SelectOp, AST::WhereOp, AST::TapOp, AST::TakeWhileOp # Produces a list call = AST::MethodCall.new(token, res_ident, "append", [current_val.dup]) call.full_type = Type.new(:Void) - call.zig_pattern = STD_LIB["append"][:zig] - call.matched_stdlib_def = STD_LIB["append"] + call.zig_pattern = IntrinsicRegistry.sig(STD_LIB, "append").emit&.zig + call.matched_stdlib_def = IntrinsicRegistry.sig(STD_LIB, "append") [call] else call = AST::MethodCall.new(token, res_ident, "append", [current_val.dup]) - call.zig_pattern = STD_LIB["append"][:zig] - call.matched_stdlib_def = STD_LIB["append"] + call.zig_pattern = IntrinsicRegistry.sig(STD_LIB, "append").emit&.zig + call.matched_stdlib_def = IntrinsicRegistry.sig(STD_LIB, "append") [call] end end diff --git a/src/mir/mir_lowering.rb b/src/mir/mir_lowering.rb index 8f88e008b..d9171d848 100644 --- a/src/mir/mir_lowering.rb +++ b/src/mir/mir_lowering.rb @@ -5364,7 +5364,9 @@ def lower_get_index(node) # `IF pool[id] AS env`. elem_t = (ti.is_a?(Type) ? ti : Type.new(ti)).element_type elem_name = elem_t.respond_to?(:resolved) ? T.must(elem_t).resolved.to_s : elem_t.to_s - pool_get_def = POOL_METHODS["get"].merge(elem: elem_name) + pool_get_def = IntrinsicRegistry.sig(POOL_METHODS, "get").dup + pool_get_def.emit = (pool_get_def.emit ? pool_get_def.emit.dup : IntrinsicEmit.new) + pool_get_def.emit.elem = elem_name return MIR::InlineBc.new(:get, [target, index], pool_get_def) elsif ti&.set_collection? # @set[item]: membership check — returns ?T (item if present, null otherwise) @@ -7388,12 +7390,12 @@ def collect_identifier_names(nodes) # with stdlib_def attached so the MIR checker can verify ownership. sig { params(name: Symbol, args: T::Array[T.untyped]).returns(T.any(MIR::InlineBc, MIR::InlineZig)) } def emit_builtin(name, args) - entry = BUILTIN_OPS[name] + entry = IntrinsicRegistry.sig(BUILTIN_OPS, name) raise "emit_builtin: unknown builtin :#{name}" unless entry - if @target == :bc && entry[:bc] + if @target == :bc && entry.emit&.bc return MIR::InlineBc.new(name, args, entry) end - pattern = entry[:zig].dup + pattern = entry.emit&.zig.to_s.dup # Use block form of gsub so backslashes in Zig code (e.g. "\\" for a literal # backslash) are not interpreted as replacement specials by String#gsub. args.each_with_index { |a, i| code = emit_expr(a); pattern = pattern.gsub("{#{i}}") { code } } diff --git a/src/tools/method_rewriter.rb b/src/tools/method_rewriter.rb index c836e2a88..425ea7324 100644 --- a/src/tools/method_rewriter.rb +++ b/src/tools/method_rewriter.rb @@ -1,4 +1,6 @@ # typed: strict +require "sorbet-runtime" + require 'set' require_relative '../ast/lexer' require_relative '../ast/parser' @@ -20,8 +22,11 @@ # Nested METHOD calls (`length(filter(xs, p))` with both METHODs) # rewrite inside-out to method chains (`xs.filter(p).length()`). module MethodRewriter + extend T::Sig + module_function + sig { params(source: String).returns(String) } def rewrite(source) tokens = ::Lexer.new(source).tokenize ast = ::Parser.new(tokens, source).parse @@ -45,6 +50,7 @@ def rewrite(source) # User declarations always take precedence over stdlib — if the # user wrote `FN length(xs) -> ...`, calls to `length(xs)` stay in # prefix form regardless of stdlib's flag. + sig { params(ast: AST::Program).returns(Set) } def collect_method_names(ast) user_methods = Set.new user_fns = Set.new @@ -85,17 +91,18 @@ def walk_collect_user_decls(node, methods, fns) -> { MAP_METHODS rescue nil }, ].freeze + sig { returns(Set) } def stdlib_method_names @stdlib_method_names ||= begin names = Set.new STDLIB_REGISTRIES.each do |loader| registry = loader.call next unless registry.is_a?(Hash) - registry.each do |name, defs| + IntrinsicRegistry.sigs(registry).each do |name, defs| list = defs.is_a?(Array) ? defs : [defs] list.each do |d| - next unless d.is_a?(Hash) - next unless d[:is_method] + next unless d.is_a?(FunctionSignature) + next unless d.emit&.is_method # Skip stdlib functions whose Zig lowering is FSM-based # (suspending I/O calls like readFile / writeFile / accept). # Their MIR/FSM lowering reads the call's positional args @@ -119,9 +126,12 @@ def stdlib_method_names # yields, and the `fsm_*` keys carry the templates the FSM emitter # reads. Either alone wouldn't be enough — `suspends: true` is also # set on plain async helpers that don't go through FSM. + sig { params(defn: FunctionSignature).returns(T::Boolean) } def fsm_lowered?(defn) - return false unless defn[:suspends] - defn.keys.any? { |k| k.to_s.start_with?("fsm_") } + em = defn.emit + return false unless em&.suspends + !!(em.fsm_setup || em.fsm_state_decls || em.fsm_finish_block || + em.fsm_state_finalize || em.fsm_finish_value) end # Post-order walk: collect edits for inner calls first so outer @@ -156,6 +166,7 @@ def walk_collect_edits(node, methods, source, edits) # (e.g., contains a comment we'd rather not move). Source span is # the byte range from the start of the callee name to the closing # `)`, inclusive. + sig { params(call: AST::FuncCall, source: String).returns(T.nilable(Hash)) } def compute_edit(call, source) start_off = offset_for(source, call.token.line, call.token.column) return nil unless start_off @@ -242,6 +253,7 @@ def needs_parens?(node, text) # ---- Source / span helpers ---- + sig { params(source: String, line: Integer, col: Integer).returns(Integer) } def offset_for(source, line, col) return nil if line < 1 || col < 1 off = 0 @@ -257,6 +269,7 @@ def offset_for(source, line, col) target end + sig { params(source: String, off: Integer).returns(Integer) } def next_non_ws(source, off) while off < source.length && (source[off] == ' ' || source[off] == "\t") off += 1 @@ -267,6 +280,7 @@ def next_non_ws(source, off) # Find matching ')' for '(' at `open_off`, respecting nested parens, # brackets, braces, and string literals. Returns the byte offset of # the matching ')' or nil if unbalanced. + sig { params(source: String, open_off: Integer).returns(Integer) } def match_paren(source, open_off) depth = 0 i = open_off @@ -313,6 +327,7 @@ def match_paren(source, open_off) # Split args_text into [start, end_exclusive] spans by top-level # commas. Respects nested parens / brackets / braces and strings. + sig { params(args_text: String).returns(Array) } def split_args_by_comma(args_text) spans = [] depth = 0 @@ -368,6 +383,7 @@ def split_args_by_comma(args_text) # produces inside-out edits which are nested (overlapping). To get # the chain rewrite (`xs.filter(p).length()`) we apply the inner # edit first to the *replacement string* of the outer edit. + sig { params(source: String, edits: Array).returns(String) } def apply_edits(source, edits) # Post-order has inner edits first. Group: an inner edit is one # whose span is strictly inside an outer edit's span. Process by @@ -385,6 +401,7 @@ def apply_edits(source, edits) # the inner's original source text). Returns a flat list of # non-overlapping outer edits with replacements that include all # inner rewrites embedded. + sig { params(edits: Array, source: String).returns(Array) } def resolve_nested_edits(edits, source) outers = [] edits.each do |e| @@ -404,6 +421,7 @@ def resolve_nested_edits(edits, source) outers end + sig { params(source: String, edits: Array).returns(String) } def apply_flat_edits(source, edits) return source if edits.empty? # Apply right-to-left so unaffected positions remain valid. From 1bd62c7bd793c4cdf23375b6750cf111a545475a Mon Sep 17 00:00:00 2001 From: Brian Yahn Date: Sat, 16 May 2026 23:27:03 +0000 Subject: [PATCH 35/45] deslop(#65): correct diagnostic prose referencing old matched_def[:k] The reject_when diagnostic summary referenced the pre-migration Hash mechanism; reworded to the typed signature. Final verification: zero raw-registry [:key] behavior reads; zero residual def [:key] code reads; the remaining T::Hash sigs are per-arg param descriptors (FunctionSignature#params elements, legitimately Hash). Gates: prspec flaky-fmt only (12/12 serial), transpile 548/548 0 leaks, fuzz 141/141. Co-Authored-By: Claude Opus 4.7 --- src/ast/diagnostic_registry.rb | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/ast/diagnostic_registry.rb b/src/ast/diagnostic_registry.rb index 151386282..9f21a13c4 100644 --- a/src/ast/diagnostic_registry.rb +++ b/src/ast/diagnostic_registry.rb @@ -2456,7 +2456,7 @@ module DiagnosticRegistry INTRINSIC_REJECTED: { severity: :error, category: :type, template: "%{message}", - summary: "Stdlib intrinsic rejected this call (matched_def[:reject_when] fired).", + summary: "Stdlib intrinsic rejected this call (the matched signature's reject_when fired).", cause: "A stdlib intrinsic (`.negative?`, `.zero?`, ...) rejected this call because the argument type isn't allowed. The stdlib uses `reject_when` patterns to rule out call shapes that look valid but produce wrong results — e.g. `.negative?` on an unsigned int.", fix_hint: "Check the message for the specific reject reason. Often the fix is to remove the call entirely (the answer is statically known) or use a different intrinsic.", }, From 3a448711c3a95fe3abcb865145e3bef1587f5427 Mon Sep 17 00:00:00 2001 From: Brian Yahn Date: Sat, 16 May 2026 23:55:22 +0000 Subject: [PATCH 36/45] Add FunctionReturn typed return model (Step 1: foundation) Introduces FunctionReturn (T::Struct + Kind T::Enum), the strongly-typed replacement for the untyped return union (Type | Symbol | nil | Proc | Hash) and the std_lib `return_type: ->(recv){...}` Procs. Every return is one of a closed set of variants; `resolve(receiver, args, host)` always yields a concrete non-nil Type. Wired additively into FunctionSignature as `return_def` (non-nil, defaults to Fixed(Void)); existing return_type/return_spec/ return_resolver paths are untouched. Subsequent steps migrate the std_lib registry descriptors and resolution sites onto it. Defers type.rb's require of function_signature to after `class Type` is fully defined so the function_signature -> function_return -> type require cycle resolves with Type present (FunctionReturn's `const :fixed, T.nilable(Type)` evaluates at class-body time). All Type refs to FunctionSignature are runtime-lazy, so deferral is safe. Gates: prspec 4786/1 (pre-existing fmt parallel flake only), transpile-tests 548/548 0 leaks, fuzz matrix 141/141. Co-Authored-By: Claude Opus 4.7 --- src/annotator-helpers/function_return.rb | 87 +++++++++++++++++++++ src/annotator-helpers/function_signature.rb | 9 +++ src/ast/type.rb | 10 ++- 3 files changed, 104 insertions(+), 2 deletions(-) create mode 100644 src/annotator-helpers/function_return.rb diff --git a/src/annotator-helpers/function_return.rb b/src/annotator-helpers/function_return.rb new file mode 100644 index 000000000..706573368 --- /dev/null +++ b/src/annotator-helpers/function_return.rb @@ -0,0 +1,87 @@ +# typed: strict +# Strongly-typed representation of a function's return. +# +# Replaces the untyped `return_type` union (Type | Symbol | nil | +# Proc | Hash) and the std_lib `return_type: ->(recv){...}` Procs. +# Every return is one of a closed set of variants; `resolve` always +# yields a concrete non-nil Type. No Proc, no Hash, no nil. +# +# Fixed -> a concrete Type (covers all static returns, +# incl. {type:,sync:} -> Type.new with caps, +# and the implicit-Void case) +# ElementOf -> receiver.element_type +# OptionalOfElement -> ?element_type +# IdOfElement -> Id +# OptionalOfValue -> ?value_type +# ValueList -> value_type[]@list +# KeyList -> key_type[]@list +# Infer -> a host inference method (bounded Symbol set: +# infer_element_type / infer_optional_element_type +# / infer_map_return_type) -- a typed variant, +# not a Proc; resolve dispatches via the host. +require "sorbet-runtime" +require_relative "../ast/type" + +class FunctionReturn < T::Struct + extend T::Sig + + class Kind < T::Enum + enums do + Fixed = new("fixed") + ElementOf = new("element_of") + OptionalOfElement = new("optional_of_element") + IdOfElement = new("id_of_element") + OptionalOfValue = new("optional_of_value") + ValueList = new("value_list") + KeyList = new("key_list") + Infer = new("infer") + end + end + + const :kind, Kind + # Payload for Fixed only (the concrete return Type). For every + # parametric variant this is nil because the Type is computed from + # the receiver at resolve time -- that is the variant's whole point, + # not an "untyped" hole. + const :fixed, T.nilable(Type), default: nil + # Payload for Infer only: the host inference method name (bounded). + const :infer, T.nilable(Symbol), default: nil + + sig { params(t: Type).returns(FunctionReturn) } + def self.fixed(t) = new(kind: Kind::Fixed, fixed: t) + + sig { params(m: Symbol).returns(FunctionReturn) } + def self.infer(m) = new(kind: Kind::Infer, infer: m) + + # Resolve to a concrete Type. receiver is the call's receiver type + # (for parametric shapes); args/host support the Infer variant's + # host-method dispatch. Always returns a Type, never nil. + sig do + params(receiver: T.nilable(Type), args: T::Array[T.untyped], + host: T.untyped).returns(Type) + end + def resolve(receiver, args = [], host = nil) + case kind + when Kind::Fixed + T.must(fixed) + when Kind::ElementOf + el = receiver&.element_type + el.is_a?(Type) ? el : Type.new(el || :Any) + when Kind::OptionalOfElement + Type.new(:"?#{T.must(receiver).element_type.resolved}") + when Kind::IdOfElement + Type.new(:"Id<#{T.must(receiver).element_type.resolved}>") + when Kind::OptionalOfValue + Type.new(:"?#{T.must(receiver).value_type.resolved}") + when Kind::ValueList + Type.new(:"#{T.must(receiver).value_type.resolved}[]@list") + when Kind::KeyList + Type.new(:"#{T.must(receiver).key_type.resolved}[]@list") + when Kind::Infer + r = host.send(T.must(infer), args, nil) + r.is_a?(Type) ? r : Type.new(r || :Any) + else + Type.new(:Any) + end + end +end diff --git a/src/annotator-helpers/function_signature.rb b/src/annotator-helpers/function_signature.rb index 493fc9631..fec9b97cd 100644 --- a/src/annotator-helpers/function_signature.rb +++ b/src/annotator-helpers/function_signature.rb @@ -7,6 +7,7 @@ # need for code generation and cleanup planning. require "sorbet-runtime" require_relative "intrinsic_emit" +require_relative "function_return" class FunctionSignature extend T::Sig @@ -39,6 +40,11 @@ class FunctionSignature # best-effort static view; consumers needing the full dispatch read # this. Strongly-typed sum, not T.untyped. attr_accessor :return_spec + # Strongly-typed return (FunctionReturn). Non-nil; defaults to + # Fixed(Void). Supersedes return_spec/return_resolver/the Symbol|nil + # return_type union -- resolve(receiver,args,host) always yields a + # concrete Type. + attr_accessor :return_def # P2: REQUIRES clause as { param_name_string => Set[Symbol] } or nil. # Mirrors FunctionDef#requires; needed at signature level so call-site @@ -111,6 +117,8 @@ def initialize(params:, return_type:, return_lifetime: nil, visibility: nil, @arity = T.let(nil, T.nilable(Integer)) @emit = T.let(nil, T.nilable(IntrinsicEmit)) @return_spec = T.let(nil, T.untyped) + @return_def = T.let(FunctionReturn.fixed(Type.new(:Void)), + FunctionReturn) end sig { returns(FunctionSignature) } @@ -136,6 +144,7 @@ def dup s.arity = @arity s.emit = @emit s.return_spec = @return_spec + s.return_def = @return_def end end end diff --git a/src/ast/type.rb b/src/ast/type.rb index de505af9e..d02b44135 100644 --- a/src/ast/type.rb +++ b/src/ast/type.rb @@ -1,8 +1,6 @@ # typed: strict require "sorbet-runtime" -require_relative "../annotator-helpers/function_signature" - # Result struct for binary operation type resolution BinaryOpResult = Struct.new(:type, :left_coercion, :right_coercion, :storage, :error, keyword_init: true) @@ -2332,3 +2330,11 @@ def check_prefixed_int_range!(node, effective_type) end end + +# Loaded after `class Type` is fully defined so the +# function_signature -> function_return -> type require cycle resolves +# with `Type` already present (function_return's `const :fixed, +# T.nilable(Type)` evaluates at class-body time). All Type refs to +# FunctionSignature are runtime-lazy (method bodies), so deferring +# this require is safe. +require_relative "../annotator-helpers/function_signature" From bced16637fd9d52bb29b297f56ff6db686805986 Mon Sep 17 00:00:00 2001 From: Brian Yahn Date: Sun, 17 May 2026 00:19:51 +0000 Subject: [PATCH 37/45] Make FunctionReturn the live return path; delete return_spec/resolver (Step 2) The untyped intrinsic return union (Symbol | Hash | Proc | nil) is eliminated. The std_lib registries stay Hash-authored, but every `return_type:`/`:return` descriptor is now declarative: - constant Procs (->(_) { :Bool }) -> the bare type Symbol/Hash - receiver-parametric Procs -> r_* directives (r_element_of, r_optional_element, r_id_element, r_optional_value, r_value_list, r_key_list) mapped to FunctionReturn variants - the toList args-lambda -> :infer_to_list host method - :infer_* stays Infer(host method) IntrinsicRegistry#to_return_def converts each descriptor to a typed FunctionReturn (raises on any stray Proc); fs.return_type is now derived from it (Fixed -> concrete Type, else :Any placeholder), making FunctionReturn the single source of truth. All five resolution sites switched to return_def.resolve(receiver, args, host): collection methods (method_analysis), STD_LIB return (annotator), visit_StaticCall, visit_GetIndex/INDEX_OPS, and normalize_intrinsic_signature. MIRChecker's HPT_LEAK gates now use the typed FunctionSignature#fixed_return? predicate instead of the return_resolver Proc. FunctionSignature#return_spec and #return_resolver are deleted entirely (no readers remain). infer_* host methods take a nilable node (resolve dispatches without a call node). intrinsic_registry_spec migrated to assert the FunctionReturn variant model. Gates: prspec 4786/1 (pre-existing fmt parallel flake only), transpile-tests 548/548 0 leaks (6 pre-existing ast_node sig logs unchanged from baseline), fuzz matrix 141/141. Co-Authored-By: Claude Opus 4.7 --- spec/intrinsic_registry_spec.rb | 25 +++++- src/annotator-helpers/function_analysis.rb | 2 +- src/annotator-helpers/function_return.rb | 7 ++ src/annotator-helpers/function_signature.rb | 33 ++++---- src/annotator-helpers/intrinsic_registry.rb | 67 +++++++++++----- src/annotator-helpers/method_analysis.rb | 2 +- src/annotator.rb | 40 ++++++---- src/ast/std_lib.rb | 84 +++++++++------------ src/mir/mir_checker.rb | 4 +- 9 files changed, 158 insertions(+), 106 deletions(-) diff --git a/spec/intrinsic_registry_spec.rb b/spec/intrinsic_registry_spec.rb index 675dc3266..390715586 100644 --- a/spec/intrinsic_registry_spec.rb +++ b/spec/intrinsic_registry_spec.rb @@ -27,15 +27,32 @@ end end - it "yields a pure Type return_type and Proc resolver fidelity" do + it "yields a pure Type return_type and a typed FunctionReturn (no Proc/Hash)" do REGISTRIES.each_value do |reg| reg.each do |mname, entry| next unless entry.is_a?(Hash) fs = IntrinsicRegistry.convert_entry(mname, entry, REGISTRIES) expect(fs.return_type).to be_a(Type) + expect(fs.return_def).to be_a(FunctionReturn) src = entry.key?(:return_type) ? entry[:return_type] : entry[:return] - expect(fs.return_resolver).to be_a(Proc) if src.is_a?(Proc) + # No Proc/Hash leakage: every descriptor maps to a closed + # FunctionReturn variant, and the static return_type matches + # the FunctionReturn for the Fixed case. + expect(src).not_to be_a(Proc) + if fs.return_def.kind == FunctionReturn::Kind::Fixed + expect(fs.return_type).to eq(fs.return_def.fixed) + else + expect(fs.return_type.resolved).to eq(:Any) + end + case src + when :r_element_of + expect(fs.return_def.kind).to eq(FunctionReturn::Kind::ElementOf) + when :r_id_element + expect(fs.return_def.kind).to eq(FunctionReturn::Kind::IdOfElement) + when :r_optional_value + expect(fs.return_def.kind).to eq(FunctionReturn::Kind::OptionalOfValue) + end expect(fs.emit).to be_a(IntrinsicEmit).or be_nil expect(fs.intrinsic).to be(true) end @@ -49,7 +66,9 @@ expect(fs.emit.tag).to eq(:pool_method) expect(fs.emit.is_method).to be(true) expect(fs.emit.zig).to be_a(String) - expect(fs.return_resolver).to be_a(Proc) + # POOL_METHODS["insert"] returns `Id` -> IdOfElement variant. + expect(fs.return_def).to be_a(FunctionReturn) + expect(fs.return_def.kind).to eq(FunctionReturn::Kind::IdOfElement) # Nested recursive sub-descriptor (eql/cleanup/... -> IntrinsicEmit) nested = REGISTRIES.each_value.flat_map(&:values) diff --git a/src/annotator-helpers/function_analysis.rb b/src/annotator-helpers/function_analysis.rb index 3767ef7b3..b1ce40a66 100644 --- a/src/annotator-helpers/function_analysis.rb +++ b/src/annotator-helpers/function_analysis.rb @@ -295,7 +295,7 @@ def normalize_intrinsic_signature(config) FunctionSignature.new( params: params, - return_type: config.return_spec, + return_type: config.return_type, intrinsic: true, zig_pattern: config.emit&.zig ) diff --git a/src/annotator-helpers/function_return.rb b/src/annotator-helpers/function_return.rb index 706573368..312d946fe 100644 --- a/src/annotator-helpers/function_return.rb +++ b/src/annotator-helpers/function_return.rb @@ -53,6 +53,13 @@ def self.fixed(t) = new(kind: Kind::Fixed, fixed: t) sig { params(m: Symbol).returns(FunctionReturn) } def self.infer(m) = new(kind: Kind::Infer, infer: m) + # A receiver-parametric variant (ElementOf / OptionalOfElement / + # IdOfElement / OptionalOfValue / ValueList / KeyList) by Kind + # constant name. No payload -- the Type is computed from the + # receiver at resolve time. + sig { params(kind_name: Symbol).returns(FunctionReturn) } + def self.variant(kind_name) = new(kind: Kind.const_get(kind_name)) + # Resolve to a concrete Type. receiver is the call's receiver type # (for parametric shapes); args/host support the Infer variant's # host-method dispatch. Always returns a Type, never nil. diff --git a/src/annotator-helpers/function_signature.rb b/src/annotator-helpers/function_signature.rb index fec9b97cd..a7aca241a 100644 --- a/src/annotator-helpers/function_signature.rb +++ b/src/annotator-helpers/function_signature.rb @@ -27,23 +27,14 @@ class FunctionSignature attr_accessor :intrinsic, :zig_pattern # Intrinsic signature semantics (set by the registry converter; nil - # for ordinary user functions). `return_resolver` is the polymorphic - # return Proc (receiver-type -> Type); `arg_validator` the custom - # arg type-checker; `arg_spec` the raw args shape; `emit` the typed - # codegen/dispatch metadata (IntrinsicEmit). Keeps `return_type` a - # pure Type even for receiver-dependent intrinsics. - attr_accessor :return_resolver, :arg_validator, :arg_spec, :arity, :emit - # Verbatim registry return spec (the authoring DSL's polymorphic - # return facility): a static Type, a type Symbol, an `infer_*` - # directive Symbol (host-dispatched via send), a Proc, or a - # { type:, sync:, ownership: } Hash. `return_type` is the - # best-effort static view; consumers needing the full dispatch read - # this. Strongly-typed sum, not T.untyped. - attr_accessor :return_spec + # for ordinary user functions). `arg_validator` the custom arg + # type-checker; `arg_spec` the raw args shape; `emit` the typed + # codegen/dispatch metadata (IntrinsicEmit). + attr_accessor :arg_validator, :arg_spec, :arity, :emit # Strongly-typed return (FunctionReturn). Non-nil; defaults to - # Fixed(Void). Supersedes return_spec/return_resolver/the Symbol|nil - # return_type union -- resolve(receiver,args,host) always yields a - # concrete Type. + # Fixed(Void). The single return facility -- resolve(receiver, + # args, host) always yields a concrete Type. Replaced the former + # untyped return_spec (Symbol|Hash|Proc|nil) / return_resolver Proc. attr_accessor :return_def # P2: REQUIRES clause as { param_name_string => Set[Symbol] } or nil. @@ -111,16 +102,20 @@ def initialize(params:, return_type:, return_lifetime: nil, visibility: nil, @return_strategy = T.let(nil, T.untyped) @stack_tier = T.let(nil, T.untyped) @requires = T.let(nil, T.untyped) - @return_resolver = T.let(nil, T.nilable(Proc)) @arg_validator = T.let(nil, T.nilable(Proc)) @arg_spec = T.let(nil, T.untyped) @arity = T.let(nil, T.nilable(Integer)) @emit = T.let(nil, T.nilable(IntrinsicEmit)) - @return_spec = T.let(nil, T.untyped) @return_def = T.let(FunctionReturn.fixed(Type.new(:Void)), FunctionReturn) end + # True iff the return is a static Fixed Type (not receiver-parametric + # or host-inferred). Callers that only honor a statically-declared + # owned return (e.g. the MIR HPT_LEAK check) gate on this. + sig { returns(T::Boolean) } + def fixed_return? = @return_def.kind == FunctionReturn::Kind::Fixed + sig { returns(FunctionSignature) } def dup FunctionSignature.new( @@ -138,12 +133,10 @@ def dup s.return_strategy = @return_strategy s.stack_tier = @stack_tier s.requires = @requires - s.return_resolver = @return_resolver s.arg_validator = @arg_validator s.arg_spec = @arg_spec s.arity = @arity s.emit = @emit - s.return_spec = @return_spec s.return_def = @return_def end end diff --git a/src/annotator-helpers/intrinsic_registry.rb b/src/annotator-helpers/intrinsic_registry.rb index 193133804..9294d2332 100644 --- a/src/annotator-helpers/intrinsic_registry.rb +++ b/src/annotator-helpers/intrinsic_registry.rb @@ -69,33 +69,66 @@ def nested_emit(v, registries) build_emit(v, registries) end - # Symbol/String type-name -> Type. Inference/macro directives - # (:infer_*, :macro_*) are not type names -> polymorphic placeholder - # (the real resolution is a later, consumer-side concern). - # Best-effort STATIC view of the return spec. The verbatim spec is - # kept on fs.return_spec for the full host dispatch. - def to_return_type(v) - return Type.new(:Void) if v.nil? - return Type.new(:Any) if v.is_a?(Proc) - if v.is_a?(Hash) && v[:type] - return Type.new(v[:type], sync: v[:sync], ownership: v[:ownership]) + # Best-effort STATIC view of the return, derived from the typed + # FunctionReturn (single source of truth). Fixed -> its concrete + # Type; receiver-parametric / host-inferred -> polymorphic + # placeholder (the real resolution is consumer-side via + # return_def.resolve, gated by fixed_return?). + def to_return_type(rdef) + if rdef.kind == FunctionReturn::Kind::Fixed + rdef.fixed || Type.new(:Void) + else + Type.new(:Any) + end + end + + # Declarative receiver-parametric return directives (replace the old + # `return_type: ->(recv){...}` Procs). Mapped to FunctionReturn + # variants whose Type is computed from the receiver at resolve time. + RETURN_VARIANTS = { + r_element_of: :ElementOf, + r_optional_element: :OptionalOfElement, + r_id_element: :IdOfElement, + r_optional_value: :OptionalOfValue, + r_value_list: :ValueList, + r_key_list: :KeyList + }.freeze + + # Registry return descriptor -> FunctionReturn (strongly typed, + # non-nil). No Proc, no Hash, no bare nil escape: every form maps to + # Fixed(Type) | a receiver-parametric variant | Infer(host method). + def to_return_def(v) + return FunctionReturn.fixed(Type.new(:Void)) if v.nil? + return FunctionReturn.fixed(v) if v.is_a?(Type) + if v.is_a?(Hash) + return FunctionReturn.fixed( + v[:type] ? Type.new(v[:type], sync: v[:sync], ownership: v[:ownership]) + : Type.new(:Any) + ) + end + if v.is_a?(Proc) + raise "IntrinsicRegistry: Proc return descriptor is not allowed; " \ + "use a declarative directive (r_* variant or infer_* host method)" + end + if (kind = RETURN_VARIANTS[v]) + return FunctionReturn.variant(kind) end - return Type.new(:Any) if v.is_a?(Hash) s = v.to_s - return Type.new(:Any) if s.start_with?("infer_", "macro_") - v.is_a?(Type) ? v : Type.new(v) + return FunctionReturn.infer(v.to_sym) if s.start_with?("infer_", "macro_") + + FunctionReturn.fixed(Type.new(v)) end def convert_entry(_name, h, registries) - ret = h.key?(:return_type) ? h[:return_type] : h[:return] + ret = h.key?(:return_type) ? h[:return_type] : h[:return] + rdef = to_return_def(ret) fs = FunctionSignature.new( params: [], - return_type: to_return_type(ret), + return_type: to_return_type(rdef), intrinsic: true ) - fs.return_spec = ret - fs.return_resolver = ret if ret.is_a?(Proc) + fs.return_def = rdef fs.arg_validator = h[:validate] if h[:validate].is_a?(Proc) fs.arg_spec = h[:args] fs.arity = h[:arity] diff --git a/src/annotator-helpers/method_analysis.rb b/src/annotator-helpers/method_analysis.rb index d3b21f879..26ae1e84d 100644 --- a/src/annotator-helpers/method_analysis.rb +++ b/src/annotator-helpers/method_analysis.rb @@ -86,7 +86,7 @@ def resolve_typed_method(node, obj_type, registry, tag_field, type_label) # Set tag and return type node.send(:"#{tag_field}=", node.name.to_sym) - node.full_type = defn.return_resolver.call(obj_type) + node.full_type = defn.return_def.resolve(obj_type, [], self) # Resolve zig pattern -- pick variant based on receiver type. # Sharded takes priority over numeric: PartitionedNumericMap shares the diff --git a/src/annotator.rb b/src/annotator.rb index 720ca86cd..0026021c7 100644 --- a/src/annotator.rb +++ b/src/annotator.rb @@ -2367,7 +2367,7 @@ def visit_StaticCall(node) end node.zig_pattern = method_def.emit&.zig - node.full_type = method_def.return_spec + node.full_type = method_def.return_def.resolve(nil, node.args, self) node.matched_stdlib_def = method_def node.stdlib_allocates = true if method_def.emit&.allocates node.mutates_receiver = true if method_def.emit&.mutates_receiver @@ -2557,17 +2557,7 @@ def visit_IntrinsicFunc(node, args) # 3. Resolve return type (may be dynamic via method call). # Dynamic resolver methods are named `infer_*` to avoid collisions with # Ruby Kernel conversion methods (Integer, String, Array, etc.). - ret = matched_def.return_spec - if ret.is_a?(Hash) && ret[:type] - # Structured return: { type: :String, sync: :raw } etc. — preserves capabilities. - node.full_type = Type.new(ret[:type], sync: ret[:sync], ownership: ret[:ownership]) - elsif ret.is_a?(Symbol) && ret.to_s.start_with?("infer_") && respond_to?(ret, true) - node.full_type = send(ret, args, node) - elsif ret.respond_to?(:call) - node.full_type = ret.call(args.map(&:resolved_type), node) - else - node.full_type = ret - end + node.full_type = matched_def.return_def.resolve(nil, args, self) # 4. Store Zig pattern and stdlib metadata for transpiler node.zig_pattern = matched_def.emit&.zig @@ -3313,7 +3303,8 @@ def visit_GetIndex(node) if op # Registry-driven: type and ownership from INDEX_OPS - node.full_type = op[:return_type].call(target_type_info) + node.full_type = IntrinsicRegistry.to_return_def(op[:return_type]) + .resolve(target_type_info, [], self) node.container_borrow = true if op[:container_borrow] # Validate key types for maps @@ -4298,7 +4289,9 @@ def visit_CopyNode(node) end # Infer return type for list.remove(i) — returns the element type. - sig { params(args: T::Array[T.untyped], node: AST::MethodCall).returns(Symbol) } + # `node` is unused (the receiver is args.first); nilable because + # FunctionReturn#resolve dispatches without a call node. + sig { params(args: T::Array[T.untyped], node: T.untyped).returns(Symbol) } def infer_element_type(args, node) receiver = args.first ti = receiver&.type_info @@ -4314,6 +4307,25 @@ def infer_optional_element_type(args, node) :"?#{elem}" end + # Infer return type for stream/list `.toList()` — an owned heap list + # of the receiver's element type (unwrapping stream/promise tenses). + sig { params(args: T::Array[T.untyped], node: T.untyped).returns(Type) } + def infer_to_list(args, node) + recv_t = Type.new(args[0].resolved_type) + elem_t = if recv_t.dynamic_stream? || recv_t.promise_list? + recv_t.tense_type.element_type + elsif recv_t.bounded_stream? + recv_t.stream_element_type + elsif recv_t.inf_stream? + recv_t.inf_stream_element_type + elsif recv_t.open_stream? + recv_t.open_stream_element_type + else + recv_t.element_type + end + Type.new(:"#{elem_t.resolved}[]", collection: :list, location: :heap) + end + sig { params(node: AST::LinkNode).returns(T.nilable(Type)) } def visit_LinkNode(node) visit(node.value) diff --git a/src/ast/std_lib.rb b/src/ast/std_lib.rb index ae7bffde3..e88b71a08 100644 --- a/src/ast/std_lib.rb +++ b/src/ast/std_lib.rb @@ -209,21 +209,7 @@ "toList" => [ { args: [:"Any[]"], - return: lambda { |args, _node| - recv_t = Type.new(args[0]) - elem_t = if recv_t.dynamic_stream? || recv_t.promise_list? - recv_t.tense_type.element_type - elsif recv_t.bounded_stream? - recv_t.stream_element_type - elsif recv_t.inf_stream? - recv_t.inf_stream_element_type - elsif recv_t.open_stream? - recv_t.open_stream_element_type - else - recv_t.element_type - end - Type.new(:"#{elem_t.resolved}[]", collection: :list, location: :heap) - }, + return: :infer_to_list, zig: "try ({0}).toList({rt}.heapAlloc())", bc: true, bc_op: :to_list, allocates: true, @@ -1007,10 +993,12 @@ # ============================================================================ # Method Registry — type-specific method definitions for Pool and HashMap # ============================================================================ -# Each entry: { arity: N, validate: lambda, return_type: lambda, tag: symbol } +# Each entry: { arity: N, validate: lambda, return_type: , tag: symbol } # arity: expected arg count (-1 = any) # validate: lambda(node, args, obj_type, error_fn) — type-check args -# return_type: lambda(obj_type) — compute return type from receiver type +# return_type: declarative return directive (a type Symbol/Hash, an +# r_* receiver-parametric variant, or an infer_* host +# method) -> FunctionReturn via IntrinsicRegistry # tag: symbol to set on the node (pool_method / map_method) POOL_METHODS = T.let({ @@ -1028,14 +1016,14 @@ error_fn.call(node, "Pool.insert: argument type #{arg_type} does not match pool element type #{elem.resolved}") end }, - return_type: ->(obj_type) { Type.new(:"Id<#{obj_type.element_type.resolved}>") }, + return_type: :r_id_element, is_method: true, }, "get" => { arity: 1, tag: :pool_method, bc: true, zig: "{0}.get({1})", - return_type: ->(obj_type) { Type.new(:"?#{obj_type.element_type.resolved}") }, + return_type: :r_optional_element, borrows: :all, # returns borrowed pointer into pool storage, is_method: true, }, @@ -1044,7 +1032,7 @@ bc: true, zig: "{0}.remove({1})", mutates_receiver: true, - return_type: ->(_) { :Void }, + return_type: :Void, borrows: :all, # pool frees the slot internally, is_method: true, }, @@ -1052,7 +1040,7 @@ arity: 0, tag: :pool_method, bc: true, zig: "{0}.length()", - return_type: ->(_) { Type.new(:Int64) }, + return_type: :Int64, borrows: :all, is_method: true, }, @@ -1060,7 +1048,7 @@ arity: 1, tag: :pool_method, bc: true, zig: "({0}.get({1}) != null)", - return_type: ->(_) { :Bool }, + return_type: :Bool, borrows: :all, is_method: true, }, @@ -1068,7 +1056,7 @@ arity: 0, tag: :pool_method, bc: true, zig: "({0}.length() == 0)", - return_type: ->(_) { :Bool }, + return_type: :Bool, borrows: :all, is_method: true, }, @@ -1076,7 +1064,7 @@ arity: 0, tag: :pool_method, bc: true, zig: "({0}.length() > 0)", - return_type: ->(_) { :Bool }, + return_type: :Bool, borrows: :all, is_method: true, }, @@ -1098,14 +1086,14 @@ error_fn.call(node, "Set.insert: argument type #{arg_type} does not match set element type #{elem.resolved}") end }, - return_type: ->(_) { :Void }, + return_type: :Void, is_method: true, }, "contains?" => { arity: 1, tag: :set_method, zig: "{0}.contains({1})", bc: true, - return_type: ->(_) { :Bool }, + return_type: :Bool, borrows: :all, is_method: true, }, @@ -1115,7 +1103,7 @@ bc: true, alloc: :heap, mutates_receiver: true, - return_type: ->(_) { :Void }, + return_type: :Void, borrows: :all, # set frees the element internally, is_method: true, }, @@ -1123,7 +1111,7 @@ arity: 0, tag: :set_method, zig: "{0}.length()", bc: true, - return_type: ->(_) { Type.new(:Int64) }, + return_type: :Int64, borrows: :all, is_method: true, }, @@ -1131,7 +1119,7 @@ arity: 0, tag: :set_method, zig: "({0}.length() == 0)", bc: true, - return_type: ->(_) { :Bool }, + return_type: :Bool, borrows: :all, is_method: true, }, @@ -1139,7 +1127,7 @@ arity: 0, tag: :set_method, zig: "({0}.length() > 0)", bc: true, - return_type: ->(_) { :Bool }, + return_type: :Bool, borrows: :all, is_method: true, }, @@ -1163,7 +1151,7 @@ error_fn.call(node, "HashMap.put: key must be a String, got #{args[0].resolved_type}") unless key_type.string? end }, - return_type: ->(_) { :Void }, + return_type: :Void, is_method: true, }, "delete" => { @@ -1181,7 +1169,7 @@ error_fn.call(node, "HashMap.delete: key must be a String, got #{args[0].resolved_type}") unless arg_type.string? end }, - return_type: ->(_) { :Void }, + return_type: :Void, borrows: :all, # map frees key+value internally, is_method: true, }, @@ -1198,7 +1186,7 @@ error_fn.call(node, "HashMap.contains?: key must be a String, got #{args[0].resolved_type}") unless arg_type.string? end }, - return_type: ->(_) { :Bool }, + return_type: :Bool, borrows: :all, is_method: true, }, @@ -1207,7 +1195,7 @@ zig: "{0}.count()", bc: true, numeric_zig: "CheatLib.numericMapCount({key_zig}, {val_zig}, {0})", - return_type: ->(_) { Type.new(:Int64) }, + return_type: :Int64, borrows: :all, is_method: true, }, @@ -1216,7 +1204,7 @@ zig: "{0}.count()", bc: true, numeric_zig: "CheatLib.numericMapCount({key_zig}, {val_zig}, {0})", - return_type: ->(_) { Type.new(:Int64) }, + return_type: :Int64, borrows: :all, is_method: true, }, @@ -1237,7 +1225,7 @@ # type-mismatch in CheatLib.cleanup at the binding's defer site. # For string-keyed HashMap, key_type defaults to String; # numeric maps return e.g. `Int64[]@list`. - return_type: ->(obj_type) { :"#{obj_type.key_type.resolved}[]@list" }, + return_type: :r_key_list, borrows: :all, # borrows map; returns new owned list, is_method: true, }, @@ -1249,7 +1237,7 @@ zig: "({0}.count() == 0)", bc: true, numeric_zig: "(CheatLib.numericMapCount({key_zig}, {val_zig}, {0}) == 0)", - return_type: ->(_) { :Bool }, + return_type: :Bool, borrows: :all, is_method: true, }, @@ -1258,7 +1246,7 @@ zig: "({0}.count() > 0)", bc: true, numeric_zig: "(CheatLib.numericMapCount({key_zig}, {val_zig}, {0}) > 0)", - return_type: ->(_) { :Bool }, + return_type: :Bool, borrows: :all, is_method: true, }, @@ -1272,7 +1260,7 @@ numeric_zig: "try CheatLib.numericMapValues({key_zig}, {val_zig}, {alloc}, {0})", # See the matching note on `keys`: this allocates an owned list, # so the declared type must be `T[]@list`, not the bare slice. - return_type: ->(obj_type) { :"#{obj_type.value_type.resolved}[]@list" }, + return_type: :r_value_list, borrows: :all, # borrows map; returns new owned list, is_method: true, }, @@ -1284,7 +1272,7 @@ # Keyed by container kind (:string_map, :numeric_map, :array, :pool, :set_collection). # Each entry has :get and/or :set with: # zig: Zig pattern string ({target}, {index}, {value}, {alloc}, {key_alloc}, etc.) -# return_type: lambda(container_type) -> return type for get +# return_type: declarative return directive (r_* variant / type) for get # container_borrow: true if get returns a borrowed view (no cleanup) # takes_value: true if set takes ownership of the value # allocates: true if set requires an allocator @@ -1304,7 +1292,7 @@ get: { zig: "{target}.get({index})", shard_direct_zig: "{target}.getDirect({shard_idx}, {shard_key})", - return_type: ->(ct) { :"?#{ct.value_type.resolved}" }, + return_type: :r_optional_value, container_borrow: true, bc: true, bc_op: :map_get, }, @@ -1326,7 +1314,7 @@ zig: "CheatLib.numericMapGet({key_zig}, {val_zig}, {target}, {index})", sharded_zig: "{target}.get({index})", shard_direct_zig: "{target}.getDirect({shard_idx}, {shard_key})", - return_type: ->(ct) { :"?#{ct.value_type.resolved}" }, + return_type: :r_optional_value, container_borrow: true, bc: true, bc_op: :map_get, }, @@ -1346,7 +1334,7 @@ array: { get: { zig: "CheatLib.getAt({target}, {index})", - return_type: ->(ct) { ct.element_type }, + return_type: :r_element_of, container_borrow: true, }, set: { @@ -1358,7 +1346,7 @@ list: { get: { zig: "CheatLib.getAt({target}, {index})", - return_type: ->(ct) { ct.element_type }, + return_type: :r_element_of, container_borrow: true, }, set: { @@ -1370,7 +1358,7 @@ pool: { get: { zig: "{target}.get({index})", - return_type: ->(ct) { :"?#{ct.element_type.resolved}" }, + return_type: :r_optional_element, container_borrow: false, }, set: { @@ -1384,7 +1372,7 @@ set_collection: { get: { zig: "if ({target}.contains({index})) {index} else null", - return_type: ->(ct) { Type.new(:"?#{ct.element_type.resolved}") }, + return_type: :r_optional_element, container_borrow: true, }, set: { @@ -1399,7 +1387,7 @@ get: { # O(1) byte access on String@raw. No allocation. builtin: :charAt, - return_type: ->(_t) { Type.new(:String, sync: :raw) }, + return_type: {type: :String, sync: :raw}, container_borrow: true, }, # No :set — strings are immutable. @@ -1408,7 +1396,7 @@ get: { # Byte indexing on String@symbol — same as @raw, returns @symbol slice. builtin: :charAt, - return_type: ->(_t) { Type.new(:String, sync: :symbol) }, + return_type: {type: :String, sync: :symbol}, container_borrow: true, }, # No :set — symbols are immutable. diff --git a/src/mir/mir_checker.rb b/src/mir/mir_checker.rb index 28567db67..5b2b6272a 100644 --- a/src/mir/mir_checker.rb +++ b/src/mir/mir_checker.rb @@ -174,7 +174,7 @@ def owned_return_init?(init) # allocates/borrows, handled elsewhere. Only a static return # type counts here (matches pre-FS behavior, which read only # the static `:return` key). - return false if init.stdlib_def.return_resolver + return false unless init.stdlib_def.fixed_return? ret = init.stdlib_def.return_type return !ret.void? end @@ -398,7 +398,7 @@ def scan_expr_for_hpt_leak!(node, leaks) "heap-returning try/catch result not bound to variable (leak)") end if (node.is_a?(MIR::InlineZig) || node.is_a?(MIR::RawZig)) && stdlib_owned_return?(node) && - !node.stdlib_def.return_resolver + node.stdlib_def.fixed_return? ret = node.stdlib_def.return_type unless ret.void? label = node.is_a?(MIR::RawZig) ? "RawZig block" : "stdlib call" From e8f4fe4bc4228d892cc243f1ce5bc3e127e83e63 Mon Sep 17 00:00:00 2001 From: Brian Yahn Date: Sun, 17 May 2026 00:39:18 +0000 Subject: [PATCH 38/45] Launder AST::FunctionDef#return_type to Type|nil; collapse 5 guards (Phase 3) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The Symbol pollution in a function's declared/inferred return is killed at a single seam. AST::FunctionDef now coerces return_type to Type at BOTH construction (positional Struct init from the parser and synthetic builders like compiler_frontend's test wrapper / union method synthesis) and post-parse assignment (return inference, auto-infer). nil is preserved — it is the genuine "undeclared" signal that implicit-return inference consumes (node.return_type.nil? / declared_return || :Any), which the user authorized keeping. With return_type now strictly Type|nil (never Symbol), five defensive `is_a?(Type)` Symbol/Type discriminators are dead and collapse to direct use / nil-checks: - annotator.rb:674 validate guard -> `if node.return_type` - annotator.rb:814 -> `node.return_type || Type.new(:Void)` - mir_lowering:1284 -> `node.return_type || Type.new(:Void)` - mir_lowering:1495 -> `node.return_type` (Type|nil already) - mir_lowering:2182 -> `return_type` (sig already guarantees Type; dead defensive branch on a typed param) annotator.rb:263 (program_has_auto?) is intentionally left: it is a duck-typed walk over arbitrary program nodes, not a return_type source-contract guard. Gates: prspec 4773/0, transpile-tests 554/554 0 leaks, fuzz 141/141. Co-Authored-By: Claude Opus 4.7 --- src/annotator.rb | 4 ++-- src/ast/ast.rb | 18 ++++++++++++++++++ src/mir/mir_lowering.rb | 6 +++--- 3 files changed, 23 insertions(+), 5 deletions(-) diff --git a/src/annotator.rb b/src/annotator.rb index 0026021c7..a3ad455b0 100644 --- a/src/annotator.rb +++ b/src/annotator.rb @@ -671,7 +671,7 @@ def visit_FunctionDef(node) # Make type params visible during type annotation validation node.params.each { |p| validate_type_annotation!(node, p[:type], is_param: true) if p[:type].is_a?(Type) } - validate_type_annotation!(node, node.return_type) if node.return_type.is_a?(Type) + validate_type_annotation!(node, node.return_type) if node.return_type # 3. Pre-declaration (so the function can be recursive) signature = FunctionSignature.new( @@ -811,7 +811,7 @@ def visit_FunctionDef(node) # CATCH wrappers heap-dupe all string returns (both success and catch paths). # A fallible String return is an error union, so unwrap the payload # before classifying string-return ownership. - ret_type = node.return_type.is_a?(Type) ? node.return_type : Type.new(node.return_type || :Void) + ret_type = node.return_type || Type.new(:Void) bare_ret = if ret_type.respond_to?(:error_union?) && ret_type.error_union? && ret_type.respond_to?(:payload_type) ret_type.payload_type || ret_type diff --git a/src/ast/ast.rb b/src/ast/ast.rb index b18ae2e46..425434dc6 100644 --- a/src/ast/ast.rb +++ b/src/ast/ast.rb @@ -559,6 +559,24 @@ def metatype include HasBodies sig { returns(T::Array[T::Array[T.untyped]]) } def child_bodies = [body].compact + + # Seam: a function's declared/inferred return is always a Type + # (or nil when undeclared — the implicit-return signal that + # inference consumes). Coerced at BOTH construction (positional + # Struct init from parser/synthetic builders) and post-parse + # assignment (return inference, auto-infer) so no reader needs + # an `is_a?(Type)` Symbol/Type discriminator. + def initialize(*) + super + rt = self[:return_type] + self[:return_type] = Type.new(rt) unless rt.nil? || rt.is_a?(Type) + end + + sig { params(val: T.untyped).void } + def return_type=(val) + self[:return_type] = val.nil? || val.is_a?(Type) ? val : Type.new(val) + end + attr_accessor :type_params # Array of type param name strings, e.g. ["T", "K"], or nil # True when the user wrote RETURNS explicitly; fallible-signature checks # only enforce on user-authored return types. diff --git a/src/mir/mir_lowering.rb b/src/mir/mir_lowering.rb index d9171d848..ff2dde7b8 100644 --- a/src/mir/mir_lowering.rb +++ b/src/mir/mir_lowering.rb @@ -1281,7 +1281,7 @@ def lower_function_def(node) node.uses_frame end uses_frame_or_alloc = has_frame_bindings || node.uses_alloc - ret_type_obj = node.return_type.is_a?(Type) ? node.return_type : Type.new(node.return_type || :Void) + ret_type_obj = node.return_type || Type.new(:Void) # Unwrap `!T` so value-type and string-return classification sees the # payload; otherwise frame save/restore is skipped for error-union returns. bare_ret = if ret_type_obj.respond_to?(:error_union?) && ret_type_obj.error_union? && @@ -1492,7 +1492,7 @@ def build_post_outer_fn(node, params_mir, return_type_str, fn_needs_rt, vis, com # `anyerror!T`, and any whitespace variants the formatter might # emit. Type#error_union? / Type#void? / Type#payload_type are # the single source of truth. - rt_obj = node.return_type.is_a?(Type) ? node.return_type : (node.return_type ? Type.new(node.return_type) : nil) + rt_obj = node.return_type is_error_union = !!(rt_obj && rt_obj.error_union?) payload_type = is_error_union ? rt_obj.payload_type : rt_obj is_void = !!(payload_type && payload_type.respond_to?(:void?) && payload_type.void?) @@ -2179,7 +2179,7 @@ def extern_call_args_zig(argc, alloc_kind) sig { params(id: Integer, prefix: String, args_tuple_name: String, frame_name: String, arg_codes: T::Array[T.untyped], arg_field_types: NilClass, arg_tuple: String, alloc_kind: T.nilable(Symbol), return_type: Type, call_zig: String, receiver_field: T.nilable(String)).returns(MIR::InlineZig) } def build_extern_trampoline_common(id:, prefix:, args_tuple_name:, frame_name:, arg_codes:, arg_field_types:, arg_tuple:, alloc_kind:, return_type:, call_zig:, receiver_field:) - ret_t = return_type.is_a?(Type) ? return_type : Type.new(return_type || :Void) + ret_t = return_type can_fail = ret_t.error_union? payload_t = can_fail ? ret_t.payload_type : ret_t returns_void = payload_t.resolved == :Void From 6889e282b7768d2fa700971d36aaa7642560b7e2 Mon Sep 17 00:00:00 2001 From: Brian Yahn Date: Sun, 17 May 2026 00:57:26 +0000 Subject: [PATCH 39/45] FunctionSignature#return_type is a non-nil Type seam (Phase 4) Kills the Symbol/nil pollution in a function signature's return. The attr_accessor is replaced by a coercing seam: a sig-typed `attr_reader :return_type` returning Type and a `return_type=` writer that maps nil -> Type.new(:Void) and Symbol/String -> Type.new, so the field is ALWAYS a non-nil Type. initialize routes through the writer; the constructor sig tightens from `return_type: T.untyped` to `return_type: T.nilable(Type)` (no Symbol in the T.any). Producers that injected the `:Any` Symbol now pass a Type: - from_function_def -> fn.return_type || Type.new(:Any) - annotator extern sig -> node.return_type || Type.new(:Any) - annotator pre-decl sig -> node.return_type || Type.new(:Any) - build_lambda_signature -> Type.new(return_type) (annotator's `declared_return`/FunctionContext `== :Any` semantics are untouched; only the FunctionSignature feed is coerced.) With return_type now guaranteed non-nil Type, the dead Symbol/nil fallbacks and discriminators collapse: - mir_lowering lower_lambda -> sig.return_type.zig_type - annotator extern-method -> method_sig.return_type - union.rb method-return chk -> sig.return_type.resolved - type.rb accepts_fn_type? -> @raw.return_type.accepts?(...) (-2 is_a?(Type) guards) - type.rb fn-type zig emit -> @raw.return_type.zig_type (-1 is_a?(Type) guard) Spec migration: 3 mir_lowering_spec FunctionSignature.new sites that relied on the old loose contract now pass Type.new(:Sym) (the tightened, correct contract). is_a?(Type) in src/: 307 -> 304. Gates: prspec 4773/0, transpile-tests 554/554 0 leaks, fuzz 141/141. Co-Authored-By: Claude Opus 4.7 --- spec/mir_lowering_spec.rb | 6 ++--- src/annotator-helpers/function_analysis.rb | 2 +- src/annotator-helpers/function_signature.rb | 25 ++++++++++++++++----- src/annotator-helpers/union.rb | 2 +- src/annotator.rb | 8 +++---- src/ast/type.rb | 11 ++++----- src/mir/mir_lowering.rb | 3 +-- 7 files changed, 34 insertions(+), 23 deletions(-) diff --git a/spec/mir_lowering_spec.rb b/spec/mir_lowering_spec.rb index d4c94883d..d0092f8b1 100644 --- a/spec/mir_lowering_spec.rb +++ b/spec/mir_lowering_spec.rb @@ -1911,7 +1911,7 @@ def make_fn(name, params: [], return_type: :Void, body: [], visibility: nil, node.full_type = :Void sig = FunctionSignature.new( params: [{ name: "count", type: Type.new(:Int64), mutable: true }], - return_type: :Void + return_type: Type.new(:Void) ) result = lowering(fn_sigs: { "bump" => sig }).lower(node) @@ -1927,7 +1927,7 @@ def make_fn(name, params: [], return_type: :Void, body: [], visibility: nil, node = AST::FuncCall.new(tok, "identity", [arg]) node.full_type = :Int64 node.generic_type_args = [:Int64] - sig = FunctionSignature.new(params: [{ name: "x", type: Type.new(:Int64) }], return_type: :Int64) + sig = FunctionSignature.new(params: [{ name: "x", type: Type.new(:Int64) }], return_type: Type.new(:Int64)) sig.needs_rt = true result = lowering(fn_sigs: { "identity" => sig }).lower(node) @@ -2003,7 +2003,7 @@ def make_fn(name, params: [], return_type: :Void, body: [], visibility: nil, body = make_lit(:NUMBER, 42, full_type: :Int64) body.coerced_type = :Int64 node = AST::LambdaLit.new(tok, [], nil, body, nil, nil) - node.full_type = FunctionSignature.new(params: [], return_type: :Int64) + node.full_type = FunctionSignature.new(params: [], return_type: Type.new(:Int64)) result = lowering.lower(node) expect(result).to be_a(MIR::LambdaExpr) zig = emit(result) diff --git a/src/annotator-helpers/function_analysis.rb b/src/annotator-helpers/function_analysis.rb index b1ce40a66..57fbcb72f 100644 --- a/src/annotator-helpers/function_analysis.rb +++ b/src/annotator-helpers/function_analysis.rb @@ -78,7 +78,7 @@ def build_lambda_signature(params, return_type) } end - FunctionSignature.new(params: normalized_params, return_type: return_type) + FunctionSignature.new(params: normalized_params, return_type: Type.new(return_type)) end # Resolve a function call: look up the function, dispatch based on type diff --git a/src/annotator-helpers/function_signature.rb b/src/annotator-helpers/function_signature.rb index a7aca241a..f01f71cf0 100644 --- a/src/annotator-helpers/function_signature.rb +++ b/src/annotator-helpers/function_signature.rb @@ -14,7 +14,22 @@ class FunctionSignature # Static signature fields (set at creation) attr_reader :params, :visibility, :type_params, :reentrant - attr_accessor :return_type, :return_lifetime, :return_strategy + attr_accessor :return_lifetime, :return_strategy + + # Seam: a function signature's return is ALWAYS a Type (Void for + # "no value"). Coerced here so callers may pass nil/Symbol during + # construction or late return-inference assignment without any + # reader ever needing a Symbol/Type/nil discriminator. + sig { returns(Type) } + attr_reader :return_type + + sig { params(val: T.untyped).void } + def return_type=(val) + @return_type = T.let( + val.nil? ? Type.new(:Void) : (val.is_a?(Type) ? val : Type.new(val)), + Type + ) + end # EXTERN function fields attr_accessor :extern, :module_alias, :extern_effects @@ -52,7 +67,7 @@ def self.from_function_def(fn) else FunctionSignature.new( params: fn.params || [], - return_type: fn.return_type || :Any, + return_type: fn.return_type || Type.new(:Any), return_lifetime: fn.return_lifetime, visibility: fn.visibility, type_params: fn.type_params, @@ -75,14 +90,14 @@ def self.sync_from_function_def!(sig, fn) sig end - sig { params(params: T::Array[T::Hash[Symbol, T.untyped]], return_type: T.untyped, return_lifetime: T.untyped, visibility: T.nilable(Symbol), type_params: T.nilable(T::Array[Symbol]), reentrant: T::Boolean, extern: T::Boolean, module_alias: T.nilable(String), extern_effects: T.nilable(T::Hash[Symbol, Symbol]), fn_type_params: T.nilable(T::Array[Symbol]), owner_type: T.nilable(String), owner_type_params: T.nilable(T::Array[T.untyped]), intrinsic: T::Boolean, zig_pattern: T.nilable(String)).void } - def initialize(params:, return_type:, return_lifetime: nil, visibility: nil, + sig { params(params: T::Array[T::Hash[Symbol, T.untyped]], return_type: T.nilable(Type), return_lifetime: T.untyped, visibility: T.nilable(Symbol), type_params: T.nilable(T::Array[Symbol]), reentrant: T::Boolean, extern: T::Boolean, module_alias: T.nilable(String), extern_effects: T.nilable(T::Hash[Symbol, Symbol]), fn_type_params: T.nilable(T::Array[Symbol]), owner_type: T.nilable(String), owner_type_params: T.nilable(T::Array[T.untyped]), intrinsic: T::Boolean, zig_pattern: T.nilable(String)).void } + def initialize(params:, return_type: nil, return_lifetime: nil, visibility: nil, type_params: nil, reentrant: false, extern: false, module_alias: nil, extern_effects: nil, fn_type_params: nil, owner_type: nil, owner_type_params: nil, intrinsic: false, zig_pattern: nil) @params = params - @return_type = return_type + self.return_type = return_type @return_lifetime = return_lifetime @visibility = visibility @type_params = type_params diff --git a/src/annotator-helpers/union.rb b/src/annotator-helpers/union.rb index 69e67e4c4..9d45a289b 100644 --- a/src/annotator-helpers/union.rb +++ b/src/annotator-helpers/union.rb @@ -81,7 +81,7 @@ def validate_union_methods!(node) # Return type check if req[:return_type] req_ret = to_type(req[:return_type]).resolved - sig_ret = to_type(sig.return_type).resolved + sig_ret = sig.return_type.resolved unless req_ret == sig_ret || req_ret == :Any || sig_ret == :Any error!(req_tok, :UNION_METHOD_RETURN_TYPE, union: union_name, method: fn_name, expected: req_ret, fn: fn_name, got: sig_ret) end diff --git a/src/annotator.rb b/src/annotator.rb index a3ad455b0..8142941ef 100644 --- a/src/annotator.rb +++ b/src/annotator.rb @@ -547,7 +547,7 @@ def visit_ExternFnDecl(node) mutable: p[:mutable] || false, comptime: p[:comptime] || false }}, - return_type: node.return_type || :Any, + return_type: node.return_type || Type.new(:Any), visibility: :pub, extern: true, module_alias: node.from_module, @@ -606,7 +606,7 @@ def pre_register_function(node) takes: p[:takes] || false, sync: (p[:type].is_a?(Type) && p[:type].any_sync?) ? p[:type].sync : nil }}, - return_type: (node.return_type || :Any), + return_type: node.return_type || Type.new(:Any), return_lifetime: get_lifetime_path(node), visibility: node.visibility, reentrant: node.reentrant == :reentrant @@ -680,7 +680,7 @@ def visit_FunctionDef(node) default: p[:default], mutable: p[:mutable], takes: p[:takes], sync: (p[:type].is_a?(Type) && p[:type].any_sync?) ? p[:type].sync : nil }}, - return_type: declared_return, return_lifetime: lifetime_paths, + return_type: node.return_type || Type.new(:Any), return_lifetime: lifetime_paths, visibility: node.visibility, type_params: fn_type_params.any? ? fn_type_params : nil, reentrant: node.reentrant == :reentrant @@ -2458,7 +2458,7 @@ def visit_MethodCall(node) node.extern_call = true node.extern_effects = method_sig.extern_effects if method_sig.extern_effects node.instance_variable_set(:@extern_method, true) - node.full_type = method_sig.return_type || :Void + node.full_type = method_sig.return_type record_effect(EffectTracker::EXTERN) # Track allocator usage for EFFECTS :alloc methods. alloc_kind = method_sig.extern_effects&.dig(:alloc) diff --git a/src/ast/type.rb b/src/ast/type.rb index d02b44135..04a31d727 100644 --- a/src/ast/type.rb +++ b/src/ast/type.rb @@ -1748,11 +1748,9 @@ def accepts_fn_type?(other_type) other_params = other_raw.params || [] return false unless self_params.length == other_params.length - self_ret = @raw.return_type - other_ret = other_raw.return_type - self_ret_t = self_ret.is_a?(Type) ? self_ret : Type.new(self_ret || :Any) - other_ret_t = other_ret.is_a?(Type) ? other_ret : Type.new(other_ret || :Any) - return false unless self_ret_t.accepts?(other_ret_t) + # @raw / other_raw are FunctionSignature (fn_type? gate); their + # return_type is a non-nil Type by the FunctionSignature seam. + return false unless @raw.return_type.accepts?(other_raw.return_type) self_params.zip(other_params).each do |sp, op| sp_t = sp[:type].is_a?(Type) ? sp[:type] : Type.new(sp[:type] || :Any) @@ -2099,8 +2097,7 @@ def compute_zig_type(is_param: false, is_field: false) t = p[:type] t.is_a?(Type) ? t.zig_type(is_param: true) : Type.new(t).zig_type(is_param: true) end - ret = @raw.return_type - ret_zig = ret.is_a?(Type) ? ret.zig_type : Type.new(ret).zig_type + ret_zig = @raw.return_type.zig_type all_params = ["*Runtime"] + param_types_zig ret_str = ret_zig.start_with?("!") ? ret_zig : "anyerror!#{ret_zig}" return "*const fn(#{all_params.join(', ')}) #{ret_str}" diff --git a/src/mir/mir_lowering.rb b/src/mir/mir_lowering.rb index ff2dde7b8..6d3a71acf 100644 --- a/src/mir/mir_lowering.rb +++ b/src/mir/mir_lowering.rb @@ -2266,8 +2266,7 @@ def lower_lambda(node) MIR::Param.new(p[:name], type_str, pp) } - ret = sig.return_type || :Void - ret_zig = ret.is_a?(Type) ? ret.zig_type : transpile_type(ret) + ret_zig = sig.return_type.zig_type ret_str = if ret_zig.start_with?("!") || ret_zig.include?("anyerror!") || ret_zig.include?("error{") ret_zig else From f8b906b38c82d30226f872bc26020d271669c998 Mon Sep 17 00:00:00 2001 From: Brian Yahn Date: Sun, 17 May 2026 01:15:26 +0000 Subject: [PATCH 40/45] FunctionContext#return_type is a non-nil Type seam (Phase 5) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Last return-type-adjacent untyped contract. FunctionContext#return_type was T.untyped, fed `declared_return` (Type | :Any Symbol). Replaced with the same coercing seam used for FunctionSignature: sig-typed `attr_reader :return_type` (returns Type) + a `return_type=` writer (nil -> Type.new(:Void), Symbol/String -> Type.new, Type passthrough). initialize routes through the writer; the constructor sig tightens `return_type: T.untyped` -> `T.nilable(Type)` (no Symbol in T.any). The sole producer (visit_FunctionDef) now feeds a Type: `return_type: node.return_type || Type.new(:Any)`. annotator's `declared_return` and its analyze_routine/verify_returns/`== :Any` logic are deliberately untouched — only the FunctionContext feed is coerced (same bounded-blast-radius approach as Phase 4). Return-check readers already tolerate a uniform Type: Type#== does `resolved == other.to_sym`, so the `expected == :Void` / `== :Any` guards in visit_ReturnNode stay correct. The now-dead `expected_type = Type.new(expected) if expected` coercion (a Type.new(Type) copy) collapses to `expected_type = expected`. Gates: prspec 4773/0, transpile-tests 554/554 0 leaks, fuzz 141/141. Co-Authored-By: Claude Opus 4.7 --- src/annotator-helpers/function_context.rb | 23 +++++++++++++++++++---- src/annotator.rb | 4 ++-- 2 files changed, 21 insertions(+), 6 deletions(-) diff --git a/src/annotator-helpers/function_context.rb b/src/annotator-helpers/function_context.rb index 1a1d7801d..66f7cd66e 100644 --- a/src/annotator-helpers/function_context.rb +++ b/src/annotator-helpers/function_context.rb @@ -6,16 +6,31 @@ class FunctionContext extend T::Sig - attr_accessor :name, :return_type, :lifetime, :type_params, + attr_accessor :name, :lifetime, :type_params, :frame_count, :heap_count, :alloc_count, :needs_rt, # explicit "fn body references rt" flag (independent of allocation counters) :loop_depth, :conditional_depth, :returns, :stack_vars_bytes # accumulated bytes for stack-local variables - sig { params(name: String, return_type: T.untyped, lifetime: T.nilable(T::Array[String]), type_params: T::Array[Symbol]).void } - def initialize(name:, return_type:, lifetime: nil, type_params: []) + # Seam: the enclosing function's expected return is ALWAYS a Type + # (Void for "no value"). Coerced here so the producer may pass + # nil/Symbol without any return-check reader needing a Symbol/Type + # discriminator. + sig { returns(Type) } + attr_reader :return_type + + sig { params(val: T.untyped).void } + def return_type=(val) + @return_type = T.let( + val.nil? ? Type.new(:Void) : (val.is_a?(Type) ? val : Type.new(val)), + Type + ) + end + + sig { params(name: String, return_type: T.nilable(Type), lifetime: T.nilable(T::Array[String]), type_params: T::Array[Symbol]).void } + def initialize(name:, return_type: nil, lifetime: nil, type_params: []) @name = name - @return_type = return_type + self.return_type = return_type @lifetime = lifetime @type_params = type_params @frame_count = T.let(0, Integer) diff --git a/src/annotator.rb b/src/annotator.rb index 8142941ef..349bb44c3 100644 --- a/src/annotator.rb +++ b/src/annotator.rb @@ -655,7 +655,7 @@ def visit_FunctionDef(node) lifetime_paths = get_lifetime_paths(node) fn_type_params = (node.type_params || []).map(&:to_sym) @function_context_stack.push(FunctionContext.new( - name: node.name, return_type: declared_return, + name: node.name, return_type: node.return_type || Type.new(:Any), lifetime: lifetime_paths, type_params: fn_type_params )) @@ -2198,7 +2198,7 @@ def visit_ReturnNode(node) # Promote non-identifier literals to heap when the expected return type requires it. unless node.value.is_a?(AST::Identifier) - expected_type = Type.new(expected) if expected + expected_type = expected if expected_type && (expected_type.heap? || expected_type.dynamic?) && node.value.respond_to?(:storage=) && node.value.type_info&.requires_move? From 0711dad7c9b2fa55d60ef604864667b129d4ad07 Mon Sep 17 00:00:00 2001 From: Brian Yahn Date: Sun, 17 May 2026 01:23:06 +0000 Subject: [PATCH 41/45] Launder AST VarDecl/BindExpr #type to Type|nil; collapse 13 guards (#47) A declaration's annotated/inferred type was Symbol|Type|nil, forcing ~13 defensive is_a?(Type) discriminators. Killed at a single seam: VarDecl and BindExpr now coerce #type to Type at BOTH construction (positional Struct init from parser / pipeline_rewriter / fsm / mir_lowering proxy) and post-parse assignment (auto-infer, declared- type propagation). nil is preserved -- it is the genuine "no annotation, inference pending" signal that the auto/constraint passes consume. NOTE: AST::Literal#type is deliberately NOT touched -- it is a lexical token-kind Symbol (:STRING/:BOOLEAN/:NIL/:NUMBER), a correctly-Symbol contract read via `lit.type == :STRING`. Only the declaration-annotation contract (VarDecl/BindExpr) is the union. With #type now strictly Type|nil, 13 guards collapse to &.method / nil-checks: annotator visit_VarDecl/visit_BindExpr fixed?, observable dest target/target_t, future?+observable?, auto? generic_analysis validate_stream_type!, propagate_declared_type!, propagate_collection_metadata! (collection/soa/map) auto_inference walk_for_shape_decls auto? fixable_helpers emit_auto_shape_resolved_finding! annotator.rb:262 (program_has_auto?) intentionally left: a duck-typed recursive walk over arbitrary nodes (incl. Literal whose #type is a Symbol), not a declaration source-contract guard. is_a?(Type) in src/: 305 -> 298. Gates: prspec 4773/0, transpile-tests 554/554 0 leaks, fuzz 141/141. Co-Authored-By: Claude Opus 4.7 --- src/annotator-helpers/auto_inference.rb | 2 +- src/annotator-helpers/fixable_helpers.rb | 2 +- src/annotator-helpers/generic_analysis.rb | 10 ++++---- src/annotator.rb | 12 +++++----- src/ast/ast.rb | 29 +++++++++++++++++++++++ 5 files changed, 42 insertions(+), 13 deletions(-) diff --git a/src/annotator-helpers/auto_inference.rb b/src/annotator-helpers/auto_inference.rb index 9b3ee65a7..79604030f 100644 --- a/src/annotator-helpers/auto_inference.rb +++ b/src/annotator-helpers/auto_inference.rb @@ -569,7 +569,7 @@ def walk_for_shape_decls(node, &block) return if node.nil? case node when AST::BindExpr, AST::VarDecl - yield node if node.type.is_a?(Type) && node.type.auto? + yield node if node.type&.auto? walk_for_shape_decls(node.value, &block) when AST::FunctionDef # Don't recurse into nested function definitions. diff --git a/src/annotator-helpers/fixable_helpers.rb b/src/annotator-helpers/fixable_helpers.rb index f63ade491..0c527d872 100644 --- a/src/annotator-helpers/fixable_helpers.rb +++ b/src/annotator-helpers/fixable_helpers.rb @@ -1481,7 +1481,7 @@ def emit_auto_resolved_finding!(resolution) sig { params(decl: T.untyped, slot: AutoConstraintCollector::Slot).returns(T.untyped) } def emit_auto_shape_resolved_finding!(decl, slot) T.bind(self, SemanticAnnotator) rescue nil - return unless decl && decl.type.is_a?(Type) + return unless decl&.type return if decl.type.auto? # not yet wrapped — skip type_str = auto_type_source_form(decl.type) name = decl.respond_to?(:name) ? decl.name : "" diff --git a/src/annotator-helpers/generic_analysis.rb b/src/annotator-helpers/generic_analysis.rb index a9eee4c54..5be002996 100644 --- a/src/annotator-helpers/generic_analysis.rb +++ b/src/annotator-helpers/generic_analysis.rb @@ -522,7 +522,7 @@ def substitute_type_params(signature, subst) sig { params(node: T.untyped).returns(NilClass) } def validate_stream_type!(node) T.bind(self, SemanticAnnotator) rescue nil - return unless node.type.is_a?(Type) && node.type.future? + return unless node.type&.future? if node.type.multiowned? error!(node, :RC_PROMISE_NEEDS_SHARED) end @@ -537,7 +537,7 @@ def validate_stream_type!(node) sig { params(node: T.untyped, final_type: T.untyped).returns(T.nilable(Type)) } def propagate_declared_type_to_value!(node, final_type) T.bind(self, SemanticAnnotator) rescue nil - return unless node.type.is_a?(Type) + return unless node.type # BgStreamBlock infers ~?T[]; declared ~T[INF] picks the runtime wrapper. if node.value.is_a?(AST::BgStreamBlock) && node.type.inf_stream? @@ -567,7 +567,7 @@ def propagate_declared_type_to_value!(node, final_type) sig { params(node: T.untyped, final_type: T.untyped).returns(T.nilable(Symbol)) } def propagate_collection_metadata!(node, final_type) T.bind(self, SemanticAnnotator) rescue nil - coll_src = if (decl_t = node.type).is_a?(Type) && decl_t.collection + coll_src = if (decl_t = node.type) && decl_t.collection decl_t elsif node.value.type_info&.collection node.value.type_info @@ -585,13 +585,13 @@ def propagate_collection_metadata!(node, final_type) end # Standalone @soa on fixed arrays (no collection): propagate soa flag directly. - if !coll_src && (decl_t = node.type).is_a?(Type) && decl_t.soa + if !coll_src && (decl_t = node.type) && decl_t.soa node.type_info.soa = true if node.type_info node.full_type&.soa = true end # Map-specific propagation: maps don't use :collection, so the above doesn't cover them. - if (decl_t = node.type).is_a?(Type) + if (decl_t = node.type) if decl_t.shard_count && !node.type_info&.shard_count node.type_info.shard_count = decl_t.shard_count if node.type_info node.full_type&.instance_variable_set(:@shard_count, decl_t.shard_count) diff --git a/src/annotator.rb b/src/annotator.rb index 349bb44c3..2a5d609a2 100644 --- a/src/annotator.rb +++ b/src/annotator.rb @@ -2600,7 +2600,7 @@ def visit_IntrinsicFunc(node, args) # CLEAR operations, so :stack is always safe here. sig { params(node: AST::VarDecl).void } def visit_VarDecl(node) - if node.value.is_a?(AST::ListLit) && node.type.is_a?(Type) && node.type.fixed? + if node.value.is_a?(AST::ListLit) && node.type&.fixed? node.value.storage = :stack end visit(node.value) @@ -2626,7 +2626,7 @@ def visit_VarDecl(node) def promote_pipe_to_observable_dest!(node) return unless node.respond_to?(:type) && node.type return unless node.value - target = node.type.is_a?(Type) ? node.type : Type.new(node.type) + target = node.type return unless target.future? && target.observable? pipe = node.value return unless pipe.is_a?(AST::BinaryOp) && pipe.op == :SMOOTH @@ -2639,7 +2639,7 @@ def promote_pipe_to_observable_dest!(node) # symbol entry (so WITH VIEW / NEXT / cleanup all see it). if pipe.full_type&.observable_terminal pipe_terminal = T.must(pipe.full_type).observable_terminal - target_t = node.type.is_a?(Type) ? node.type : Type.new(node.type) + target_t = node.type # The pipe is the authority on terminal kind: only the fold's # analyzer knows whether this is :sum / :count / :max / ... . # The LHS annotation (`~Int64@observable`) never carries one, so @@ -2699,7 +2699,7 @@ def finalize_decl_node!(node, mutable_flag) # check is correct because promote_pipe_to_observable_dest! sets # `observable_dest` only when the RHS is a SMOOTH-pipe over a # tense source; any other shape leaves it false. - if node.type.is_a?(Type) && node.type.future? && node.type.observable? + if node.type&.future? && node.type.observable? pipe = node.value ok = pipe.is_a?(AST::BinaryOp) && pipe.op == :SMOOTH && pipe.observable_dest unless ok @@ -2750,7 +2750,7 @@ def finalize_decl_node!(node, mutable_flag) # Empty collection literals annotated as Auto need a permissive # container type in scope so method dispatch works during the body walk; # the declaration annotation remains Auto for the later constraint pass. - if node.type.is_a?(Type) && node.type.auto? && + if node.type&.auto? && node.value.respond_to?(:type_object) && node.value.type_object && ( (node.value.is_a?(AST::ListLit) && node.value.items.empty? && @@ -2862,7 +2862,7 @@ def finalize_decl_node!(node, mutable_flag) sig { params(node: AST::BindExpr).returns(T.nilable(T::Hash[Symbol, T::Array[SymbolEntry]])) } def visit_BindExpr(node) # Same pre-set as visit_VarDecl: mark fixed-array list literals as :stack before visiting. - if node.value.is_a?(AST::ListLit) && node.type.is_a?(Type) && node.type.fixed? + if node.value.is_a?(AST::ListLit) && node.type&.fixed? node.value.storage = :stack end visit(node.value) diff --git a/src/ast/ast.rb b/src/ast/ast.rb index 425434dc6..740ce1fe5 100644 --- a/src/ast/ast.rb +++ b/src/ast/ast.rb @@ -686,6 +686,21 @@ def return_type=(val) VarDecl = Struct.new(:token, :name, :type, :value, :mutable) do include Locatable attr_accessor :mir_binding_entry # stamped by CleanupClassifier: per-node cleanup entry (avoids same-name collision) + + # Seam: a declaration's annotated/inferred type is always a Type + # (or nil when unannotated — the inference signal). Coerced at + # construction (positional Struct init) and post-parse assignment + # (auto-infer, propagation) so no reader needs an `is_a?(Type)` + # Symbol/Type discriminator. + def initialize(*) + super + t = self[:type] + self[:type] = Type.new(t) unless t.nil? || t.is_a?(Type) + end + + def type=(val) + self[:type] = val.nil? || val.is_a?(Type) ? val : Type.new(val) + end end Assignment = Struct.new(:token, :name, :value) do include Locatable @@ -707,6 +722,20 @@ def return_type=(val) attr_accessor :mir_binding_entry # stamped by CleanupClassifier: per-node cleanup entry (avoids same-name collision) attr_accessor :compound_op attr_accessor :auto_atomic_op + + # Seam: same contract as VarDecl#type — annotated/inferred type is + # always a Type (or nil when unannotated). Coerced at construction + # and post-parse assignment so no reader needs an `is_a?(Type)` + # Symbol/Type discriminator. + def initialize(*) + super + t = self[:type] + self[:type] = Type.new(t) unless t.nil? || t.is_a?(Type) + end + + def type=(val) + self[:type] = val.nil? || val.is_a?(Type) ? val : Type.new(val) + end end BinaryOp = Struct.new(:token, :left, :op, :right) do extend T::Sig From 584ace8afdad245490aa3eae322be6d579bd7e8a Mon Sep 17 00:00:00 2001 From: Brian Yahn Date: Sun, 17 May 2026 01:37:14 +0000 Subject: [PATCH 42/45] Introduce AST::Param struct; normalize FunctionDef/FunctionSignature params (slice 1) The loose `{ name:, type:, ... }` parameter hash that flowed through FunctionDef#params and FunctionSignature#params is replaced by a strongly-typed AST::Param struct (name, type, default, mutable, takes, comptime, name_token, required, sync, symbol). Param#type is ALWAYS a Type -- coerced in initialize and the type= writer; nil only when the param is unannotated/inferred (the inference signal). Coerced at both seams via Param.coerce (idempotent): - FunctionDef#initialize + #params= (positional Struct construction from parser/synthetic builders, and post-parse assignment) - FunctionSignature#initialize (every signature, incl. lambda / generic-substituted / synthetic) so every params array is Array, never a Hash. generic substitute_type_params rebuilds via p.dup + np.type= (Param has no #merge). FunctionSignature#params / build_lambda_signature / param-typed sigs retyped Hash -> AST::Param. propagate_caller_sync spec migrated off the old have_key(:sync) Hash contract. Struct#[] keeps legacy `p[:type]` reads working natively during the subsequent reader-migration slices; this slice changes no behavior. Gates: prspec 4773/0, transpile-tests 554/554 0 leaks, fuzz 141/141. Co-Authored-By: Claude Opus 4.7 --- spec/propagate_caller_sync_spec.rb | 6 ++-- src/annotator-helpers/function_analysis.rb | 8 ++--- src/annotator-helpers/function_signature.rb | 10 ++++-- src/annotator-helpers/generic_analysis.rb | 2 +- src/ast/ast.rb | 36 +++++++++++++++++++++ src/mir/escape_analysis.rb | 4 +-- 6 files changed, 53 insertions(+), 13 deletions(-) diff --git a/spec/propagate_caller_sync_spec.rb b/spec/propagate_caller_sync_spec.rb index 47743c780..9dd84f60f 100644 --- a/spec/propagate_caller_sync_spec.rb +++ b/spec/propagate_caller_sync_spec.rb @@ -185,9 +185,9 @@ def annotate(source) ast, annotator = annotate(src) sig = annotator.scope_stack.first.locals["bumpIt"].type expect(sig).to be_a(FunctionSignature) - # The field is present (key exists in the param hash). - expect(sig.params.first).to have_key(:sync) - expect(sig.params.first[:sync]).to be_nil + # The field is present on the Param struct (defaulting to nil). + expect(sig.params.first).to be_a(AST::Param) + expect(sig.params.first.sync).to be_nil end it "leaves :sync nil for params with no sync annotation" do diff --git a/src/annotator-helpers/function_analysis.rb b/src/annotator-helpers/function_analysis.rb index 57fbcb72f..9de776a18 100644 --- a/src/annotator-helpers/function_analysis.rb +++ b/src/annotator-helpers/function_analysis.rb @@ -64,7 +64,7 @@ def analyze_routine(node, body, declared_return, is_implicit) return_type end - sig { params(params: T::Array[T::Hash[Symbol, T.untyped]], return_type: Symbol).returns(FunctionSignature) } + sig { params(params: T::Array[AST::Param], return_type: Symbol).returns(FunctionSignature) } def build_lambda_signature(params, return_type) T.bind(self, SemanticAnnotator) rescue nil normalized_params = params.map do |param| @@ -528,7 +528,7 @@ def verify_function_signature!(node, signature) warn_multi_atomic_bare_value_call!(node, atomic_bare_value_args) end - sig { params(arg_node: T.untyped, expected_type_obj: Type, param: T::Hash[Symbol, T.untyped]).returns(T::Boolean) } + sig { params(arg_node: T.untyped, expected_type_obj: Type, param: AST::Param).returns(T::Boolean) } def atomic_cell_to_bare_value_param?(arg_node, expected_type_obj, param) T.bind(self, SemanticAnnotator) rescue nil return false unless arg_node.is_a?(AST::Identifier) @@ -543,7 +543,7 @@ def atomic_cell_to_bare_value_param?(arg_node, expected_type_obj, param) expected_type_obj.primitive? end - sig { params(arg_node: T.untyped, param: T::Hash[Symbol, T.untyped], signature: FunctionSignature).returns(T::Boolean) } + sig { params(arg_node: T.untyped, param: AST::Param, signature: FunctionSignature).returns(T::Boolean) } def atomic_cell_to_atomic_param?(arg_node, param, signature) T.bind(self, SemanticAnnotator) rescue nil return false unless arg_node.is_a?(AST::Identifier) @@ -587,7 +587,7 @@ def warn_multi_atomic_bare_value_call!(node, atomic_args) "will require an explicit @inconsistent call-site annotation.") end - sig { params(arg_node: T.untyped, param: T::Hash[Symbol, T.untyped], signature: FunctionSignature).returns(T.nilable(T::Boolean)) } + sig { params(arg_node: T.untyped, param: AST::Param, signature: FunctionSignature).returns(T.nilable(T::Boolean)) } def verify_param_lifetime!(arg_node, param, signature) T.bind(self, SemanticAnnotator) rescue nil return true if !arg_node.is_a?(AST::Identifier) diff --git a/src/annotator-helpers/function_signature.rb b/src/annotator-helpers/function_signature.rb index f01f71cf0..689424c01 100644 --- a/src/annotator-helpers/function_signature.rb +++ b/src/annotator-helpers/function_signature.rb @@ -13,9 +13,13 @@ class FunctionSignature extend T::Sig # Static signature fields (set at creation) - attr_reader :params, :visibility, :type_params, :reentrant + attr_reader :visibility, :type_params, :reentrant attr_accessor :return_lifetime, :return_strategy + # Always a list of AST::Param (coerced at the seam). No Hash. + sig { returns(T::Array[AST::Param]) } + attr_reader :params + # Seam: a function signature's return is ALWAYS a Type (Void for # "no value"). Coerced here so callers may pass nil/Symbol during # construction or late return-inference assignment without any @@ -90,13 +94,13 @@ def self.sync_from_function_def!(sig, fn) sig end - sig { params(params: T::Array[T::Hash[Symbol, T.untyped]], return_type: T.nilable(Type), return_lifetime: T.untyped, visibility: T.nilable(Symbol), type_params: T.nilable(T::Array[Symbol]), reentrant: T::Boolean, extern: T::Boolean, module_alias: T.nilable(String), extern_effects: T.nilable(T::Hash[Symbol, Symbol]), fn_type_params: T.nilable(T::Array[Symbol]), owner_type: T.nilable(String), owner_type_params: T.nilable(T::Array[T.untyped]), intrinsic: T::Boolean, zig_pattern: T.nilable(String)).void } + sig { params(params: T::Array[T.untyped], return_type: T.nilable(Type), return_lifetime: T.untyped, visibility: T.nilable(Symbol), type_params: T.nilable(T::Array[Symbol]), reentrant: T::Boolean, extern: T::Boolean, module_alias: T.nilable(String), extern_effects: T.nilable(T::Hash[Symbol, Symbol]), fn_type_params: T.nilable(T::Array[Symbol]), owner_type: T.nilable(String), owner_type_params: T.nilable(T::Array[T.untyped]), intrinsic: T::Boolean, zig_pattern: T.nilable(String)).void } def initialize(params:, return_type: nil, return_lifetime: nil, visibility: nil, type_params: nil, reentrant: false, extern: false, module_alias: nil, extern_effects: nil, fn_type_params: nil, owner_type: nil, owner_type_params: nil, intrinsic: false, zig_pattern: nil) - @params = params + @params = params.map { |p| AST::Param.coerce(p) } self.return_type = return_type @return_lifetime = return_lifetime @visibility = visibility diff --git a/src/annotator-helpers/generic_analysis.rb b/src/annotator-helpers/generic_analysis.rb index 5be002996..b6a37b5b3 100644 --- a/src/annotator-helpers/generic_analysis.rb +++ b/src/annotator-helpers/generic_analysis.rb @@ -507,7 +507,7 @@ def shared_call_capability_display(type) def substitute_type_params(signature, subst) T.bind(self, SemanticAnnotator) rescue nil FunctionSignature.new( - params: signature.params.map { |p| p.merge(type: apply_type_subst(p[:type], subst)) }, + params: signature.params.map { |p| p.dup.tap { |np| np.type = apply_type_subst(p.type, subst) } }, return_type: apply_type_subst(signature.return_type, subst), return_lifetime: signature.return_lifetime, visibility: signature.visibility diff --git a/src/ast/ast.rb b/src/ast/ast.rb index 740ce1fe5..1efec877d 100644 --- a/src/ast/ast.rb +++ b/src/ast/ast.rb @@ -11,6 +11,36 @@ module AST extend T::Sig + # A function/lambda/method parameter descriptor. Replaces the loose + # `{ name:, type:, ... }` Hash that flowed through FunctionDef#params + # and FunctionSignature#params. `type` is ALWAYS a Type (coerced; + # nil only when the param is unannotated/inferred — the inference + # signal). Strongly typed; no Hash-style access. + Param = Struct.new(:name, :type, :default, :mutable, :takes, + :comptime, :name_token, :required, :sync, :symbol, + keyword_init: true) do + extend T::Sig + + def initialize(**kw) + super + t = self[:type] + self[:type] = Type.new(t) unless t.nil? || t.is_a?(Type) + end + + sig { params(val: T.untyped).void } + def type=(val) + self[:type] = val.nil? || val.is_a?(Type) ? val : Type.new(val) + end + + # Idempotent normalizer used at the FunctionDef / FunctionSignature + # seams. Accepts a Param (passthrough) or a Hash (legacy producer). + sig { params(p: T.any(Param, T::Hash[Symbol, T.untyped])).returns(Param) } + def self.coerce(p) + return p if p.is_a?(Param) + new(**p.slice(*members)) + end + end + # Walk all statements in a body, recursing into control flow branches. # Yields each statement node. Handles IfStatement, MatchStatement, # WhileLoop, ForRange, ForEach, and generic nodes with .body. @@ -570,6 +600,7 @@ def initialize(*) super rt = self[:return_type] self[:return_type] = Type.new(rt) unless rt.nil? || rt.is_a?(Type) + self[:params] = (self[:params] || []).map { |p| Param.coerce(p) } end sig { params(val: T.untyped).void } @@ -577,6 +608,11 @@ def return_type=(val) self[:return_type] = val.nil? || val.is_a?(Type) ? val : Type.new(val) end + sig { params(val: T.nilable(T::Array[T.untyped])).void } + def params=(val) + self[:params] = (val || []).map { |p| Param.coerce(p) } + end + attr_accessor :type_params # Array of type param name strings, e.g. ["T", "K"], or nil # True when the user wrote RETURNS explicitly; fallible-signature checks # only enforce on user-authored return types. diff --git a/src/mir/escape_analysis.rb b/src/mir/escape_analysis.rb index f5628876c..16426da2a 100644 --- a/src/mir/escape_analysis.rb +++ b/src/mir/escape_analysis.rb @@ -800,13 +800,13 @@ def self.propagate_caller_sync!(fn_nodes) # True when the param's declared type carried explicit sync (so the # entry.sync currently reflects an annotation, not a propagated value). - sig { params(param: T::Hash[Symbol, T.untyped]).returns(T.nilable(T::Boolean)) } + sig { params(param: AST::Param).returns(T.nilable(T::Boolean)) } private_class_method def self.param_sync_was_declared?(param) t = param[:type] t.is_a?(Type) && t.any_sync? end - sig { params(fn_node: AST::FunctionDef, param: T::Hash[Symbol, T.untyped], sync: Symbol).returns(T::Boolean) } + sig { params(fn_node: AST::FunctionDef, param: AST::Param, sync: Symbol).returns(T::Boolean) } private_class_method def self.param_accepts_caller_sync?(fn_node, param, sync) t = param[:type] return true if t.is_a?(Type) && (t.shared? || t.any_sync?) From 087aa82620ebac7a81c75b93b43f81ac03e47d9d Mon Sep 17 00:00:00 2001 From: Brian Yahn Date: Sun, 17 May 2026 01:49:16 +0000 Subject: [PATCH 43/45] Migrate param readers to AST::Param accessors; collapse 18 :type guards (slice 2) LambdaLit gets the same params seam as FunctionDef (initialize + params= coerce to Array), so every node that feeds declare_and_verify_params/build_lambda_signature yields Params. declare_and_verify_params and the param-iterating call sites now use struct accessors (param.type/.name/.sync/.symbol/...) instead of Hash lookup. With Param#type guaranteed Type|nil, 18 defensive `p[:type].is_a?(Type)` discriminators across function_analysis, generic_analysis, annotator, type.rb (accepts_fn_type? / fn-type zig), control_flow, mir_lowering, promotion_plan collapse to .type / &. . union.rb synthesizes fn_params as AST::Param.new. The shared parser comma-seq block stays a Hash on purpose: it is reused by USE-capture parsing (captures are NOT params and have no :type/:storage Param member); params are coerced at the FunctionDef/LambdaLit seam, captures stay Hashes. annotator:265 (program_has_auto? duck-typed walk) and annotator:1606 (union variant payload, not a function param) intentionally left. Spec migration: the hand-rolled fn-sig mock in mir_lowering_spec that bypassed FunctionSignature now passes AST::Param (the tightened contract); FunctionSignature/make_fn-routed mocks are coerced and unchanged. is_a?(Type) in src/: 298 -> 282. Gates: prspec 4773/0, transpile-tests 554/554 0 leaks, fuzz 141/141. Co-Authored-By: Claude Opus 4.7 --- spec/mir_lowering_spec.rb | 2 +- src/annotator-helpers/function_analysis.rb | 52 +++++++++++----------- src/annotator-helpers/generic_analysis.rb | 4 +- src/annotator-helpers/union.rb | 2 +- src/annotator.rb | 6 +-- src/ast/ast.rb | 13 +++++- src/ast/parser.rb | 3 ++ src/ast/type.rb | 4 +- src/mir/control_flow.rb | 2 +- src/mir/mir_lowering.rb | 14 +++--- src/mir/promotion_plan.rb | 2 +- 11 files changed, 59 insertions(+), 45 deletions(-) diff --git a/spec/mir_lowering_spec.rb b/spec/mir_lowering_spec.rb index d0092f8b1..a909fd00a 100644 --- a/spec/mir_lowering_spec.rb +++ b/spec/mir_lowering_spec.rb @@ -1867,7 +1867,7 @@ def make_fn(name, params: [], return_type: :Void, body: [], visibility: nil, # Structs with no heap provenance live on the stack. Zig/LLVM SROAs them # into registers. Do NOT pass by *const T — that would prevent SROA. sig = Struct.new(:needs_rt, :can_fail, :params, :return_type) - .new(false, false, [{ name: "p", type: :Point, mutable: false, takes: false }], :Int64) + .new(false, false, [AST::Param.new(name: "p", type: :Point, mutable: false, takes: false)], :Int64) l = lowering( fn_sigs: { "sum3" => sig }, struct_schemas: { Point: { x: :Int64, y: :Int64 } } diff --git a/src/annotator-helpers/function_analysis.rb b/src/annotator-helpers/function_analysis.rb index 9de776a18..dff41b846 100644 --- a/src/annotator-helpers/function_analysis.rb +++ b/src/annotator-helpers/function_analysis.rb @@ -720,13 +720,13 @@ def declare_and_verify_params(node) T.bind(self, SemanticAnnotator) rescue nil node.params.each do |param| # Validate Defaults - if param[:default] - if param[:default].is_a?(AST::DefaultLit) + if param.default + if param.default.is_a?(AST::DefaultLit) # DEFAULT is only valid for struct-type params - param_type_sym = param[:type].is_a?(Symbol) ? param[:type] : param[:type].to_sym rescue nil + param_type_sym = param.type&.resolved schema = lookup_type_schema(param_type_sym) if param_type_sym unless schema.is_a?(Hash) && !schema[:kind] - error!(node, :DEFAULT_NEEDS_STRUCT_PARAM, type: param[:type]) + error!(node, :DEFAULT_NEEDS_STRUCT_PARAM, type: param.type) end # Validate all fields of the struct have defaults field_names = schema.keys.reject { |k| k.is_a?(Symbol) } @@ -734,16 +734,16 @@ def declare_and_verify_params(node) field_defaults = schema[:field_defaults] || {} missing = field_names.reject { |f| field_defaults.key?(f) } if missing.any? - error!(node, :DEFAULT_STRUCT_MISSING_DEFAULTS, name: param[:name], type: param[:type], missing: missing.join(', ')) + error!(node, :DEFAULT_STRUCT_MISSING_DEFAULTS, name: param.name, type: param.type, missing: missing.join(', ')) end end - param[:default].full_type = Type.new((param[:type].to_sym rescue param[:type])) + param.default.full_type = param.type else - visit(param[:default]) - def_type = param[:default].resolved_type - param_type = param[:type] + visit(param.default) + def_type = param.default.resolved_type + param_type = param.type unless is_safe_autocast?(def_type, param_type) - error!(node, :DEFAULT_VALUE_TYPE_MISMATCH, name: param[:name], expected: param_type, got: def_type) + error!(node, :DEFAULT_VALUE_TYPE_MISMATCH, name: param.name, expected: param_type, got: def_type) end end end @@ -751,12 +751,12 @@ def declare_and_verify_params(node) # Seed sync for cross-module helpers where caller-sync propagation # cannot see call sites. Visible callers still override this later. param_sync = nil - if param[:sync] - param_sync = param[:sync] - elsif param[:type].is_a?(Type) && param[:type].any_sync? - param_sync = param[:type].sync + if param.sync + param_sync = param.sync + elsif param.type&.any_sync? + param_sync = param.type.sync elsif node.respond_to?(:requires) && node.requires - families = node.requires[param[:name].to_s] + families = node.requires[param.name.to_s] if families # Polymorphic family seeds are only defaults; visible callers # override them during caller-sync propagation. @@ -779,27 +779,27 @@ def declare_and_verify_params(node) # the bare-cell form. param_layout = nil if param_sync == :atomic - param_t = param[:type].is_a?(Type) ? param[:type] : Type.new(param[:type]) + param_t = param.type param_layout = :indirect if param_t.respond_to?(:struct?) && param_t.struct? end current_scope.declare( - param[:name], nil, param[:type], param[:mutable], false, nil, :stack, + param.name, nil, param.type, param.mutable, false, nil, :stack, Set.new, [], sync: param_sync, layout: param_layout ) - # Stash the SymbolEntry on the param hash so downstream passes don't + # Stash the SymbolEntry on the Param so downstream passes don't # need to find an Identifier reference in the body. - param[:symbol] = current_scope.locals[param[:name]] - param[:symbol].is_param = true - param[:symbol].param_decl_token = param[:name_token] + param.symbol = current_scope.locals[param.name] + param.symbol.is_param = true + param.symbol.param_decl_token = param.name_token # Preserve REQUIRES disjunctions for call-site effect resolution. if node.respond_to?(:requires) && node.requires - fams = node.requires[param[:name].to_s] - param[:symbol].sync_families = fams if fams.is_a?(Set) && !fams.empty? + fams = node.requires[param.name.to_s] + param.symbol.sync_families = fams if fams.is_a?(Set) && !fams.empty? end # TAKES parameters own the data — force :affine so cleanup is emitted. - current_scope.locals[param[:name]].takes = true if param[:takes] - classify_ownership!(current_scope.locals[param[:name]]) - og_declare(param[:name], nil, param[:type]) + current_scope.locals[param.name].takes = true if param.takes + classify_ownership!(current_scope.locals[param.name]) + og_declare(param.name, nil, param.type) # Non-TAKES parameters are implicit borrows. Mark in OG so the # annotator prevents storing borrowed data into owned containers. unless param[:takes] diff --git a/src/annotator-helpers/generic_analysis.rb b/src/annotator-helpers/generic_analysis.rb index b6a37b5b3..954b7cfe8 100644 --- a/src/annotator-helpers/generic_analysis.rb +++ b/src/annotator-helpers/generic_analysis.rb @@ -268,7 +268,7 @@ def infer_generic_type_args!(node, signature, actual_args, type_params) signature.params.each_with_index do |param, i| arg = actual_args[i] next unless arg - param_type = param[:type].is_a?(Type) ? param[:type] : Type.new(param[:type] || :Any) + param_type = param.type || Type.new(:Any) actual_type = if arg.respond_to?(:type_info) && arg.type_info arg.type_info else @@ -292,7 +292,7 @@ def enforce_shared_family_call_sync!(node, signature, actual_args, type_params) signature.params.each_with_index do |param, i| arg = actual_args[i] next unless arg - param_type = param[:type].is_a?(Type) ? param[:type] : Type.new(param[:type] || :Any) + param_type = param.type || Type.new(:Any) next unless generic_shared_family_param?(param_type) && type_params.include?(param_type.resolved) actual_type = if arg.respond_to?(:type_info) && arg.type_info arg.type_info diff --git a/src/annotator-helpers/union.rb b/src/annotator-helpers/union.rb index 9d45a289b..987f4b023 100644 --- a/src/annotator-helpers/union.rb +++ b/src/annotator-helpers/union.rb @@ -36,7 +36,7 @@ def validate_union_methods!(node) if req[:body] # No concrete override — synthesize a top-level function from the default body. fn_params = req[:params].map { |rp| - { name: rp[:name], type: rp[:type], default: nil, mutable: false, takes: false } + AST::Param.new(name: rp[:name], type: rp[:type], default: nil, mutable: false, takes: false) } fn_node = AST::FunctionDef.new( req[:token], req[:name], fn_params, [], req[:return_type], diff --git a/src/annotator.rb b/src/annotator.rb index 2a5d609a2..7fad10849 100644 --- a/src/annotator.rb +++ b/src/annotator.rb @@ -604,7 +604,7 @@ def pre_register_function(node) default: p[:default], mutable: p[:mutable], takes: p[:takes] || false, - sync: (p[:type].is_a?(Type) && p[:type].any_sync?) ? p[:type].sync : nil + sync: p.type&.any_sync? ? p.type.sync : nil }}, return_type: node.return_type || Type.new(:Any), return_lifetime: get_lifetime_path(node), @@ -670,7 +670,7 @@ def visit_FunctionDef(node) validate_type_param_list!(node, node.type_params, "function") if fn_type_params.any? # Make type params visible during type annotation validation - node.params.each { |p| validate_type_annotation!(node, p[:type], is_param: true) if p[:type].is_a?(Type) } + node.params.each { |p| validate_type_annotation!(node, p.type, is_param: true) if p.type } validate_type_annotation!(node, node.return_type) if node.return_type # 3. Pre-declaration (so the function can be recursive) @@ -678,7 +678,7 @@ def visit_FunctionDef(node) params: node.params.map { |p| { name: p[:name], type: p[:type], required: p[:default].nil?, default: p[:default], mutable: p[:mutable], takes: p[:takes], - sync: (p[:type].is_a?(Type) && p[:type].any_sync?) ? p[:type].sync : nil + sync: p.type&.any_sync? ? p.type.sync : nil }}, return_type: node.return_type || Type.new(:Any), return_lifetime: lifetime_paths, visibility: node.visibility, diff --git a/src/ast/ast.rb b/src/ast/ast.rb index 1efec877d..1ac7fc3a9 100644 --- a/src/ast/ast.rb +++ b/src/ast/ast.rb @@ -838,7 +838,18 @@ def coerce!(declared_type) # misspelled field-name for a fixable edit span. attr_accessor :field_tokens end - LambdaLit = Struct.new(:token, :params, :captures, :body, :storage, :deferred_drops) { include Locatable } + LambdaLit = Struct.new(:token, :params, :captures, :body, :storage, :deferred_drops) do + include Locatable + # Same params seam as FunctionDef: always Array. + def initialize(*) + super + self[:params] = (self[:params] || []).map { |p| Param.coerce(p) } + end + + def params=(val) + self[:params] = (val || []).map { |p| Param.coerce(p) } + end + end IfStatement = Struct.new(:token, :condition, :then_branch, :else_branch, :then_drops, :else_drops) do extend T::Sig include Locatable diff --git a/src/ast/parser.rb b/src/ast/parser.rb index cf494eca5..065ebea87 100644 --- a/src/ast/parser.rb +++ b/src/ast/parser.rb @@ -872,6 +872,9 @@ def parse_argument_list() end end + # Plain Hash: this comma-seq block is shared by FN-param and + # USE-capture parsing. Params are coerced to AST::Param at the + # FunctionDef/LambdaLit seam; captures stay Hashes. { name: p_name, type: p_type, default: default_val, mutable: is_mutable, takes: takes, comptime: is_comptime, name_token: name_tok } end .last # always ignore the first token diff --git a/src/ast/type.rb b/src/ast/type.rb index 04a31d727..a09afa95e 100644 --- a/src/ast/type.rb +++ b/src/ast/type.rb @@ -1753,8 +1753,8 @@ def accepts_fn_type?(other_type) return false unless @raw.return_type.accepts?(other_raw.return_type) self_params.zip(other_params).each do |sp, op| - sp_t = sp[:type].is_a?(Type) ? sp[:type] : Type.new(sp[:type] || :Any) - op_t = op[:type].is_a?(Type) ? op[:type] : Type.new(op[:type] || :Any) + sp_t = sp.type || Type.new(:Any) + op_t = op.type || Type.new(:Any) return false unless sp_t.accepts?(op_t) end diff --git a/src/mir/control_flow.rb b/src/mir/control_flow.rb index 497ec0c6a..7890ed604 100644 --- a/src/mir/control_flow.rb +++ b/src/mir/control_flow.rb @@ -509,7 +509,7 @@ def init_entry_state (@fn_node.params || []).each do |p| next unless p[:takes] name = p[:name].to_s - ti = p[:type].is_a?(Type) ? p[:type] : (Type.new(p[:type] || :Any) rescue nil) + ti = p.type || Type.new(:Any) needs = ti ? ti.needs_explicit_cleanup?(:heap, @schema_lookup) : true state[name] = OwnerEntry.new(state: OWNED, allocator: :heap, needs_cleanup: needs) end diff --git a/src/mir/mir_lowering.rb b/src/mir/mir_lowering.rb index 6d3a71acf..dbad673cc 100644 --- a/src/mir/mir_lowering.rb +++ b/src/mir/mir_lowering.rb @@ -1167,7 +1167,7 @@ def lower_function_def(node) # INV-CROSS-FRAME-PARAM-ALLOC verifier in mir_checker.rb). mutable_scalar_params = (node.params || []).select { |p| next false unless p[:mutable] - p_type_obj = p[:type].is_a?(Type) ? p[:type] : (Type.new(p[:type] || :Any) rescue nil) + p_type_obj = p.type || Type.new(:Any) next false if p_type_obj && (p_type_obj.collection? || (p_type_obj.respond_to?(:needs_pointer_passing?) && p_type_obj.needs_pointer_passing?)) !transpile_type(p[:type], is_param: true).start_with?("[]", "*") @@ -1181,7 +1181,7 @@ def lower_function_def(node) # callee adds a second `&`, producing `**ArrayList` which Zig's # one-level method auto-deref can't unwrap. @current_fn_collection_params = (node.params || []).select { |p| - p_type_obj = p[:type].is_a?(Type) ? p[:type] : Type.new(p[:type] || :Any) + p_type_obj = p.type || Type.new(:Any) p_type_obj.needs_pointer_passing? || (p[:mutable] && p_type_obj.list_collection?) }.map { |p| p[:name] }.to_set @@ -1192,8 +1192,8 @@ def lower_function_def(node) # Build param list params_mir = (node.params || []).map { |param| p_name = mutable_scalar_params.include?(param[:name]) ? "_m_#{param[:name]}" : param[:name] - p_type_sym = param[:type].is_a?(Type) ? param[:type].resolved : param[:type] - p_type_obj = param[:type].is_a?(Type) ? param[:type] : Type.new(param[:type] || :Any) + p_type_sym = param.type&.resolved + p_type_obj = param.type || Type.new(:Any) is_user_struct = @struct_schemas&.key?(p_type_sym) # Atomic params need `anytype` so call sites pass the cell itself, # allowing WITH MATCH comptime probes to dispatch by actual family. @@ -1377,7 +1377,7 @@ def lower_function_def(node) (node.params || []).select { |p| p[:takes] }.each do |p| entry = @current_bindings[p[:name].to_s] next unless entry && entry[:needs_cleanup] - ti = p[:type].is_a?(Type) ? p[:type] : Type.new(p[:type] || :Any) + ti = p.type || Type.new(:Any) drop_entry = entry.dup build_drop_entry!(drop_entry, ti, nil) takes_mir << MIR::AllocMark.new(p[:name].to_s, entry[:alloc], ti) @@ -1737,7 +1737,7 @@ def lower_func_call(node) callee_param[:type].respond_to?(:list_collection?) && callee_param[:type].list_collection? callee_param_type = if callee_param - callee_param[:type].is_a?(Type) ? callee_param[:type] : (Type.new(callee_param[:type] || :Any) rescue nil) + callee_param.type || Type.new(:Any) end callee_wants_mutable_value = callee_param && callee_param[:mutable] && a.is_a?(AST::Identifier) && @@ -5920,7 +5920,7 @@ def tied_shared_family_return_param(node, mutable_scalar_params) return nil unless ret.is_a?(Type) && ret.polymorphic_shared? return nil unless ret.resolved.to_s.match?(/\A[A-Z]\z/) params = (node.params || []).select do |p| - pt = p[:type].is_a?(Type) ? p[:type] : Type.new(p[:type] || :Any) + pt = p.type || Type.new(:Any) pt.shared? && pt.resolved == ret.resolved end return nil unless params.size == 1 diff --git a/src/mir/promotion_plan.rb b/src/mir/promotion_plan.rb index 7e9ac01e3..05c9da222 100644 --- a/src/mir/promotion_plan.rb +++ b/src/mir/promotion_plan.rb @@ -473,7 +473,7 @@ def self.stamp_field_pre_cleanups!(body, bindings, schema_lookup: nil) sig { params(fn_node: AST::FunctionDef, schema_lookup: Proc, bindings: T::Hash[String, T::Hash[Symbol, T.untyped]]).returns(T.nilable(T::Array[T::Hash[Symbol, T.untyped]])) } private_class_method def self.walk_takes_params(fn_node, schema_lookup, bindings) (fn_node.params || []).select { |p| p[:takes] }.each do |p| - ti = p[:type].is_a?(Type) ? p[:type] : Type.new(p[:type] || :Any) + ti = p.type || Type.new(:Any) name = p[:name].to_s base = takes_param_base_entry(ti, schema_lookup) From d8324c2696b7c9d45a37bbecec07fb21ac2cebd1 Mon Sep 17 00:00:00 2001 From: Brian Yahn Date: Sun, 17 May 2026 02:00:17 +0000 Subject: [PATCH 44/45] Convert all remaining param Hash-lookups to AST::Param accessors (slice 3) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Completes the param migration: every `.params` iteration across the compiler now uses struct accessors (p.name/.type/.mutable/.takes/ .default/.sync/.comptime/.symbol/.name_token) instead of Hash subscript. ExternFnDecl gets the same params seam as FunctionDef / LambdaLit (the three — and only three — AST structs with a :params member), so extern signatures are AST::Param too. annotator.rb:265 (program_has_auto?) collapses now that node.params is always Param post-seam: p[:type].is_a?(Type) && .auto? -> p.type&.auto?. Left as Hashes on purpose (NOT AST::Param): - union.rb `rp` — union-requirement spec hashes, not function params - thunk_transform/emit.rb — MIR-level trampoline param hashes whose `type` is a Zig type STRING ("i64"), a different contract - fsm_transform locals — `{name:, zig_type:}`, not params is_a?(Type) in src/: 282 -> 281 (the residual generic-walk guard). Gates: prspec 4773/0, transpile-tests 554/554 0 leaks, fuzz 141/141. Co-Authored-By: Claude Opus 4.7 --- src/annotator-helpers/auto_inference.rb | 8 +-- src/annotator-helpers/capabilities.rb | 8 +-- src/annotator-helpers/effects.rb | 4 +- src/annotator-helpers/fixable_helpers.rb | 2 +- src/annotator-helpers/function_analysis.rb | 76 +++++++++++----------- src/annotator-helpers/generic_analysis.rb | 2 +- src/annotator-helpers/pipe_analysis.rb | 6 +- src/annotator-helpers/reentrance.rb | 6 +- src/annotator-helpers/with_match_check.rb | 6 +- src/annotator.rb | 38 +++++------ src/ast/ast.rb | 14 +++- src/ast/scope.rb | 8 +-- src/ast/type.rb | 2 +- src/backends/importer.rb | 2 +- src/mir/concurrency_checks.rb | 4 +- src/mir/control_flow.rb | 4 +- src/mir/escape_analysis.rb | 14 ++-- src/mir/mir_lowering.rb | 68 +++++++++---------- src/mir/promotion_plan.rb | 4 +- 19 files changed, 143 insertions(+), 133 deletions(-) diff --git a/src/annotator-helpers/auto_inference.rb b/src/annotator-helpers/auto_inference.rb index 79604030f..c231c6196 100644 --- a/src/annotator-helpers/auto_inference.rb +++ b/src/annotator-helpers/auto_inference.rb @@ -77,11 +77,11 @@ def collect!(program_node) def register_signature_slots @fn_nodes.each do |name, fn| (fn.params || []).each_with_index do |param, i| - next unless auto?(param[:type]) + next unless auto?(param.type) @slots[[:param, name, i]] = Slot.new( kind: :param, fn_name: name, index: i, decl_node: fn, sources: [], - auto_token: param[:type].auto_token, + auto_token: param.type.auto_token, ) end if auto?(fn.return_type) @@ -150,7 +150,7 @@ def record_call_site(call_node) callee = @fn_nodes[call_node.name] return unless callee (callee.params || []).each_with_index do |param, i| - next unless auto?(param[:type]) + next unless auto?(param.type) arg = call_node.args && call_node.args[i] next unless arg slot = @slots[[:param, callee.name, i]] @@ -714,7 +714,7 @@ def build_name_map(fn) map = {} (fn.params || []).each_with_index do |param, i| slot_id = [:param, fn.name, i] - map[param[:name]] = slot_id if @slots.key?(slot_id) + map[param.name] = slot_id if @slots.key?(slot_id) end walk_for_local_decls(fn.body) do |decl| slot_id = [:local, decl.object_id] diff --git a/src/annotator-helpers/capabilities.rb b/src/annotator-helpers/capabilities.rb index 7365019b4..b70325921 100644 --- a/src/annotator-helpers/capabilities.rb +++ b/src/annotator-helpers/capabilities.rb @@ -510,7 +510,7 @@ def visit_pre_clauses!(fn_node) end end - param_names = (fn_node.params || []).map { |p| p[:name].to_s } + param_names = (fn_node.params || []).map { |p| p.name.to_s } prev_ctx = @current_predicate_context begin pre_clauses.each do |entry| @@ -555,11 +555,11 @@ def visit_post_clauses!(fn_node) error!(fn_node, :DEBUG_POST_NOT_WITH_CATCH) end - param_names = (fn_node.params || []).map { |p| p[:name].to_s } + param_names = (fn_node.params || []).map { |p| p.name.to_s } rejected = (fn_node.params || []).filter_map do |p| - sym = current_scope.locals[p[:name].to_s] + sym = current_scope.locals[p.name.to_s] next unless sym && %i[locked write_locked versioned atomic].include?(sym.sync) - p[:name].to_s + p.name.to_s end.to_set rt = fn_node.return_type diff --git a/src/annotator-helpers/effects.rb b/src/annotator-helpers/effects.rb index 167e43dc7..2d6d24e03 100644 --- a/src/annotator-helpers/effects.rb +++ b/src/annotator-helpers/effects.rb @@ -363,8 +363,8 @@ def compute_needs_rt! ret_type = raw.is_a?(FunctionSignature) ? raw.return_type : nil heap_return = ret_type.is_a?(Type) && (ret_type.heap? || ret_type.dynamic?) has_takes_heap = fn_node.params&.any? { |p| - next unless p[:takes] - ti = Type.new(p[:type] || :Any) + next unless p.takes + ti = Type.new(p.type || :Any) ti.string? || ti.array? || ti.list_collection? || ti.map? } has_catch = fn_node.catch_clauses.is_a?(Array) && fn_node.catch_clauses.any? diff --git a/src/annotator-helpers/fixable_helpers.rb b/src/annotator-helpers/fixable_helpers.rb index 0c527d872..378cc5b56 100644 --- a/src/annotator-helpers/fixable_helpers.rb +++ b/src/annotator-helpers/fixable_helpers.rb @@ -1625,7 +1625,7 @@ def auto_slot_label(slot) case slot.kind when :param param = slot.decl_node.params[slot.index] - "parameter '#{param[:name]}' of `#{slot.fn_name}`" + "parameter '#{param.name}' of `#{slot.fn_name}`" when :return "return type of `#{slot.fn_name}`" when :local diff --git a/src/annotator-helpers/function_analysis.rb b/src/annotator-helpers/function_analysis.rb index dff41b846..e6d4fc457 100644 --- a/src/annotator-helpers/function_analysis.rb +++ b/src/annotator-helpers/function_analysis.rb @@ -69,12 +69,12 @@ def build_lambda_signature(params, return_type) T.bind(self, SemanticAnnotator) rescue nil normalized_params = params.map do |param| { - name: param[:name], - type: param[:type], - required: param[:default].nil?, - default: param[:default], - mutable: param[:mutable] || false, - takes: param[:takes] || false + name: param.name, + type: param.type, + required: param.default.nil?, + default: param.default, + mutable: param.mutable || false, + takes: param.takes || false } end @@ -135,7 +135,7 @@ def resolve_call(node, args) comptime_type_args = [] params = func_type.params || [] params.each_with_index do |p, i| - if p[:comptime] && args[i].is_a?(AST::Identifier) + if p.comptime && args[i].is_a?(AST::Identifier) comptime_type_args << args[i].name.to_sym args[i].full_type = :Type # Mark as type-value, not a variable end @@ -305,7 +305,7 @@ def normalize_intrinsic_signature(config) def verify_function_signature!(node, signature) T.bind(self, SemanticAnnotator) rescue nil params = signature.params - min_args = params.count { |param| param[:required] } + min_args = params.count { |param| param.required } max_args = params.size given_args = node.args.size @@ -319,11 +319,11 @@ def verify_function_signature!(node, signature) if given_args < max_args params[given_args...max_args].each do |param| - next if param[:required] - default = param[:default] + next if param.required + default = param.default injected = case default when AST::DefaultLit - type_name = param[:type].is_a?(Symbol) ? param[:type].to_s : param[:type].to_s + type_name = param.type.to_s AST::StructLit.new(default.token, type_name, {}, nil) else default.dup @@ -339,16 +339,16 @@ def verify_function_signature!(node, signature) node.args.each_with_index do |arg_node, i| param = params[i] - next if param[:comptime] # comptime type params are not type-checked + next if param.comptime # comptime type params are not type-checked verify_param_lifetime!(arg_node, param, signature) - if param[:mutable] + if param.mutable if !arg_node.is_a?(AST::Identifier) - error!(arg_node, :IMMUTABLE_ARG_PASSED_AS_EXPRESSION, index: i+1, param: param[:name]) + error!(arg_node, :IMMUTABLE_ARG_PASSED_AS_EXPRESSION, index: i+1, param: param.name) end if current_scope.is_immutable?(arg_node.name) - emit_immutable_arg_error!(arg_node, current_scope, i + 1, param[:name]) + emit_immutable_arg_error!(arg_node, current_scope, i + 1, param.name) end # Mark only the SymbolEntry as mutated-through-call. The callee receives a mutable reference @@ -368,7 +368,7 @@ def verify_function_signature!(node, signature) is_give = arg_node.is_a?(AST::MoveNode) inner_node = is_give ? arg_node.value : arg_node - if param[:takes] || is_give + if param.takes || is_give # Reject borrowed values passed to TAKES params. # Container index access (arr[i], map[key]) returns a borrow - # you cannot take ownership of data inside a container. @@ -386,7 +386,7 @@ def verify_function_signature!(node, signature) # Ensure @list args to TAKES params are heap-owned (implicit COPY). if inner_node.is_a?(AST::Identifier) - owned = ensure_owned_value!(inner_node, param[:type]) + owned = ensure_owned_value!(inner_node, param.type) node.args[i] = owned if owned end @@ -401,7 +401,7 @@ def verify_function_signature!(node, signature) move_if_not_copyable!( inner_node, action: is_give ? :give : :takes, - consumer_param_type: param[:type], + consumer_param_type: param.type, ) inner_node.was_moved = true arg_node.was_moved = true @@ -414,16 +414,16 @@ def verify_function_signature!(node, signature) # Weak refs must be RESOLVE'd before passing to concrete params. arg_ti = arg_node.respond_to?(:type_info) ? arg_node.type_info : nil - expected_raw = param[:type] + expected_raw = param.type if arg_ti&.link? && expected_raw != :Any param_type_obj = expected_raw.is_a?(Type) ? expected_raw : nil unless param_type_obj&.link? arg_name = arg_node.respond_to?(:name) ? arg_node.name : "Expression" - error!(arg_node, :LINK_NEEDS_RESOLVE_FOR_CALL, name: arg_name, param: param[:name]) + error!(arg_node, :LINK_NEEDS_RESOLVE_FOR_CALL, name: arg_name, param: param.name) end end - expected = param[:type] + expected = param.type actual = arg_node.resolved_type match = false @@ -439,7 +439,7 @@ def verify_function_signature!(node, signature) elsif actual_type_obj.is_a?(Type) && actual_type_obj.fn_type? && actual_type_obj.raw.reentrant && !expected_type_obj.raw.reentrant arg_name = arg_node.respond_to?(:name) ? arg_node.name : "Expression" - error!(arg_node, :REENTRANT_FN_TO_NON_REENTRANT_PARAM, name: arg_name, param: param[:name]) + error!(arg_node, :REENTRANT_FN_TO_NON_REENTRANT_PARAM, name: arg_name, param: param.name) end end @@ -506,7 +506,7 @@ def verify_function_signature!(node, signature) current_path = get_path_to_root(arg_node) next if current_path.nil? - is_mutable = param[:mutable] + is_mutable = param.mutable encountered_args.each_with_index do |prev, prev_index| # Mutable aliases conflict when their root paths overlap. @@ -535,8 +535,8 @@ def atomic_cell_to_bare_value_param?(arg_node, expected_type_obj, param) sym = arg_node.symbol return false unless sym&.sync == :atomic return false if sym.respond_to?(:layout) && sym.layout == :indirect - return false if param[:sync] == :atomic - return false if param[:symbol]&.respond_to?(:sync) && param[:symbol].sync == :atomic + return false if param.sync == :atomic + return false if param.symbol&.respond_to?(:sync) && param.symbol.sync == :atomic return false if expected_type_obj.any? || expected_type_obj.fn_type? return false if expected_type_obj.shared? || expected_type_obj.any_sync? @@ -549,13 +549,13 @@ def atomic_cell_to_atomic_param?(arg_node, param, signature) return false unless arg_node.is_a?(AST::Identifier) sym = arg_node.symbol return false unless sym&.sync == :atomic - ptype = param[:type] + ptype = param.type return true if ptype.is_a?(Type) && ptype.sync == :atomic - return true if param[:sync] == :atomic - return true if param[:symbol]&.respond_to?(:sync) && param[:symbol].sync == :atomic + return true if param.sync == :atomic + return true if param.symbol&.respond_to?(:sync) && param.symbol.sync == :atomic requires = signature.requires - families = requires && requires[param[:name].to_s] + families = requires && requires[param.name.to_s] families.respond_to?(:include?) && families.include?(:ATOMIC) end @@ -593,7 +593,7 @@ def verify_param_lifetime!(arg_node, param, signature) return true if !arg_node.is_a?(AST::Identifier) @og = T.let(@og, T.untyped) - if param[:mutable] && !@og.can_write?(arg_node.name) + if param.mutable && !@og.can_write?(arg_node.name) error!(arg_node, :MUTABLE_ARG_RESTRICTED, name: arg_node.name) end @@ -602,7 +602,7 @@ def verify_param_lifetime!(arg_node, param, signature) lifetime_paths = [lifetime_paths] unless lifetime_paths.is_a?(Array) return true if lifetime_paths.empty? - borrow_type = param[:mutable] ? :mutable : :immutable + borrow_type = param.mutable ? :mutable : :immutable return true if current_scope.is_immutable?(arg_node.name) || current_scope.is_restricted?(arg_node.name) # If `param` is named in the lifetime sources (any of the multi- @@ -613,9 +613,9 @@ def verify_param_lifetime!(arg_node, param, signature) next [:wildcard] if p == :wildcard [p.to_s.split(".").first] end - return true unless base_paths.include?(:wildcard) || base_paths.include?(param[:name]) + return true unless base_paths.include?(:wildcard) || base_paths.include?(param.name) - error!(arg_node, :MUTABLE_PARAM_NEEDS_RESTRICT, name: param[:name]) + error!(arg_node, :MUTABLE_PARAM_NEEDS_RESTRICT, name: param.name) end # `node.return_lifetime` shapes: @@ -682,14 +682,14 @@ def verify_lifetime_source!(node, source_node) T.bind(self, SemanticAnnotator) rescue nil path = get_path_to_root(source_node) root_param_name = T.must(path).first.to_s - param = node.params.find { |p| p[:name] == root_param_name } + param = node.params.find { |p| p.name == root_param_name } if param.nil? error!(node, :LIFETIME_ROOT_NOT_PARAM, name: root_param_name) end # Extract the resolved type name (Type objects from parse_type_annotation) - param_type = param[:type] + param_type = param.type current_type_name = param_type.is_a?(Type) ? param_type.resolved : param_type.to_sym T.must(path).drop(1).each do |field_sym| @@ -802,10 +802,10 @@ def declare_and_verify_params(node) og_declare(param.name, nil, param.type) # Non-TAKES parameters are implicit borrows. Mark in OG so the # annotator prevents storing borrowed data into owned containers. - unless param[:takes] - @og[param[:name]]&.kind = :borrowed + unless param.takes + @og[param.name]&.kind = :borrowed end - param[:type] + param.type end end diff --git a/src/annotator-helpers/generic_analysis.rb b/src/annotator-helpers/generic_analysis.rb index 954b7cfe8..aedc29d7a 100644 --- a/src/annotator-helpers/generic_analysis.rb +++ b/src/annotator-helpers/generic_analysis.rb @@ -301,7 +301,7 @@ def enforce_shared_family_call_sync!(node, signature, actual_args, type_params) end next unless actual_type.shared? shared_args << { - name: param[:name], + name: param.name, type: generic_shared_payload_binding(actual_type) } end diff --git a/src/annotator-helpers/pipe_analysis.rb b/src/annotator-helpers/pipe_analysis.rb index 3d4b5bf60..c3cd0eaa4 100644 --- a/src/annotator-helpers/pipe_analysis.rb +++ b/src/annotator-helpers/pipe_analysis.rb @@ -756,7 +756,7 @@ def analyze_pipe_to_named_function(node, sig, func_name) T.bind(self, SemanticAnnotator) rescue nil # 1. Validate Arity: Must accept exactly 1 argument (the pipe input) params = sig.params - min_args = params.count { |p| p[:required] } + min_args = params.count { |p| p.required } max_args = params.size if min_args < 1 || max_args > 1 @@ -770,12 +770,12 @@ def analyze_pipe_to_named_function(node, sig, func_name) # 2. Validate Type: The Input must match Parameter 1 if max_args >= 1 param = params[0] - expected = param[:type] + expected = param.type actual = node.left.resolved_type # Type.accepts? handles slice coercion (Number[3] -> Number[]) unless is_safe_autocast?(actual, expected) - error!(node.left, :ARGUMENT_TYPE_ERROR, fn: "Pipe Input '#{param[:name]}'", index: 1, expected: expected, got: actual) + error!(node.left, :ARGUMENT_TYPE_ERROR, fn: "Pipe Input '#{param.name}'", index: 1, expected: expected, got: actual) end end diff --git a/src/annotator-helpers/reentrance.rb b/src/annotator-helpers/reentrance.rb index a4266991f..60d30f30c 100644 --- a/src/annotator-helpers/reentrance.rb +++ b/src/annotator-helpers/reentrance.rb @@ -688,9 +688,9 @@ def offer_unconstrained_fn_param_fix!(fn_node) existing = fn_node.requires_clauses || {} candidates = (fn_node.params || []).filter_map do |p| - name = p[:name] + name = p.name next nil if existing.key?(name) - type = p[:type] + type = p.type next nil unless type.respond_to?(:fn_type?) && type.fn_type? raw = type.respond_to?(:raw) ? type.raw : nil next nil if raw.is_a?(FunctionSignature) && raw.reentrant == true @@ -733,7 +733,7 @@ def validate_requires_clauses!(fn_node) return if fn_node.requires_clauses.nil? || fn_node.requires_clauses.empty? # Params come from the parser as hashes ({ name:, type:, default:, ... }). # See pre_register_function in annotator.rb. - param_names = (fn_node.params || []).map { |p| p[:name] }.compact.to_set + param_names = (fn_node.params || []).map { |p| p.name }.compact.to_set fn_node.requires_clauses.each do |bound_name, _kind| next if param_names.include?(bound_name) error!(fn_node, :REQUIRES_NON_REENTRANT_NOT_PARAM, fn: fn_node.name, name: bound_name) diff --git a/src/annotator-helpers/with_match_check.rb b/src/annotator-helpers/with_match_check.rb index 78e744ab5..126676677 100644 --- a/src/annotator-helpers/with_match_check.rb +++ b/src/annotator-helpers/with_match_check.rb @@ -52,7 +52,7 @@ def self.poly_requires?(family_set) def self.check_function!(fn, error_handler, warn_handler: nil, policy_handlers: nil) return unless fn.respond_to?(:body) && fn.body requires_map = (fn.respond_to?(:requires) ? fn.requires : nil) || {} - param_names = fn.params.map { |p| p[:name].to_s }.to_set + param_names = fn.params.map { |p| p.name.to_s }.to_set AST.walk_body(fn.body) do |node| next unless node.is_a?(AST::WithBlock) @@ -231,7 +231,7 @@ def self.check_call_sites!(fn, sig_lookup, error_handler) sig = sig_lookup.call(call_node.name.to_s) next unless sig.is_a?(FunctionSignature) && sig.requires sig.params.each_with_index do |param, idx| - pname = (param[:name] || param["name"]).to_s + pname = param.name.to_s fams = sig.requires[pname] next unless fams && fams.empty? arg = (call_node.args || [])[idx] @@ -252,7 +252,7 @@ def self.check_call_sites!(fn, sig_lookup, error_handler) next unless sig.requires && !sig.requires.empty? sig.params.each_with_index do |param, idx| - param_name = param[:name].to_s + param_name = param.name.to_s disjunction = sig.requires[param_name] next unless disjunction && !disjunction.empty? diff --git a/src/annotator.rb b/src/annotator.rb index 7fad10849..b94be9b2f 100644 --- a/src/annotator.rb +++ b/src/annotator.rb @@ -262,7 +262,7 @@ def program_has_auto?(node) return true if node.respond_to?(:type) && node.type.is_a?(Type) && node.type.auto? return true if node.respond_to?(:return_type) && node.return_type.is_a?(Type) && node.return_type.auto? if node.respond_to?(:params) && node.params.is_a?(Array) - return true if node.params.any? { |p| p[:type].is_a?(Type) && p[:type].auto? } + return true if node.params.any? { |p| p.type&.auto? } end if node.respond_to?(:each_pair) return node.each_pair.any? { |_, v| program_has_auto?(v) } @@ -541,11 +541,11 @@ def visit_RequireNode(node) def visit_ExternFnDecl(node) signature = FunctionSignature.new( params: node.params.map { |p| { - name: p[:name], - type: p[:type], - required: p[:default].nil?, - mutable: p[:mutable] || false, - comptime: p[:comptime] || false + name: p.name, + type: p.type, + required: p.default.nil?, + mutable: p.mutable || false, + comptime: p.comptime || false }}, return_type: node.return_type || Type.new(:Any), visibility: :pub, @@ -598,12 +598,12 @@ def visit_ExternStructDecl(node) def pre_register_function(node) signature = FunctionSignature.new( params: node.params.map { |p| { - name: p[:name], - type: p[:type], - required: p[:default].nil?, - default: p[:default], - mutable: p[:mutable], - takes: p[:takes] || false, + name: p.name, + type: p.type, + required: p.default.nil?, + default: p.default, + mutable: p.mutable, + takes: p.takes || false, sync: p.type&.any_sync? ? p.type.sync : nil }}, return_type: node.return_type || Type.new(:Any), @@ -660,7 +660,7 @@ def visit_FunctionDef(node) )) # 2. Validation & Lifetime - has_mutable_param = node.params.any? { |p| p[:mutable] } + has_mutable_param = node.params.any? { |p| p.mutable } if has_mutable_param && !node.name.end_with?("!") emit_style_mutable_param_needs_bang!(node) end @@ -676,8 +676,8 @@ def visit_FunctionDef(node) # 3. Pre-declaration (so the function can be recursive) signature = FunctionSignature.new( params: node.params.map { |p| { - name: p[:name], type: p[:type], required: p[:default].nil?, - default: p[:default], mutable: p[:mutable], takes: p[:takes], + name: p.name, type: p.type, required: p.default.nil?, + default: p.default, mutable: p.mutable, takes: p.takes, sync: p.type&.any_sync? ? p.type.sync : nil }}, return_type: node.return_type || Type.new(:Any), return_lifetime: lifetime_paths, @@ -950,7 +950,7 @@ def collapse_errors_for_call(sig, args) require_relative 'annotator-helpers/with_match_check' unless defined?(WithMatchCheck) collapsed = Set.new sig.requires.each do |param_name, _families| - idx = sig.params.find_index { |p| p[:name].to_s == param_name } + idx = sig.params.find_index { |p| p.name.to_s == param_name } next unless idx arg = args[idx] next unless arg @@ -2294,7 +2294,7 @@ def infer_implicit_type_params(fn_node) explicit = (fn_node.type_params || []).map(&:to_s) return explicit unless explicit.empty? inferred = [] - ([fn_node.return_type] + (fn_node.params || []).map { |p| p[:type] }).each do |type| + ([fn_node.return_type] + (fn_node.params || []).map { |p| p.type }).each do |type| collect_implicit_type_params(type, inferred, explicit) end (explicit + inferred).uniq @@ -5604,7 +5604,7 @@ def resolve_borrow_source(call_node) return nil if primary == :wildcard primary_root = primary.to_s.split(".").first - param_index = func_type.params&.find_index { |p| p[:name] == primary_root } + param_index = func_type.params&.find_index { |p| p.name == primary_root } return nil unless param_index args = call_node.is_a?(AST::MethodCall) ? [call_node.object] + call_node.args : call_node.args @@ -5969,7 +5969,7 @@ def lookup_source_name(sym) @fn_nodes.each_value do |fn| next unless fn.respond_to?(:params) fn.params.each do |p| - return p[:name].to_s if p[:symbol].equal?(sym) + return p.name.to_s if p.symbol.equal?(sym) end end nil diff --git a/src/ast/ast.rb b/src/ast/ast.rb index 1ac7fc3a9..88690b08e 100644 --- a/src/ast/ast.rb +++ b/src/ast/ast.rb @@ -1141,12 +1141,22 @@ def name; target.respond_to?(:name) ? target.name : nil end # ExternFnDecl: EXTERN FN name(params) RETURNS type [EFFECTS :alloc] FROM "module" # Or method: EXTERN FN TypeName.method(params) RETURNS type FROM "module" # Declares a native Zig/C function importable via @import("module"). - ExternFnDecl = Struct.new(:token, :name, :params, :return_type, :from_module, :effects) { + ExternFnDecl = Struct.new(:token, :name, :params, :return_type, :from_module, :effects) do include Locatable attr_accessor :owner_type # "TypeName" for method declarations (nil for free functions) attr_accessor :owner_type_params # [:T, :U] for TypeName.method attr_accessor :fn_type_params # [:T] for fnName(...) - } + + # Same params seam as FunctionDef/LambdaLit: always Array. + def initialize(*) + super + self[:params] = (self[:params] || []).map { |p| Param.coerce(p) } + end + + def params=(val) + self[:params] = (val || []).map { |p| Param.coerce(p) } + end + end # ExternStructDecl: EXTERN STRUCT Name { fields } [CLOSE "method"] FROM "module" # Declares a native Zig/C struct type for CLEAR type-checking purposes. # CLOSE registers the type as a resource with auto-defer cleanup (RAII). diff --git a/src/ast/scope.rb b/src/ast/scope.rb index ccd6b9681..012e16483 100644 --- a/src/ast/scope.rb +++ b/src/ast/scope.rb @@ -54,7 +54,7 @@ def declare(name, reg, type, is_mutable = true, is_rebindable = false, size = ni # # The cost: storage / sync / type changes that happen AFTER the body # has been visited (notably `EscapeAnalysis.propagate_caller_sync!`, - # which mutates `param[:symbol]`) do NOT propagate to the deep-copied + # which mutates `param.symbol`) do NOT propagate to the deep-copied # entries inside nested scopes. A pass that reads `node.symbol.storage` # off an Identifier inside a nested scope sees the pre-propagation # value. @@ -62,7 +62,7 @@ def declare(name, reg, type, is_mutable = true, is_rebindable = false, size = ni # The rule for any post-annotation pass that needs a param's CURRENT # storage / sync: # - # * mutate `param[:symbol]` (the function-level entry) + # * mutate `param.symbol` (the function-level entry) # * read against `Scope.live_param_syms(fn)` to refresh stale # references # @@ -101,7 +101,7 @@ def initialize_copy(original) # Build a {param_name => live SymbolEntry} map from a FunctionDef. # - # The "live" entry is the one stored on `param[:symbol]` -- the entry + # The "live" entry is the one stored on `param.symbol` -- the entry # that lives at the function scope and that `propagate_caller_sync!` # mutates in place. Any pass that has a `capture_symbols` (or similar) # cache of SymbolEntry references collected during annotation should @@ -114,7 +114,7 @@ def initialize_copy(original) def self.live_param_syms(fn) return {} unless fn.respond_to?(:params) (fn.params || []).each_with_object({}) do |p, h| - h[p[:name].to_s] = p[:symbol] if p[:symbol] + h[p.name.to_s] = p.symbol if p.symbol end end diff --git a/src/ast/type.rb b/src/ast/type.rb index a09afa95e..72081f55f 100644 --- a/src/ast/type.rb +++ b/src/ast/type.rb @@ -2094,7 +2094,7 @@ def compute_zig_type(is_param: false, is_field: false) # 2c. Function type: FN(T, ...) -> R => *const fn(*Runtime, T, ...) anyerror!R if fn_type? param_types_zig = @raw.params.map do |p| - t = p[:type] + t = p.type t.is_a?(Type) ? t.zig_type(is_param: true) : Type.new(t).zig_type(is_param: true) end ret_zig = @raw.return_type.zig_type diff --git a/src/backends/importer.rb b/src/backends/importer.rb index ddfb1bb27..bc8d3f960 100644 --- a/src/backends/importer.rb +++ b/src/backends/importer.rb @@ -140,7 +140,7 @@ def reject_auto_in_public_signatures!(ast, abs_path) offending = [] stmt.params.each do |p| - offending << "param '#{p[:name]}'" if auto_type?(p[:type]) + offending << "param '#{p.name}'" if auto_type?(p.type) end offending << "return type" if auto_type?(stmt.return_type) next if offending.empty? diff --git a/src/mir/concurrency_checks.rb b/src/mir/concurrency_checks.rb index 6457ded04..9c707b84a 100644 --- a/src/mir/concurrency_checks.rb +++ b/src/mir/concurrency_checks.rb @@ -157,7 +157,7 @@ def check_reentrant!(fn, sig_lookup, error_handler) next unless sig.is_a?(FunctionSignature) && sig.requires && !sig.requires.empty? sig.params.each_with_index do |param, idx| - pname = param[:name].to_s + pname = param.name.to_s next unless sig.requires.key?(pname) arg = node.args[idx] next unless arg @@ -229,7 +229,7 @@ def walk_scope_for_nested_with(stmts, &blk) sig { params(with_block: T.untyped, fn: T.untyped).returns(T::Set[T.untyped]) } def collect_held_params(with_block, fn) return Set.new unless fn.respond_to?(:params) - param_names = fn.params.map { |p| p[:name].to_s }.to_set + param_names = fn.params.map { |p| p.name.to_s }.to_set out = Set.new (with_block.capabilities || []).each do |cap| n = cap_var_name(cap[:var_node]) diff --git a/src/mir/control_flow.rb b/src/mir/control_flow.rb index 7890ed604..a802a666a 100644 --- a/src/mir/control_flow.rb +++ b/src/mir/control_flow.rb @@ -507,8 +507,8 @@ def self.analyze(fn_node, can_fail_fns: nil, schema_lookup: nil) def init_entry_state state = {} (@fn_node.params || []).each do |p| - next unless p[:takes] - name = p[:name].to_s + next unless p.takes + name = p.name.to_s ti = p.type || Type.new(:Any) needs = ti ? ti.needs_explicit_cleanup?(:heap, @schema_lookup) : true state[name] = OwnerEntry.new(state: OWNED, allocator: :heap, needs_cleanup: needs) diff --git a/src/mir/escape_analysis.rb b/src/mir/escape_analysis.rb index 16426da2a..18608c09f 100644 --- a/src/mir/escape_analysis.rb +++ b/src/mir/escape_analysis.rb @@ -313,7 +313,7 @@ def self.analyze!(fn_nodes, heap_fns:, promotion_plans: {}) args = call.args || [] callee_fn.params.each_with_index do |param, idx| - next unless param[:takes] + next unless param.takes arg = args[idx] next unless arg @@ -360,8 +360,8 @@ def self.analyze!(fn_nodes, heap_fns:, promotion_plans: {}) args = call.args || [] callee_fn.params.each_with_index do |param, idx| - next unless param[:mutable] - param_t = param[:type] + next unless param.mutable + param_t = param.type param_t = param_t.is_a?(Type) ? param_t : (Type.new(param_t) rescue nil) next unless param_t && param_t.list_collection? @@ -723,7 +723,7 @@ def self.propagate_caller_sync!(fn_nodes) next if sites.empty? callee_fn.params.each_with_index do |param, idx| - entry = param[:symbol] + entry = param.symbol next unless entry # ── sync axis ──────────────────────────────────────────────── @@ -802,18 +802,18 @@ def self.propagate_caller_sync!(fn_nodes) # entry.sync currently reflects an annotation, not a propagated value). sig { params(param: AST::Param).returns(T.nilable(T::Boolean)) } private_class_method def self.param_sync_was_declared?(param) - t = param[:type] + t = param.type t.is_a?(Type) && t.any_sync? end sig { params(fn_node: AST::FunctionDef, param: AST::Param, sync: Symbol).returns(T::Boolean) } private_class_method def self.param_accepts_caller_sync?(fn_node, param, sync) - t = param[:type] + t = param.type return true if t.is_a?(Type) && (t.shared? || t.any_sync?) return true unless sync == :atomic requires = fn_node.respond_to?(:requires) ? fn_node.requires : nil - families = requires && requires[param[:name].to_s] + families = requires && requires[param.name.to_s] return false unless families.respond_to?(:include?) case sync diff --git a/src/mir/mir_lowering.rb b/src/mir/mir_lowering.rb index dbad673cc..984754dd7 100644 --- a/src/mir/mir_lowering.rb +++ b/src/mir/mir_lowering.rb @@ -1166,12 +1166,12 @@ def lower_function_def(node) # original name from MIR-level checks (notably the new # INV-CROSS-FRAME-PARAM-ALLOC verifier in mir_checker.rb). mutable_scalar_params = (node.params || []).select { |p| - next false unless p[:mutable] + next false unless p.mutable p_type_obj = p.type || Type.new(:Any) next false if p_type_obj && (p_type_obj.collection? || (p_type_obj.respond_to?(:needs_pointer_passing?) && p_type_obj.needs_pointer_passing?)) - !transpile_type(p[:type], is_param: true).start_with?("[]", "*") - }.map { |p| p[:name] }.to_set + !transpile_type(p.type, is_param: true).start_with?("[]", "*") + }.map { |p| p.name }.to_set @current_fn_mutable_scalar_params = T.let(mutable_scalar_params, T.nilable(T::Set[T.untyped])) # Collection params: already passed by pointer, skip & at recursive @@ -1183,21 +1183,21 @@ def lower_function_def(node) @current_fn_collection_params = (node.params || []).select { |p| p_type_obj = p.type || Type.new(:Any) p_type_obj.needs_pointer_passing? || - (p[:mutable] && p_type_obj.list_collection?) - }.map { |p| p[:name] }.to_set + (p.mutable && p_type_obj.list_collection?) + }.map { |p| p.name }.to_set # All param names: used to distinguish params (slices) from locals (ArrayLists) - @current_fn_param_names = (node.params || []).map { |p| p[:name] }.to_set + @current_fn_param_names = (node.params || []).map { |p| p.name }.to_set # Build param list params_mir = (node.params || []).map { |param| - p_name = mutable_scalar_params.include?(param[:name]) ? "_m_#{param[:name]}" : param[:name] + p_name = mutable_scalar_params.include?(param.name) ? "_m_#{param.name}" : param.name p_type_sym = param.type&.resolved p_type_obj = param.type || Type.new(:Any) is_user_struct = @struct_schemas&.key?(p_type_sym) # Atomic params need `anytype` so call sites pass the cell itself, # allowing WITH MATCH comptime probes to dispatch by actual family. - sym = param[:symbol] + sym = param.symbol atomic_sync = sym && (sym.sync == :atomic || (sym.sync_families && sym.sync_families.include?(:ATOMIC))) zig_t = if p_type_obj.shared? && p_type_obj.resolved.to_s.match?(/\A[A-Z]\z/) @@ -1209,17 +1209,17 @@ def lower_function_def(node) elsif atomic_sync "anytype" else - transpile_type(param[:type], is_param: true) + transpile_type(param.type, is_param: true) end - zig_t = "*#{zig_t}" if mutable_scalar_params.include?(param[:name]) && zig_t != "anytype" + zig_t = "*#{zig_t}" if mutable_scalar_params.include?(param.name) && zig_t != "anytype" # `pointer_passed`: this param's receiver is a pointer-to-T at the # Zig level, so allocations made inside this function on its behalf # outlive the function. Mirrors `@current_fn_collection_params`'s # criteria so the MIR checker can independently verify the # allocator-routing decision (see INV-CROSS-FRAME-PARAM-ALLOC). pointer_passed = p_type_obj.needs_pointer_passing? || - (param[:mutable] && p_type_obj.list_collection?) || - mutable_scalar_params.include?(param[:name]) + (param.mutable && p_type_obj.list_collection?) || + mutable_scalar_params.include?(param.name) MIR::Param.new(p_name, zig_t, pointer_passed) } @@ -1356,8 +1356,8 @@ def lower_function_def(node) # Param suppressions for unused params (node.params || []).each do |p| - next if used_names.include?(p[:name]) - suppress_name = mutable_scalar_params.include?(p[:name]) ? "_m_#{p[:name]}" : p[:name] + next if used_names.include?(p.name) + suppress_name = mutable_scalar_params.include?(p.name) ? "_m_#{p.name}" : p.name prologue << MIR::Suppress.new(suppress_name) end @@ -1374,14 +1374,14 @@ def lower_function_def(node) # Emit AllocMark + Cleanup for TAKES parameters (replaces insert_takes_drops! from MIRPass). # TAKES params own their value from function entry; cleanup is always defer (Cleanup, not ErrCleanup). takes_mir = [] - (node.params || []).select { |p| p[:takes] }.each do |p| - entry = @current_bindings[p[:name].to_s] + (node.params || []).select { |p| p.takes }.each do |p| + entry = @current_bindings[p.name.to_s] next unless entry && entry[:needs_cleanup] ti = p.type || Type.new(:Any) drop_entry = entry.dup build_drop_entry!(drop_entry, ti, nil) - takes_mir << MIR::AllocMark.new(p[:name].to_s, entry[:alloc], ti) - takes_mir << MIR::Cleanup.new(zig_safe_name(p[:name].to_s), drop_entry) + takes_mir << MIR::AllocMark.new(p.name.to_s, entry[:alloc], ti) + takes_mir << MIR::Cleanup.new(zig_safe_name(p.name.to_s), drop_entry) end # Lower body (track snapshot types for catch blocks) @@ -1429,7 +1429,7 @@ def lower_function_def(node) prologue + body_mir, :private, false, comptime_params) # Outer function: calls inner, catches errors - call_args = fn_needs_rt ? ["rt"] + (node.params || []).map { |p| p[:name] } : (node.params || []).map { |p| p[:name] } + call_args = fn_needs_rt ? ["rt"] + (node.params || []).map { |p| p.name } : (node.params || []).map { |p| p.name } inner_call = "#{inner_name}(#{call_args.join(', ')})" catch_zig, catch_clause_bodies = build_catch_clauses(node, fn_can_fail) @@ -1481,9 +1481,9 @@ def build_post_outer_fn(node, params_mir, return_type_str, fn_needs_rt, vis, com # names verbatim. Forwarding the user-level name would produce # "use of undeclared identifier" at the wrapper's call site. mutable_scalar = (node.params || []).select { |p| - p[:mutable] && !transpile_type(p[:type], is_param: true).start_with?("[]", "*") - }.map { |p| p[:name] }.to_set - forward_name = ->(p) { mutable_scalar.include?(p[:name]) ? "_m_#{p[:name]}" : p[:name] } + p.mutable && !transpile_type(p.type, is_param: true).start_with?("[]", "*") + }.map { |p| p.name }.to_set + forward_name = ->(p) { mutable_scalar.include?(p.name) ? "_m_#{p.name}" : p.name } arg_idents = (node.params || []).map { |p| MIR::Ident.new(forward_name.call(p)) } arg_idents = [MIR::Ident.new("rt")] + arg_idents if fn_needs_rt @@ -1709,7 +1709,7 @@ def lower_func_call(node) callee_sig = @fn_sigs&.dig(node.name) || @fn_sigs&.dig(node.name.to_s) args_mir = node.args.each_with_index.map { |a, idx| # The annotator stamps was_moved when the callee takes ownership of this - # arg on success (param[:takes] || GIVE). That is the SINGLE source of + # arg on success (param.takes || GIVE). That is the SINGLE source of # truth for "callee takes" — the lowering must not re-derive it from # CopyNode/MoveNode wrappers (a COPY into a borrow param is NOT a take). takes = a.was_moved @@ -1733,14 +1733,14 @@ def lower_func_call(node) callee_param = params_list[idx] end callee_wants_mutable_list = - callee_param && callee_param[:mutable] && - callee_param[:type].respond_to?(:list_collection?) && - callee_param[:type].list_collection? + callee_param && callee_param.mutable && + callee_param.type.respond_to?(:list_collection?) && + callee_param.type.list_collection? callee_param_type = if callee_param callee_param.type || Type.new(:Any) end callee_wants_mutable_value = - callee_param && callee_param[:mutable] && a.is_a?(AST::Identifier) && + callee_param && callee_param.mutable && a.is_a?(AST::Identifier) && !callee_wants_mutable_list && !(callee_param_type&.respond_to?(:needs_pointer_passing?) && callee_param_type.needs_pointer_passing?) @@ -2115,13 +2115,13 @@ def build_extern_trampoline_call(node) # Skip extern/module functions: their CLEAR types (e.g. String -> []const u8) may differ # from the actual Zig/C types (e.g. [*:0]const u8), breaking implicit coercions. sig = @fn_sigs&.dig(node.name) || @fn_sigs&.dig(node.name.to_sym) || @fn_sigs&.dig(node.name.to_s) - sig_params = (sig&.params || sig&.dig(:params) || []).reject { |p| p[:comptime] } + sig_params = (sig&.params || sig&.dig(:params) || []).reject { |p| p.comptime } arg_field_types = if sig&.module_alias nil else types = sig_params.each_with_index.map do |p, i| next nil unless i < runtime_ast_args.length - pt = p[:type] + pt = p.type pt.is_a?(Type) ? pt.zig_type(is_param: true) : (Type::ZIG_TYPE_MAP[pt] || nil) end types.empty? || types.all?(&:nil?) ? nil : types @@ -2258,12 +2258,12 @@ def lower_lambda(node) params_list = sig.params || [] params_mir = [MIR::Param.new("_rt", "*Runtime", false)] + params_list.map { |p| - p_type = p[:type] + p_type = p.type type_str = p_type.is_a?(Type) ? p_type.zig_type(is_param: true) : transpile_type(p_type || :Any, is_param: true) pt_obj = p_type.is_a?(Type) ? p_type : (Type.new(p_type) rescue nil) pp = !!(pt_obj && (pt_obj.respond_to?(:needs_pointer_passing?) && pt_obj.needs_pointer_passing? || - (p[:mutable] && pt_obj.respond_to?(:list_collection?) && pt_obj.list_collection?))) - MIR::Param.new(p[:name], type_str, pp) + (p.mutable && pt_obj.respond_to?(:list_collection?) && pt_obj.list_collection?))) + MIR::Param.new(p.name, type_str, pp) } ret_zig = sig.return_type.zig_type @@ -2276,7 +2276,7 @@ def lower_lambda(node) # Build body: suppressions + return expr body_mir = [] body_mir << MIR::Suppress.new("_rt") - params_list.each { |p| body_mir << MIR::Suppress.new(p[:name]) } + params_list.each { |p| body_mir << MIR::Suppress.new(p.name) } body_mir << MIR::ReturnStmt.new(lower(node.body)) fn_def = MIR::FnDef.new(fn_name, params_mir, ret_str, body_mir, nil, false, nil) @@ -7285,7 +7285,7 @@ def universal_poly_arg_needs_addr?(arg_node, callee_sig, idx) return false unless callee_sig.requires param = callee_sig.params[idx] return false unless param - pname = (param[:name] || param["name"]).to_s + pname = param.name.to_s fams = callee_sig.requires[pname] # Universal poly: REQUIRES key present AND the family-set is empty. return false unless fams && fams.empty? diff --git a/src/mir/promotion_plan.rb b/src/mir/promotion_plan.rb index 05c9da222..89cdae672 100644 --- a/src/mir/promotion_plan.rb +++ b/src/mir/promotion_plan.rb @@ -472,9 +472,9 @@ def self.stamp_field_pre_cleanups!(body, bindings, schema_lookup: nil) # *T; cleanup must NOT re-apply &. sig { params(fn_node: AST::FunctionDef, schema_lookup: Proc, bindings: T::Hash[String, T::Hash[Symbol, T.untyped]]).returns(T.nilable(T::Array[T::Hash[Symbol, T.untyped]])) } private_class_method def self.walk_takes_params(fn_node, schema_lookup, bindings) - (fn_node.params || []).select { |p| p[:takes] }.each do |p| + (fn_node.params || []).select { |p| p.takes }.each do |p| ti = p.type || Type.new(:Any) - name = p[:name].to_s + name = p.name.to_s base = takes_param_base_entry(ti, schema_lookup) next unless base From 87bb944dd7a2ee2539ca6dc6b3c121755d5f45dd Mon Sep 17 00:00:00 2001 From: Brian Yahn Date: Sun, 17 May 2026 02:19:51 +0000 Subject: [PATCH 45/45] Collapse #51/#53 dead Symbol-vs-Type guards (stale "blocked" classification) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Both were classified "blocked — heterogeneous Symbol|Type" BEFORE the upstream sources were tightened. They are single-producer and their producers now yield Type|nil (never Symbol): #51 b[:unwrapped_type]: sole producer is `ti.wrapped_type`, whose accessor contract was tightened to Type|nil in #54. The lone reader's `unwrapped.is_a?(Type) ? .resolved : unwrapped` collapses to `unwrapped&.resolved` (nil-safe; the old else branch only ever yielded nil since Symbol was impossible). #53 cap[:resolved_type]: sole producer is `cap[:resolved_type] = var_node.full_type`, and full_type is T.nilable(Type) via the #46 seam. mir_lowering's dead `is_a?(Type) ? it : Type.new(it)` collapses to `cap[:resolved_type] || Type.new(:Any)`. The capabilities.rb `|| old_scope.resolve_type || :Any` sites are genuine nil-fallback-to-another-source and are correctly left as-is. is_a?(Type) in src/: 281 -> 279. Gates: prspec 4773/0, transpile-tests 554/554 0 leaks, fuzz 141/141. Co-Authored-By: Claude Opus 4.7 --- src/annotator.rb | 4 +++- src/mir/mir_lowering.rb | 4 +++- 2 files changed, 6 insertions(+), 2 deletions(-) diff --git a/src/annotator.rb b/src/annotator.rb index b94be9b2f..57ad4f239 100644 --- a/src/annotator.rb +++ b/src/annotator.rb @@ -1322,7 +1322,9 @@ def visit_IfBind(node) # Declare each binding in the then-scope with the unwrapped type. node.bindings.each do |b| unwrapped = b[:unwrapped_type] - sym = unwrapped.is_a?(Type) ? unwrapped.resolved : unwrapped + # Sole producer sets this from ti.wrapped_type (Type|nil; + # never a Symbol) — see the binding-annotation loop above. + sym = unwrapped&.resolved current_scope.declare(b[:name], nil, unwrapped, false, false, nil, :stack) entry = current_scope.locals[b[:name]] b[:symbol] = entry diff --git a/src/mir/mir_lowering.rb b/src/mir/mir_lowering.rb index 984754dd7..2e6e1425a 100644 --- a/src/mir/mir_lowering.rb +++ b/src/mir/mir_lowering.rb @@ -3365,7 +3365,9 @@ def emit_snapshot_mutable_call(node, with_label) # *Arc, and Arc by value (the BG-capture # case). Mirrors the read-mode SNAPSHOT path. source_unwrap = with_match_unwrap_value(T.must(source_zig)) - st = cap[:resolved_type].is_a?(Type) ? cap[:resolved_type] : Type.new(cap[:resolved_type]) + # cap[:resolved_type] sole producer is var_node.full_type + # (Type|nil via the full_type seam; never a Symbol). + st = cap[:resolved_type] || Type.new(:Any) bare_t_zig = st.bare_data_type.zig_type # AtomicPtr commits surface AtomicConflict; Versioned commits surface # MvccConflict.