Skip to content

perf(hydro_lang/sim): create dedicated fold hook#2852

Draft
Benjscho wants to merge 5 commits intohydro-project:mainfrom
Benjscho:push-utytrmknnpnn
Draft

perf(hydro_lang/sim): create dedicated fold hook#2852
Benjscho wants to merge 5 commits intohydro-project:mainfrom
Benjscho:push-utytrmknnpnn

Conversation

@Benjscho
Copy link
Copy Markdown
Contributor

@Benjscho Benjscho commented May 6, 2026

Addresses #2656

Dedicated fold hooks for the simulator

Replaces the previous ObserveNonDet-before-folds approach with fold hooks that model non-deterministic input ordering and batching for folds in the simulator.

What changed

New TopLevelFoldHook — For top-level (cross-tick) folds, a dedicated hook selects a non-empty subset of buffered inputs and permutes them before releasing to the fold. Unselected elements remain buffered for future ticks, modeling delayed/lossy delivery. This replaces the old pattern of observing non-determinism on the input stream separately.

In-tick fold permutation via StreamOrderHook — For in-tick folds on NoOrder inputs, an inline shuffle hook permutes elements before the fold to catch commutativity bugs. In future we can add a feature to allow sim callers to skip permutations when commutativity proofs are provided.

PassthroughSingletonHook — When a fold output is already controlled by a TopLevelFoldHook, the downstream singleton batch skips the normal SingletonHook (which would add redundant decisions) and uses a passthrough that always releases the latest value.

@Benjscho Benjscho requested a review from shadaj May 6, 2026 22:54
Benjscho and others added 4 commits May 8, 2026 10:40
Previously, folds on unordered streams relied on an ObserveNonDet IR node
(inserted by assume_ordering) to handle ordering non-determinism in simulation.
This was architecturally wrong: the ObserveNonDet's TopLevelStreamOrderHook
released elements one-at-a-time with ordering decisions, which is not the right
abstraction for fold inputs. This change gives folds their own simulation hook
that directly models subset selection and optional permutation.

IR changes:
- Add `commutativity_proven: bool` field to HydroNode::Fold and HydroNode::FoldKeyed.
  Set to true when the input stream is NoOrder (type system guarantees the user
  proved commutativity). When true, the simulator skips permutation exploration.
- Remove the assume_ordering call from Stream::fold() and KeyedStream::fold().
  Only assume_retries is kept (for idempotence on AtLeastOnce streams).
- Add `emit_fold_hook` method to DfirBuilder trait (default no-op). The fold
  compilation calls this before emitting the scan operator.

Simulation runtime:
- Add TopLevelFoldHook<T> which implements SimHook. On each decision:
  - Selects a non-empty subset of buffered inputs (models delayed/lossy delivery)
  - Permutes the selected subset via Fisher-Yates if commutativity is not proven
  - Releases the subset to the fold's scan operator
  - Keeps unselected elements in the buffer for future releases
- SimBuilder::emit_fold_hook registers this hook for top-level NoOrder fold inputs.
  In-tick folds don't need a hook (type system enforces commutativity for NoOrder).

Tests:
- sim_fold_commutative_explores_all_subset_sums: asserts every possible subset
  sum (0..=7 for inputs [1,2,4]) is observed across exhaustive executions
- sim_fold_total_order_no_permutation: asserts only valid prefix-concatenations
  are observed (no permutation for TotalOrder folds)
- sim_fold_commutative_skips_permutations: asserts exact state count (432)
- sim_fold_keyed_no_order: verifies keyed fold correctness with flat subset hook

Note: state count is 432 (vs 270 with old ObserveNonDet) because subset selection
explores more combinations while the downstream SingletonHook still redundantly
selects among scan intermediates. A planned follow-up optimization will replace
scan with fold for batch-only consumers, eliminating this redundancy.

Co-authored-by: Infinity 🤖 <infinity@hydro.run>
Three key changes to the simulator's fold handling:

1. PassthroughSingletonHook: When a fold's output is consumed by a
   Batch (snapshot), use a trivial hook that always releases the latest
   value instead of the full SingletonHook with redundant index selection.
   This eliminates equivalent states reachable through different paths.

2. Always permute in TopLevelFoldHook: Remove the commutativity_proven
   skip — the simulator is conservative and always explores permutations
   to catch false commutativity claims via manual_proof!.

3. Tests for commutativity detection:
   - sim_fold_catches_false_commutativity: proves the simulator catches
     non-commutative top-level folds even with manual_proof!
   - sim_fold_in_tick_does_not_yet_catch_false_commutativity: documents
     that in-tick folds on NoOrder streams are not yet permuted (follow-up)

State count for 3-element commutative fold + snapshot: 432 → 108 (75% reduction).

Co-authored-by: Infinity 🤖 <infinity@hydro.run>
When an in-tick fold has `commutativity_proven` (from `manual_proof!`) and its
input is `NoOrder`, the simulator now emits an inline `StreamOrderHook` that
permutes the batch elements before they reach the fold. This catches bugs where
a fold claims commutativity but actually depends on input order.

Changes:
- hydro_lang/src/sim/builder.rs: Modified `emit_fold_hook` to handle in-tick
  folds (non-top-level) with `commutativity_proven` on `NoOrder` streams by
  emitting a `StreamOrderHook` inline shuffle between the batch and the fold.
- hydro_lang/src/compile/ir/mod.rs: Added a new branch in the fold compilation
  logic for in-tick folds with `commutativity_proven`, calling `emit_fold_hook`
  and using the hooked input ident when one is returned.
- hydro_lang/src/sim/tests/mod.rs: Updated the test to assert both "ab" and
  "ba" are observed, confirming the simulator now catches false commutativity.

Co-authored-by: Infinity 🤖 <infinity@hydro.run>
@Benjscho Benjscho force-pushed the push-utytrmknnpnn branch 2 times, most recently from 25cc66b to 84fcd44 Compare May 8, 2026 18:09
- ir/mod.rs: Extract the FoldKeyed acc closure (identical in hooked/non-hooked branches)
  to a single definition, with only the DFIR pipeline differing between branches.
- sim/builder.rs: Deduplicate Singleton hook setup (PassthroughSingletonHook vs
  SingletonHook) — compute the hook expression first, then share all channel/buffer/DFIR
  boilerplate. Also deduplicate the top-level emit_fold_hook Stream vs KeyedStream
  branches by extracting the debug type first.
- sim/tests/mod.rs: Remove sim_fold_commutative_state_count which was an exact duplicate
  of sim_fold_sample_eager_state_count (same inputs, same assertion, same count=108).
- hydro_lang/src/compile/ir/mod.rs: Collapse the top-level Fold branch's
  hooked vs non-hooked code paths into a single `let (effective_input, acc) = ...`
  followed by one shared `add_dfir` call. Eliminates duplicated pipeline template.
- hydro_lang/src/sim/tests/mod.rs: Rename `sim_fold_in_tick_does_not_yet_catch_false_commutativity`
  to `sim_fold_in_tick_catches_false_commutativity` — the test now asserts the
  simulator *does* catch it (since commit ttqkpunk added the StreamOrderHook).
- hydro_lang/src/sim/builder.rs: Add comment clarifying that `commutativity_proven`
  is intentionally unused in the top-level emit_fold_hook path (always permutes to
  catch false manual_proof! claims).
- Removed the `commutativity_proven: bool` field from `HydroNode::Fold` and
  `HydroNode::FoldKeyed`, along with all related plumbing.
  The shuffle hook now always fires for NoOrder inputs regardless of whether
  commutativity was proven, since we're still permuting within a batch anyway.
- Removed `register_fold_hooked_output` and `is_fold_hooked_output` from the
  `DfirBuilder` trait entirely. These were bookkeeping/query methods that don't
  belong on a code-gen trait. The fold-hooked ident tracking is now a
  `HashSet<String>` threaded through `emit`/`emit_core` as a parameter.
- Removed the default impl from `emit_fold_hook` and added an explicit no-op
  implementation in `SecondaryMap<LocationKey, FlatGraphBuilder>`. This matches
  the existing convention where every trait method is explicitly handled by both
  implementors.
- Added `fold_hooked_idents: &HashSet<String>` parameter to `DfirBuilder::batch`
  so the SimBuilder can query it without storing it as internal state. Removed
  the `fold_hooked_idents` field from `SimBuilder`.

Co-authored-by: Infinity 🤖 <infinity@hydro.run>
@Benjscho Benjscho force-pushed the push-utytrmknnpnn branch from 84fcd44 to 892b349 Compare May 8, 2026 19:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant