WIP: refactor: allow #var references to any operator, not just stateful ones #2843
Draft
MingweiSamuel wants to merge 3 commits into mingwei/dfir-push-ops from
Deploying hydro with
| Latest commit: | 55de3bf |
| Status: | ✅ Deploy successful! |
| Preview URL: | https://04aa5df8.hydroflow.pages.dev |
| Branch Preview URL: | https://mingwei-dfir-mapped-singleto.hydroflow.pages.dev |
MingweiSamuel added a commit that referenced this pull request May 5, 2026

Step 1 of singleton-by-reference generalization: remove the validation check that restricted `#var` references to only operators with `has_singleton_output`. Any named variable can now be referenced via `#var` syntax.

Step 2 (subgraph boundary enforcement) requires no code changes: the existing `singleton_barrier_crossers` mechanism in `flat_to_partitioned.rs` already forces the referenced node and consumer into different subgraphs with correct topological ordering.

Changes:
- Removed the `has_singleton_output` validation check in `flat_graph_builder.rs`
- Removed the `surface_singleton_nostate` compile-fail tests (both nightly and stable variants) since referencing non-stateful operators is now intentionally allowed

Note: Step 3 (codegen for non-stateful referenced nodes) is not yet implemented. Referencing a non-stateful operator will currently produce incorrect codegen (the `singleton_output_ident` variable won't exist). This will be addressed in a follow-up change.

Co-authored-by: Infinity 🤖 <infinity@hydro.run>
PR: #2843
7202fd3 to
a032182
Compare
MingweiSamuel added a commit that referenced this pull request May 5, 2026

Adds infrastructure to track which nodes produce exactly one item (are singletons), propagating this property through operators that preserve singleton cardinality.

Changes:
- Added `preserves_singleton: bool` field to `OperatorConstraints`. Set to `true` for: map, filter_map, inspect, scan, scan_async_blocking, identity, enumerate. All other operators default to `false`.
- Added `node_is_singleton` field to `DfirGraph` (SparseSecondaryMap).
- Added `compute_node_singletons()` method that propagates singleton status: a node is singleton if it has `has_singleton_output`, OR if all its predecessors are singletons and it has `preserves_singleton`.
- Added `node_is_singleton()` accessor method.
- Called `compute_node_singletons()` in `process_operator_errors` after operator instances are created.
- Updated the validation check: `#var` references now validate against `node_is_singleton()` instead of `has_singleton_output`. This allows referencing derived singletons like `fold -> map`.
- Updated compile-fail test error message to reflect new wording.

This means `fold -> map(|x| x * 2)` can now be referenced via `#var`, while `source_iter -> flat_map(...)` still correctly errors.

Co-authored-by: Infinity 🤖 <infinity@hydro.run>
PR: #2843
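The propagation rule in the commit message above can be sketched roughly as follows. This is a simplified illustration, not the code in this PR: `OpInfo` is a hypothetical stand-in for `OperatorConstraints`, nodes are plain indices assumed to be listed in topological order, and the free function only borrows the name `compute_node_singletons` from the commit message. Requiring at least one predecessor for a derived singleton is also an assumption of this sketch.

```rust
/// Hypothetical, simplified stand-in for the two relevant operator flags.
#[derive(Clone, Copy)]
struct OpInfo {
    /// The operator itself produces exactly one item (e.g. `fold`).
    has_singleton_output: bool,
    /// The operator keeps singleton cardinality if its input has it (e.g. `map`).
    preserves_singleton: bool,
}

/// Computes singleton status for every node.
///
/// Assumption of this sketch: `nodes[i]` holds the operator flags and the
/// indices of its predecessors, and every predecessor index is smaller
/// than `i` (topological order).
fn compute_node_singletons(nodes: &[(OpInfo, Vec<usize>)]) -> Vec<bool> {
    let mut is_singleton: Vec<bool> = Vec::with_capacity(nodes.len());
    for (info, preds) in nodes {
        // Derived singleton: the operator preserves cardinality and all
        // of its (at least one) predecessors are singletons.
        let derived = info.preserves_singleton
            && !preds.is_empty()
            && preds.iter().all(|&p| is_singleton[p]);
        is_singleton.push(info.has_singleton_output || derived);
    }
    is_singleton
}
```

Under these assumptions, a `fold -> map` chain is marked singleton at both nodes, while a `flat_map` or a `map` fed directly by `source_iter` is not, which matches the behaviour described in the commit message.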
e1d1285 to
9f45192
Compare
9f45192 to
f78a920
Compare
a032182 to
4868eb3
Compare
4868eb3 to
902c7e7
Compare
f78a920 to
a30a1e9
Compare
902c7e7 to
a688378
Compare
a30a1e9 to
7bffc3b
Compare
7bffc3b to
33f54a9
Compare
a688378 to
227b5ba
Compare
33f54a9 to
cec1743
Compare
227b5ba to
5b45963
Compare
cec1743 to
2e165ed
Compare
88f4415 to
9180e6f
Compare
3a27310 to
d0625d4
Compare
9180e6f to
20442db
Compare
d0625d4 to
c53fbe9
Compare
20442db to
31885e7
Compare
c53fbe9 to
ef7ca08
Compare
31885e7 to
a247155
Compare
- dfir_pipes/src/push/fold_keyed.rs: Add `const` to `new()`, wrap `HashMap::iter()` in `#[expect(clippy::disallowed_methods)]` block with reason.
- dfir_pipes/src/push/reduce_keyed.rs: Same fixes as fold_keyed.
- dfir_lang/src/graph/graph_write.rs: Add `MonotoneAccum` variant to both match arms (arrow_head gets `">"` like None, link style gets `"#60"` green color).
- dfir_lang/src/graph/ops/persist_mut.rs: Remove unused `DelayType` import.
- dfir_lang/src/graph/ops/persist_mut_keyed.rs: Remove unused `DelayType` import.
- dfir_lang/src/graph/meta_graph.rs: Prefix unused `pivot_fn_ident` with underscore.
- dfir_rs/tests/surface_persist_mut_push.rs: Remove unused import and fix unused mutable variable warnings.
- dfir_rs/tests/surface_persist_mut_push_expanded.rs: Replace `#![allow(unused)]` with `#![expect(unused, reason = ...)]`, fix unnecessary qualifications.

Co-authored-by: Infinity 🤖 <infinity@hydro.run>
PR: #2843

Resolve conflict: restore pivot wrapper function

The `pivot_fn_ident` wrapper function provides type checking at the pull-to-push boundary. It was accidentally removed during borrow conflict debugging but is not related to the borrow issue (which was caused by the `is_first_run_this_tick` if/else, not the pivot wrapper). Resolved the rebase conflict by keeping the wrapper with the updated `.data()` API.

Co-authored-by: Infinity 🤖 <infinity@hydro.run>
PR: #2843
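Several of the lint fixes listed above replace `allow` with `expect` plus a `reason`. As a minimal illustration of that attribute (the function `sum_evens` and the variable `scratch` are made up for this sketch and are not from the PR; it assumes a toolchain where `#[expect]` is stable, Rust 1.81 or newer):

```rust
/// Sums the even numbers in `values`.
///
/// Unlike `#[allow]`, `#[expect]` also warns if the named lint stops
/// firing, so stale suppressions are noticed.
#[expect(
    unused_variables,
    reason = "`scratch` exists only to demonstrate a fulfilled expectation"
)]
fn sum_evens(values: &[u32]) -> u32 {
    // Intentionally unused: this triggers `unused_variables`, which the
    // attribute above expects and therefore silences.
    let scratch = values.len();
    values.iter().copied().filter(|v| v % 2 == 0).sum()
}
```

If `scratch` were later removed, the compiler would report the expectation as unfulfilled, which is the main practical difference from `#[allow]`.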
ef7ca08 to
55de3bf
Compare
a247155 to
cf244f4
Compare
Step 1 of singleton-by-reference generalization: remove the validation check that restricted `#var` references to only operators with `has_singleton_output`. Any named variable can now be referenced via `#var` syntax.

Step 2 (subgraph boundary enforcement) requires no code changes: the existing `singleton_barrier_crossers` mechanism in `flat_to_partitioned.rs` already forces the referenced node and consumer into different subgraphs with correct topological ordering.

Changes:
- Removed the `has_singleton_output` validation check in `flat_graph_builder.rs`
- Removed the `surface_singleton_nostate` compile-fail tests (both nightly and stable variants) since referencing non-stateful operators is now intentionally allowed

Note: Step 3 (codegen for non-stateful referenced nodes) is not yet implemented. Referencing a non-stateful operator will currently produce incorrect codegen (the `singleton_output_ident` variable won't exist). This will be addressed in a follow-up change.

Co-authored-by: Infinity 🤖 <infinity@hydro.run>