Document the RWM pipeline passes #92
Open. leonardoalt wants to merge 10 commits into main from update-docs-rwm-pipeline
Commits:
8e9bc4e Document the RWM pipeline passes in detail (leonardoalt)
b682b79 Update src/loader/rwm/FLATTENING.md (leonardoalt)
12d0a03 Update src/loader/rwm/FLATTENING.md (leonardoalt)
de684ac Update src/loader/rwm/FLATTENING.md (leonardoalt)
d632ccd Update src/loader/rwm/JUMP_REMOVAL.md (leonardoalt)
106a3b9 Update src/loader/rwm/JUMP_REMOVAL.md (leonardoalt)
f725ddd Update src/loader/rwm/REGISTER_ALLOCATION.md (leonardoalt)
b7317bf Update src/loader/rwm/REGISTER_ALLOCATION.md (leonardoalt)
167c838 Update src/loader/rwm/REGISTER_ALLOCATION.md (leonardoalt)
49e621a Document the common pipeline passes in detail (leonardoalt)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,120 @@ | ||
| # Blockless DAG Pass | ||
|
|
||
| **Source:** `blockless_dag.rs` | ||
|
|
||
| **Input:** `DanglingOptDag` (optimized DAG with nested blocks and loops) | ||
| **Output:** `BlocklessDag` (flat DAG with labels; only loops retain sub-DAGs) | ||
|
|
||
| ## Purpose | ||
|
|
||
| This is the last common pipeline pass before the backend-specific stages. It | ||
| flattens the nested block structure into a linear sequence of nodes with labels | ||
| marking jump targets. After this pass, the only nesting that remains is for | ||
| loops — each loop still has its own sub-DAG, because loops represent a separate | ||
| "frame" with its own address space in the final output. | ||
|
|
||
| Non-loop blocks are fully inlined into their parent DAG, with their outputs | ||
| becoming labels that breaks can jump to. This makes the representation much | ||
| closer to assembly: a flat sequence of operations with forward-only jumps to | ||
| labels. | ||
|
|
||
| ## Key Transformation | ||
|
|
||
| ### Blocks Become Labels | ||
|
|
||
| A non-loop block in the input DAG: | ||
| ``` | ||
| Block { | ||
| kind: Block, | ||
| sub_dag: [Inputs, ..., Br(0, outputs)] | ||
| } | ||
| ``` | ||
|
|
||
| is inlined into the parent. The block's input node is suppressed (its outputs | ||
| are remapped to the corresponding inputs in the parent scope), and a `Label` | ||
| node is inserted where the block's outputs would be consumed. Break instructions | ||
| targeting the block become jumps to this label. | ||
|
|
||
| ### Loops Remain Nested | ||
|
|
||
| Loop blocks keep their sub-DAG structure: | ||
| ``` | ||
| Loop { | ||
| sub_dag: BlocklessDag { nodes: [...] }, | ||
| break_targets: [(depth, [target_types])] | ||
| } | ||
| ``` | ||
|
|
||
| The `break_targets` field records all the break targets that the loop body | ||
| uses, relative to the parent frame. This lets the backend know which external | ||
| labels/frames the loop may jump to. | ||
|
|
||
| ## Break Target Resolution | ||
|
|
||
| In the input DAG, break targets are relative depths into the block stack. In the | ||
| blockless DAG, targets are resolved into `BreakTarget { depth, kind }`: | ||
|
|
||
| - **`depth`**: The number of frame levels between the break and the target. | ||
| Depth 0 means the current frame (the function body or the enclosing loop). | ||
| Inside a loop, depth 1 means the parent frame, depth 2 the grandparent, etc. | ||
|
|
||
| - **`kind`**: Either `FunctionOrLoop` (targeting the function return or a loop's | ||
| next iteration) or `Label(id)` (targeting a specific label created from an | ||
| inlined block). | ||
|
|
||
| The key property: **jumps to labels are always forward** (labels appear after | ||
| the jumps that target them), while **jumps to loops go backward** (to the loop | ||
| header at the start of the loop's sub-DAG). | ||
|
|
||
| ## Example | ||
|
|
||
| Input DAG (with nested block): | ||
| ``` | ||
| Node 0: Inputs → [x] | ||
| Node 1: Block { | ||
| kind: Block, | ||
| sub_dag: [ | ||
| Node 0: Inputs → [x] | ||
| Node 1: i32.const 10 | ||
| Node 2: i32.gt_s ← [(0,0), (1,0)] | ||
| Node 3: br_if 0 ← [(0,0), (2,0)] ;; exit block if x > 10 | ||
| Node 4: i32.const 0 | ||
| Node 5: br 1 ← [(4,0)] ;; return 0 | ||
| ] | ||
| } → [result] | ||
| Node 2: br 0 ← [(1,0)] ;; return result | ||
| ``` | ||
|
|
||
| Output blockless DAG (flattened): | ||
| ``` | ||
| Node 0: Inputs → [x] | ||
| Node 1: i32.const 10 | ||
| Node 2: i32.gt_s ← [(0,0), (1,0)] | ||
| Node 3: BrIf(Label(42)) ← [(0,0), (2,0)] ;; jump to label if x > 10 | ||
| Node 4: i32.const 0 | ||
| Node 5: Br(Function) ← [(4,0)] ;; return 0 | ||
| Node 6: Label { id: 42 } → [result] ;; target for the br_if | ||
| Node 7: Br(Function) ← [(6,0)] ;; return result | ||
| ``` | ||
|
|
||
| The block's internal input node (its node 0) was suppressed and its references | ||
| were remapped to the parent's node 0. The block itself became a label node. | ||
|
|
||
| ## Node Remapping | ||
|
|
||
| When blocks are inlined, node indices change. The pass maintains an | ||
| `outputs_map: HashMap<ValueOrigin, ValueOrigin>` that translates old | ||
| `(node, output)` pairs to new ones. For inlined block inputs, the map redirects | ||
| through the `input_mapping` to the actual source nodes in the parent. | ||
|
|
||
| ## Design Notes | ||
|
|
||
| - Labels use unique IDs generated by a shared `AtomicU32` counter (the | ||
| `LabelGenerator`), ensuring uniqueness across all functions and all frames. | ||
|
|
||
| - The pass preserves the `NodeInput::Constant` variant, passing inline | ||
| constants through unchanged. | ||
|
|
||
| - Break targets are resolved relative to frame boundaries, not block nesting. | ||
| This is important because the backends allocate registers per-frame (per | ||
| function or per loop body), not per-block. | ||
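
The break target resolution described in the Blockless DAG document above can be illustrated with a small sketch. It is not taken from `blockless_dag.rs`: the `Scope` and `TargetKind` types and the `resolve` function are hypothetical names, and the sketch assumes that only loops introduce a new frame, so `depth` is the number of loops crossed on the way out to the target.

```rust
/// Hypothetical scope kinds on the block stack (illustrative names only).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Scope {
    Function,
    Loop,
    Block { label: u32 },
}

/// Mirrors the two target kinds described in the document.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum TargetKind {
    FunctionOrLoop,
    Label(u32),
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct BreakTarget {
    pub depth: u32,
    pub kind: TargetKind,
}

/// Resolves a WASM-style relative depth into a frame-relative target.
/// `stack` lists the enclosing scopes, outermost first; relative depth 0
/// is the innermost scope. Returns `None` if the depth is out of range.
pub fn resolve(stack: &[Scope], relative_depth: usize) -> Option<BreakTarget> {
    let target_idx = stack.len().checked_sub(relative_depth + 1)?;
    // Each loop strictly inside the target scope is one frame boundary
    // that the jump has to cross.
    let depth = stack[target_idx + 1..]
        .iter()
        .filter(|s| matches!(s, Scope::Loop))
        .count() as u32;
    let kind = match stack[target_idx] {
        Scope::Block { label } => TargetKind::Label(label),
        Scope::Function | Scope::Loop => TargetKind::FunctionOrLoop,
    };
    Some(BreakTarget { depth, kind })
}
```

With the stack `[Function, Block { label: 7 }, Loop, Block { label: 9 }]`, a break with relative depth 2 resolves to `Label(7)` at frame depth 1, because the label lives in the frame that is the parent of the loop.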
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,131 @@ | ||
| # Block Tree Pass | ||
|
|
||
| **Source:** `block_tree.rs` | ||
|
|
||
| **Input:** Raw WASM function bytecode (`Unparsed`) | ||
| **Output:** `BlockTree` (tree of `Block` and `Instruction` elements) | ||
|
|
||
| ## Purpose | ||
|
|
||
| This is the first pass in the compilation pipeline. It takes the raw stream of | ||
| WASM operators and parses them into a tree structure where control flow is | ||
| represented by nested blocks and loops, and instructions within each block form | ||
| a linear sequence. | ||
|
|
||
| The pass also normalizes several WASM patterns into simpler, more uniform | ||
| representations that are easier for subsequent passes to handle. | ||
|
|
||
| ## Normalizations | ||
|
|
||
| ### If-Else to Block + BrIf | ||
|
|
||
| WASM's `if-else-end` construct is desugared into blocks with conditional | ||
| breaks. This reduces the number of control flow constructs that later passes | ||
| need to handle. | ||
|
|
||
| **If without else:** | ||
| ``` | ||
| ;; Original WASM          ;; Normalized BlockTree | ||
| if                        block (params..., i32) -> (results...) | ||
|   <if_body>                 br_if_zero 0  ;; skip if_body when false | ||
| end                         <if_body> | ||
|                           end | ||
| ``` | ||
|
|
||
| **If with else:** | ||
| ``` | ||
| ;; Original WASM          ;; Normalized BlockTree | ||
| if                        block (params..., i32) -> (results...) | ||
|   <if_body>                 block (params..., i32) -> (params...) | ||
| else                          br_if 0  ;; skip else_body when true | ||
|   <else_body>                 <else_body> | ||
| end                           br 1  ;; skip if_body | ||
|                             end | ||
|                             <if_body> | ||
|                           end | ||
| ``` | ||
|
|
||
| The condition value is carried as an extra block input and consumed by the | ||
| conditional break at the top. | ||
|
|
||
| ### Return to Br | ||
|
|
||
| WASM `return` is converted to a `br` targeting the outermost block (the | ||
| function body). This makes the function body just another block, simplifying | ||
| break handling. | ||
|
|
||
| ``` | ||
| ;; Original               ;; Normalized | ||
| return                    br <function_depth> | ||
| ``` | ||
|
|
||
| ### Explicit Fallthrough Breaks | ||
|
|
||
| Every block that can fall through gets an explicit `br 0` appended. This | ||
| guarantees that all blocks are exited via a break instruction, which simplifies | ||
| the locals data flow pass (it can assume all values leave blocks through break | ||
| inputs). | ||
|
|
||
| ``` | ||
| ;; Original               ;; Normalized | ||
| block                     block | ||
|   i32.const 42              i32.const 42 | ||
| end                         br 0  ;; explicit fallthrough | ||
|                           end | ||
| ``` | ||
|
|
||
| ### Loop Wrapping | ||
|
|
||
| When a loop can fall through (i.e., it doesn't always branch back to the loop | ||
| header or exit via a break), an outer block is added around it. The fallthrough | ||
| becomes a break to the outer block. This ensures loops are only exited through | ||
| breaks. | ||
|
|
||
| ``` | ||
| ;; Original               ;; Normalized | ||
| loop                      block -> (results...) | ||
|   <body>                    loop (params...) | ||
| end                           <body> | ||
|                               br 1  ;; exit to outer block | ||
|                             end | ||
|                           end | ||
| ``` | ||
|
|
||
| ### Dead Code Removal | ||
|
|
||
| After any instruction that unconditionally diverts control flow (`br`, | ||
| `br_table`, `unreachable`, or a non-fallthrough loop), all subsequent | ||
| instructions up to the next `end` or `else` are discarded. | ||
|
|
||
| ``` | ||
| ;; Original               ;; Normalized | ||
| br 0                      br 0 | ||
| i32.const 1               ;; dead code removed | ||
| i32.add                   ;; dead code removed | ||
| ``` | ||
|
|
||
| ### Constant Global Inlining | ||
|
|
||
| `global.get` on immutable globals is replaced with the global's constant | ||
| initializer. This is done early because it enables the downstream constant | ||
| optimization passes to work with these values. | ||
|
|
||
| ``` | ||
| ;; Original (global 0 is immutable, initialized to 42) | ||
| global.get 0 ;; Normalized: i32.const 42 | ||
| ``` | ||
|
|
||
| ## Output Structure | ||
|
|
||
| The output `BlockTree` is a `Vec<Element>` where each `Element` is either: | ||
|
|
||
| - **`Instruction`**: A WASM operator, a `BrIfZero`, or a `BrTable`. | ||
| - **`Block`**: A nested block containing: | ||
|   - `block_kind`: `Block` or `Loop` | ||
|   - `interface_type`: The block's input and output types | ||
|   - `elements`: The block's contents (recursively) | ||
|   - `input_locals`, `output_locals`, `carried_locals`: Initially empty; filled | ||
|     by the next pass | ||
|
|
||
| At this stage, all blocks have well-defined stack-level interfaces (params and | ||
| results), but local variable flow is still implicit. |
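
The dead code removal rule from the Block Tree document above can be sketched on a flat operator stream. This is an illustration under assumptions, not the code in `block_tree.rs`: the `Op` enum and `strip_dead_code` are hypothetical, and only `br` and `unreachable` are treated as diverting here (the document also lists `br_table` and non-fallthrough loops). The sketch tracks blocks opened inside the dead region so that their `end` is not mistaken for the end of the enclosing block.

```rust
/// A tiny, hypothetical subset of WASM operators.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Op {
    Block,
    Loop,
    If,
    Else,
    End,
    Br(u32),
    Unreachable,
    I32Const(i32),
    I32Add,
}

/// Drops every operator that follows an unconditional control-flow
/// diversion, up to the `end` or `else` that closes the current block.
pub fn strip_dead_code(ops: &[Op]) -> Vec<Op> {
    let mut out = Vec::with_capacity(ops.len());
    // `Some(n)` while skipping dead code; `n` counts the blocks that were
    // opened inside the dead region and not yet closed.
    let mut dead: Option<u32> = None;
    for op in ops {
        match dead {
            None => {
                out.push(op.clone());
                if matches!(op, Op::Br(_) | Op::Unreachable) {
                    dead = Some(0);
                }
            }
            Some(nested) => match op {
                Op::Block | Op::Loop | Op::If => dead = Some(nested + 1),
                Op::End if nested > 0 => dead = Some(nested - 1),
                Op::End | Op::Else if nested == 0 => {
                    // The delimiter of the live enclosing block is kept.
                    out.push(op.clone());
                    dead = None;
                }
                _ => {} // dead instruction: discard
            },
        }
    }
    out
}
```

Tracking the nesting count keeps the sketch correct when a whole block appears after a `br`: the inner `end` only decrements the counter, and the outer `end` is the one that is preserved.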
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,73 @@ | ||
| # Constant Collapse Pass | ||
|
|
||
| **Source:** `dag/const_collapse.rs` | ||
|
|
||
| **Input:** `PlainDag` (the DAG after construction) | ||
| **Output:** `ConstCollapsedDag` (same DAG, with some constant references replaced by inline constants) | ||
|
|
||
| ## Purpose | ||
|
|
||
| This optional optimization pass identifies constant values that can be folded | ||
| into the instructions that consume them, eliminating the need for a separate | ||
| register to hold the constant. This is driven by the target ISA: if the ISA | ||
| supports immediate operands on certain instructions (e.g., RISC-V's `addi`), | ||
| the constant can be inlined directly. | ||
|
|
||
| ## How It Works | ||
|
|
||
| The pass is gated by `Settings::get_const_collapse_processor()`. If the ISA | ||
| implementor returns `None`, no collapsing is performed and the DAG passes | ||
| through unchanged. | ||
|
|
||
| If a processor function is provided, the pass walks every `WASMOp` node in the | ||
| DAG and checks whether any of its inputs reference constant nodes. For each | ||
| such node, it calls the processor with the operator and a slice of | ||
| `MaybeConstant` values describing each input: | ||
|
|
||
| - **`NonConstant`**: The input is not a constant. | ||
| - **`ReferenceConstant { value, must_collapse }`**: The input references a | ||
| constant node with a known value. The processor can set `must_collapse` to | ||
| `true` to indicate the constant should be inlined. | ||
| - **`CollapsedConstant(value)`**: The input is already an inline constant | ||
| (from a previous pass; not expected in the default pipeline). | ||
|
|
||
| When `must_collapse` is set to `true`, the pass replaces the `NodeInput::Reference` | ||
| with a `NodeInput::Constant`, severing the dependency on the constant node. | ||
|
|
||
| ## Example | ||
|
|
||
| Before collapse: | ||
| ``` | ||
| Node 0: Inputs → [x] | ||
| Node 1: i32.const 5 → [5] | ||
| Node 2: i32.add ← [(0,0), (1,0)] → [result] | ||
| ``` | ||
|
|
||
| If the ISA processor recognizes that `i32.add` with a constant second operand | ||
| can become an "add immediate" instruction, it sets `must_collapse = true` for | ||
| input 1. After collapse: | ||
|
|
||
| ``` | ||
| Node 0: Inputs → [x] | ||
| Node 1: i32.const 5 → [5] (may now be unused) | ||
| Node 2: i32.add ← [(0,0), Constant(5)] → [result] | ||
| ``` | ||
|
|
||
| Node 1 is now potentially dangling (no references to it). The dangling removal | ||
| pass will clean it up later. | ||
|
|
||
| ## Recursion Into Blocks | ||
|
|
||
| The pass recurses into block sub-DAGs. For non-loop blocks, it propagates | ||
| knowledge of which block inputs are constants, so that constants flowing through | ||
| block boundaries can also be collapsed inside the block. | ||
|
|
||
| For loops, constant inputs are **not** propagated, because a loop input that is | ||
| constant on entry may take a different value on later iterations: a break back | ||
| to the loop header supplies new values for the loop inputs. In practice, | ||
| optimized WASM rarely has constant loop inputs anyway. | ||
|
|
||
| ## Statistics | ||
|
|
||
| The pass returns the total count of collapsed constants, which is aggregated in | ||
| `Statistics::constants_collapsed`. |
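
The processor protocol from the Constant Collapse document above can be sketched as follows. The variant names `NonConstant`, `ReferenceConstant`, `CollapsedConstant`, `NodeInput::Reference` and `NodeInput::Constant` come from the document; everything else is an assumption made for illustration, including the field layouts, the `Op` enum, the `collapse_inputs` helper, the processor signature, and the example rule that an `i32.add` may inline a second operand that fits a signed 12-bit immediate (as RISC-V `addi` does).

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Op {
    I32Add,
    I32Mul,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum MaybeConstant {
    NonConstant,
    ReferenceConstant { value: i32, must_collapse: bool },
    CollapsedConstant(i32),
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum NodeInput {
    Reference { node: usize, output: usize },
    Constant(i32),
}

/// Hypothetical ISA rule: inline the second operand of `i32.add` when it
/// fits a signed 12-bit immediate.
pub fn add_immediate_processor(op: Op, inputs: &mut [MaybeConstant]) {
    if op != Op::I32Add {
        return;
    }
    if let Some(MaybeConstant::ReferenceConstant { value, must_collapse }) = inputs.get_mut(1) {
        let v = *value;
        if (-2048..=2047).contains(&v) {
            *must_collapse = true;
        }
    }
}

/// Describes each input to the processor, then rewrites the inputs it
/// flagged. `constants` maps a `(node, output)` origin to its known value.
/// Returns the number of inputs that were collapsed.
pub fn collapse_inputs(
    op: Op,
    inputs: &mut [NodeInput],
    constants: &HashMap<(usize, usize), i32>,
    processor: fn(Op, &mut [MaybeConstant]),
) -> usize {
    let mut view: Vec<MaybeConstant> = inputs
        .iter()
        .map(|input| match input {
            NodeInput::Reference { node, output } => match constants.get(&(*node, *output)) {
                Some(&value) => MaybeConstant::ReferenceConstant { value, must_collapse: false },
                None => MaybeConstant::NonConstant,
            },
            NodeInput::Constant(v) => MaybeConstant::CollapsedConstant(*v),
        })
        .collect();
    processor(op, &mut view);
    let mut collapsed = 0;
    for (input, described) in inputs.iter_mut().zip(&view) {
        if let MaybeConstant::ReferenceConstant { value, must_collapse: true } = described {
            // Sever the dependency on the constant node.
            *input = NodeInput::Constant(*value);
            collapsed += 1;
        }
    }
    collapsed
}
```

Applied to the example in the document (`i32.add` reading `(0,0)` and the constant at `(1,0)` with value 5), this sketch rewrites the second input to `Constant(5)` and reports one collapsed constant; the same inputs under `I32Mul` are left untouched.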
Don't understand this paragraph. I suggest removing it.