feat(compile): V2.2 on-disk per-module object cache (follow-up to #131, #132) #134
Closed
TheHypnoo wants to merge 2 commits into PerryTS:main from
Adds `.perry-cache/objects/<target>/<key:016x>.o`, shared across `perry compile` / `perry run` / `perry dev` invocations. Each rayon codegen worker computes a djb2 key from (source hash, every codegen-affecting `CompileOptions` field, perry version) and reuses the cached bytes instead of re-invoking LLVM on unchanged modules.

On a 30-module bench, warm rebuilds drop from ~714 ms → ~509 ms (~29% faster); a single-module edit rebuilds in the same ~512 ms (cost scales with changed modules, not total).

Follows v2.1 (PerryTS#132): v2.1's in-memory AST cache only helps within a single `perry dev` session and didn't pay off against SWC's ~1ms/file parse cost; v2.2 is the real win because it skips the whole LLVM pipeline, not just parsing.

Architecture:
- `ObjectCache` (thread-safe via AtomicUsize counters, file-per-entry so rayon workers don't contend) with atomic tmp-then-rename writes. IO errors are silently counted and degrade to the uncached codepath; the cache is strictly an optimization.
- `compute_object_cache_key` serializes every `CompileOptions` field that affects `compile_module`'s bytes: source hash, target triple, is_entry_module, all import maps/sets (sorted so HashMap iteration order doesn't leak in), imported classes (full signature incl. ids), imported enums, type aliases, enabled features, i18n snapshot, CARGO_PKG_VERSION. Topologically-sorted lists (non_entry_prefixes, native_module_init_names) preserve order so a link-ordering change (the v0.5.127-128 bug class) invalidates consumers.
- Disabled automatically in bitcode-link mode (PERRY_LLVM_BITCODE_LINK=1) since compile_module emits .ll text, not object bytes.

CLI:
- `--no-cache` flag on `perry compile` / `perry run` / `perry dev`.
- `PERRY_NO_CACHE=1` env var (CI-friendly override).
- `PERRY_CACHE_VERBOSE=1` prints `• object cache: H/T hit (M miss, S store, E store-err)`.
- `perry cache info`: cache location, total size, per-target breakdown.
- `perry cache clean`: wipe `.perry-cache/` for the current project.

Tests:
- 15 unit tests in `object_cache_tests` covering key stability across HashMap-insertion-order permutations, key divergence on every invalidation axis (source hash, perry version, target, entry flag, non-entry-prefix order, imported class arity, bitcode mode), disabled-cache no-op semantics, cross-target isolation, and store round-trip.
- `scripts/run_cache_tests.sh` end-to-end smoke: 4-module project (test-files/module-init-order), asserts cold→warm→partial→rewarm hit/miss shapes and that a source edit is never served stale bytes.
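The key derivation described in the commit message above can be sketched roughly as follows. This is a minimal illustration, assuming the classic djb2 recurrence (`hash * 33 + byte`, seed 5381) widened to 64 bits; `Djb2`, `KeyInputs`, `cache_key` and `object_file_name` are hypothetical names, not Perry's actual `CompileOptions` or `compute_object_cache_key`. It shows the two ordering rules: map entries are sorted before hashing, while topologically ordered lists are hashed in their given order.

```rust
use std::collections::HashMap;

/// Streaming djb2 hasher (hypothetical helper, 64-bit wrapping variant).
struct Djb2(u64);

impl Djb2 {
    fn new() -> Self {
        Djb2(5381)
    }

    fn write(&mut self, bytes: &[u8]) {
        for &b in bytes {
            self.0 = self.0.wrapping_mul(33).wrapping_add(u64::from(b));
        }
    }

    /// Length-prefix strings so ("ab", "c") and ("a", "bc") hash differently.
    fn write_str(&mut self, s: &str) {
        self.write(&(s.len() as u64).to_le_bytes());
        self.write(s.as_bytes());
    }
}

/// Hypothetical subset of the inputs that affect a module's object bytes.
struct KeyInputs<'a> {
    source_hash: u64,
    target: &'a str,
    is_entry_module: bool,
    imports: &'a HashMap<String, String>,
    non_entry_prefixes: &'a [String],
    perry_version: &'a str,
}

fn cache_key(inputs: &KeyInputs) -> u64 {
    let mut h = Djb2::new();
    h.write(&inputs.source_hash.to_le_bytes());
    h.write_str(inputs.target);
    h.write(&[inputs.is_entry_module as u8]);

    // Sort map entries so HashMap iteration order cannot leak into the key.
    let mut entries: Vec<(&String, &String)> = inputs.imports.iter().collect();
    entries.sort();
    for (name, path) in entries {
        h.write_str(name);
        h.write_str(path);
    }

    // Order is significant here: a link-ordering change must yield a new key.
    for prefix in inputs.non_entry_prefixes {
        h.write_str(prefix);
    }

    h.write_str(inputs.perry_version);
    h.0
}

/// File name in the `<key:016x>.o` form used by the cache layout.
fn object_file_name(key: u64) -> String {
    format!("{key:016x}.o")
}
```

Sorting only the unordered collections keeps the key stable across runs without hiding the ordering changes that do affect the emitted bytes.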
…S#131 spec

Issue PerryTS#131 asked for an env-gated one-line cache summary using the same format as V2.1's parse cache, alongside it in `perry dev` verbose mode. The V2.2 PR shipped a different env var (`PERRY_CACHE_VERBOSE`), label (`object cache`), and extra fields (store / store-err counts), and printed from compile.rs so the two lines didn't appear together.

Align to the spec:
- Env var: `PERRY_CACHE_VERBOSE` → `PERRY_DEV_VERBOSE` (one lever turns on both parse + codegen cache diagnostics).
- Label: `• object cache: H/T hit (M miss, S store, E store-err)` → `• codegen cache: H/T hit (M miss)` matching parse-cache format.
- Print in dev.rs right after the parse-cache line when `run_with_parse_cache` returns a `Some(codegen_cache_stats)`, so `perry dev` users see both lines together. compile.rs still prints for batch invocations (guarded on `parse_cache.is_none()` to avoid a duplicate line under dev), keeping `perry compile --no-cache=false` observable without having to go through dev.

Plumbing: `CompileResult` gains an optional `codegen_cache_stats: Option<(hits, misses, stores, store_errors)>` tuple, populated from `ObjectCache` on the main success path and on the `--no-link` / `is_dylib` early-return paths; `None` everywhere else (widget/web/wasm helpers that don't touch the codegen cache).

Integration test (`scripts/run_cache_tests.sh`) updated for the new label and env var. All 5 phases pass (baseline / cold / warm / partial / rewarm) and all 15 `object_cache_tests` unit tests still pass.

The over-delivery on scope (cache wired into compile/run/dev, not just dev as the issue scoped it) is preserved; per maintainer feedback, that's fine since `perry run` and CI benefit from the same cache.
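As an illustration of the plumbing described above, here is a small sketch of how an optional stats tuple could be rendered into the `• codegen cache: H/T hit (M miss)` line. `CodegenCacheStats` and `format_codegen_cache_line` are hypothetical names, and treating T as hits plus misses is an assumption inferred from the `3/4 hit (1 miss)` shapes quoted in this PR.

```rust
/// (hits, misses, stores, store_errors), mirroring the tuple shape in the commit message.
type CodegenCacheStats = (usize, usize, usize, usize);

/// Render the one-line summary, or `None` when the code path never touched the cache.
fn format_codegen_cache_line(stats: Option<CodegenCacheStats>) -> Option<String> {
    // Store and store-error counts are carried in the tuple but not printed.
    let (hits, misses, _stores, _store_errors) = stats?;
    let total = hits + misses; // assumption: T = hits + misses
    Some(format!("• codegen cache: {hits}/{total} hit ({misses} miss)"))
}
```

Returning `None` for the widget/web/wasm-style paths lets the caller skip the line entirely instead of printing a misleading `0/0`.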
proggeramlug added a commit that referenced this pull request Apr 22, 2026
Three post-review fixes on top of PR #134's V2.2 object cache (hypnoo):

1. Hash PERRY_DEBUG_INIT, PERRY_DEBUG_SYMBOLS, PERRY_LLVM_CLANG into the cache key. These env vars alter compile_module output bytes (debug puts in module init, DWARF sections in .o, clang binary selection) but weren't captured, so running once with PERRY_DEBUG_SYMBOLS=1 would silently poison the cache for subsequent default-env runs. Values (not presence) are hashed so persistent overrides like PERRY_LLVM_CLANG=/opt/llvm/bin/clang in a shell rc still get reuse.
2. Add .perry-cache/ to .gitignore. The cache bakes in host CPU features via clang's -mcpu=native / -march=native, so committing it would ship .o files that SIGILL on other developers' machines.
3. Fix run_cache_tests.sh portability: replace the ([0-9]+)/\1 backref (not supported by BSD grep -E on macOS) with [0-9]+/[0-9]+. Same semantics: "miss count == 0" is what "all hits" actually means.

Tests: 16/16 pass (15 original + new key_changes_with_codegen_env_vars). Integration smoke: cold 0/4 → warm 4/4 → partial 3/4 → rewarm 4/4. End-to-end manual: single-file cold/warm/--no-cache/PERRY_NO_CACHE=1, multi-module cold/warm/edit/rewarm, env-var-flip invalidation, cache info/clean subcommands; all green.
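Fix 1 above can be sketched as follows: the values of the three env vars are folded into the running key, so an unset variable, `=1`, and `=2` all produce different keys, while a persistent override produces the same key on every run. `djb2_mix` and `hash_codegen_env` are hypothetical names, and the tag-byte encoding is an assumption of this sketch rather than Perry's actual scheme. The lookup is injected so the example does not depend on the process environment.

```rust
/// Env vars named in the commit message as affecting `compile_module` output.
const CODEGEN_ENV_VARS: [&str; 3] = ["PERRY_DEBUG_INIT", "PERRY_DEBUG_SYMBOLS", "PERRY_LLVM_CLANG"];

/// One djb2 step per byte (64-bit wrapping variant).
fn djb2_mix(mut h: u64, bytes: &[u8]) -> u64 {
    for &b in bytes {
        h = h.wrapping_mul(33).wrapping_add(u64::from(b));
    }
    h
}

/// Fold env var values (not mere presence) into a running hash.
fn hash_codegen_env(seed: u64, lookup: impl Fn(&str) -> Option<String>) -> u64 {
    let mut h = seed;
    for name in CODEGEN_ENV_VARS {
        h = djb2_mix(h, name.as_bytes());
        match lookup(name) {
            // Tag byte distinguishes "unset" from "set to the empty string".
            None => h = djb2_mix(h, &[0u8]),
            Some(value) => {
                h = djb2_mix(h, &[1u8]);
                h = djb2_mix(h, &(value.len() as u64).to_le_bytes());
                h = djb2_mix(h, value.as_bytes());
            }
        }
    }
    h
}
```

In real use the lookup would presumably be `|name| std::env::var(name).ok()`.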
Merged to main as fast-forward at v0.5.160 (commits …). Thanks @TheHypnoo! Folded in at merge time per CONTRIBUTING: …

16/16 unit tests pass, integration smoke passes, manual end-to-end (single-file, multi-module, cache info/clean, env-var-flip invalidation) all green.
Summary
Implements the V2.2 on-disk object cache scoped in #131 and follows V2.1's in-memory AST cache from #132. Each `.ts` module's compiled `.o` bytes now land at `.perry-cache/objects/<target>/<key:016x>.o`, shared across every `perry compile` / `perry run` / `perry dev` invocation on that project.

The real perf win lives here, not in V2.1. On a 30-module synthetic project on my M1:

[Benchmark table: only the `--no-cache` baseline row label survives]

Cost scales with changed modules, not total; the `perry dev` watch loop is the primary beneficiary.

Design
- `ObjectCache` (thread-safe via `AtomicUsize` counters, one file per entry so rayon workers don't contend) uses atomic tmp-then-rename writes. IO errors are counted and the codepath degrades to uncached; the cache is strictly an optimization, never a correctness dependency.
- `compute_object_cache_key` is a streaming djb2 (mirroring `build_optimized_libs`'s prior-art pattern) over every `CompileOptions` field that affects `compile_module`'s output bytes:
  - source hash (from `collect_modules`, stored on `CompilationContext::module_source_hashes`; no second file read)
  - `is_entry_module`, `output_type`, feature flags (`needs_stdlib` / `needs_ui` / `needs_geisterhand` / `needs_js_runtime`)
  - topologically-sorted lists (`non_entry_module_prefixes`, `native_module_init_names`) preserve order so a link-ordering change (v0.5.127-128 bug class) invalidates consumers; this is the acceptance criterion @ralphhempel called out in V2 watch mode: incremental compilation cache - scoping (follow-up to #126) #131
- Disabled automatically in bitcode-link mode (`PERRY_LLVM_BITCODE_LINK=1`) since `compile_module` emits `.ll` text, not object bytes.

CLI surface
- `--no-cache` flag on `compile` / `run` / `dev`
- `PERRY_NO_CACHE=1` env var (CI-friendly override)
- `PERRY_DEV_VERBOSE=1` prints `• codegen cache: H/T hit (M miss)` per build (same env var V2.1 uses for `parse cache:`, so one lever turns on both lines in `perry dev`)
- `perry cache info`: location, total size, per-target breakdown
- `perry cache clean`: wipe `.perry-cache/` for the current project

Deviations from issue #131 scoping
Two worth calling out so the maintainer can veto if desired:
- The cache is wired into `perry compile` / `perry run` / `perry dev` all at once, not just dev as the issue scoped it. Rationale: the cache has no dev-specific coupling (pure `CompileOptions + source_hash` key), and `perry run` / CI batch builds benefit from the same warm-hit path. Happy to gate behind a dev-only flag in a follow-up if preferred.
- `CompileResult` gained an optional `codegen_cache_stats: Option<(hits, misses, stores, store_errors)>` so `dev.rs` can print the `codegen cache:` line alongside `parse cache:` after `run_with_parse_cache` returns. Widget/web/wasm helper returns get `None` (they never touch the codegen cache); the three codegen paths (`--no-link`, `is_dylib`, main success) populate stats from `ObjectCache`.

Tests
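15 unit tests in `object_cache_tests` (all passing):
- key stability across HashMap-insertion-order permutations
- key divergence on every invalidation axis (source hash, perry version, target, entry flag, non-entry-prefix order, imported class arity, bitcode mode)
- disabled-cache no-op semantics
- cross-target isolation (`target-a` and `target-b` keyed separately)
- store round-trip

Integration test (`scripts/run_cache_tests.sh`) on `test-files/module-init-order/` (the real multi-module fixture that exercises topological init order):
- Baseline: `--no-cache` → records expected output
- Cold: `0/4 hit (4 miss)`, output matches baseline
- Warm: `4/4 hit (0 miss)`, output matches baseline
- Partial: `3/4 hit (1 miss)`, output reflects the edit (the stale-bytes anti-regression)
- Rewarm: `4/4 hit` returns to full-hit state
- `perry cache info` / `perry cache clean` smoke-tested at the end

What does NOT ship in this PR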
Per CONTRIBUTING.md: no `[workspace.package] version` bump, no `**Current Version:**` edit on `CLAUDE.md`, no "Recent Changes" entry; maintainer folds those in at merge time.

Test plan
Closes (partial) #131.
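For reference, the tmp-then-rename store described under Design can be sketched as below. This is a minimal illustration under stated assumptions, not Perry's `ObjectCache`: `entry_path`, `store`, and `lookup` are hypothetical names, the pid-based temp suffix is an assumption, and a real implementation would also need temp-name uniqueness across threads within one process.

```rust
use std::fs;
use std::io::Write;
use std::path::{Path, PathBuf};

/// Path of a cache entry under the layout `<root>/objects/<target>/<key:016x>.o`.
fn entry_path(root: &Path, target: &str, key: u64) -> PathBuf {
    root.join("objects").join(target).join(format!("{key:016x}.o"))
}

/// Write to a temp file, then rename, so readers never observe a partial entry.
/// Returns false on any IO error; a caller would count it and compile uncached.
fn store(root: &Path, target: &str, key: u64, bytes: &[u8]) -> bool {
    let path = entry_path(root, target, key);
    let dir = match path.parent() {
        Some(dir) => dir,
        None => return false,
    };
    // Hypothetical temp naming: the pid keeps concurrent processes apart.
    let tmp = dir.join(format!("{key:016x}.o.tmp.{}", std::process::id()));
    let result = fs::create_dir_all(dir)
        .and_then(|_| fs::File::create(&tmp))
        .and_then(|mut file| file.write_all(bytes))
        .and_then(|_| fs::rename(&tmp, &path));
    if result.is_err() {
        let _ = fs::remove_file(&tmp); // best-effort cleanup of the temp file
    }
    result.is_ok()
}

/// Any IO error, including "not found", is treated as a cache miss.
fn lookup(root: &Path, target: &str, key: u64) -> Option<Vec<u8>> {
    fs::read(entry_path(root, target, key)).ok()
}
```

Because every failure collapses to `false` or `None`, the cache stays an optimization: the worst case is a recompile, never stale or truncated bytes from this write path.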