feat(engine): Phase 11.9 WAL log-record durability + crash recovery (SQLR-22) #130
Merged
Pre-11.9, BEGIN CONCURRENT commits persisted *table state* through the
legacy save_database mirror, but MvStore's version index was reborn
empty on every reopen. That's correct for single-session workloads but
breaks down once cross-process MVCC matters — a second process could
hand out a begin_ts below an already-committed version's end and the
visibility rule would miscategorise one side.
11.9 closes that gap by adding a typed MVCC log-record frame on top of
the existing per-page WAL frames. The MVCC frame is appended before
the legacy save's commit-frame fsync, so a single fsync covers both:
a crash either keeps both writes or loses both — torn-write atomicity
for the whole transaction. On reopen, the WAL replay decodes every
MVCC frame and re-pushes the row versions into MvStore, then seeds
MvccClock past max(header.clock_high_water, max(commit_ts among
replayed batches)) so post-restart transactions can never regress.
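The clock-seeding rule above can be sketched in a few lines. This is a minimal illustration, not the engine's code: the function names and the u64 timestamp type are assumptions, and "past" is read here as "strictly greater than".

```rust
/// Hypothetical helper: the largest timestamp known to be in use before the
/// restart, taken from the WAL header and from every replayed MVCC batch.
fn seed_floor(header_clock_high_water: u64, replayed_commit_ts: &[u64]) -> u64 {
    // An empty replay set contributes nothing, so fall back to 0.
    let max_replayed = replayed_commit_ts.iter().copied().max().unwrap_or(0);
    header_clock_high_water.max(max_replayed)
}

/// Hypothetical helper: the first timestamp the clock may hand out after
/// reopen. It is strictly above the floor, so it cannot collide with or fall
/// below any timestamp issued before the restart.
fn first_ts_after_reopen(header_clock_high_water: u64, replayed_commit_ts: &[u64]) -> u64 {
    // Overflow at u64::MAX is ignored in this sketch.
    seed_floor(header_clock_high_water, replayed_commit_ts) + 1
}
```

Taking the maximum of both sources matters because either one can lag: the header value may predate the last replayed commit, and the replayed batches say nothing about timestamps already recorded in the header.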
Engine changes:
- src/mvcc/log.rs: MvccCommitBatch + MvccLogRecord types and codec
("MVCC0001" magic + commit_ts + record stream, fits one frame body).
- src/sql/pager/wal.rs: WAL_FORMAT_VERSION 2 → 3; MVCC_FRAME_MARKER =
u32::MAX as the page-number discriminator; replay branches the
frame stream into pending_mvcc that promotes onto recovered_mvcc on
each commit barrier; Wal::append_mvcc_batch +
Wal::recovered_mvcc_commits accessors.
- src/sql/pager/pager.rs: Pager proxies (append_mvcc_batch,
recovered_mvcc_commits, clock_high_water, observe_clock_high_water).
- src/sql/pager/mod.rs: replay_mvcc_into_db drains recovered batches
into Database::mv_store and seeds MvccClock at open time.
- src/connection.rs: commit_concurrent encodes the resolved write-set
into an MvccCommitBatch, appends it pre-save_database, and bumps the
in-memory WAL header's clock_high_water; six new tests cover
round-trip persistence, multi-row batches, ROLLBACK-no-frame,
legacy-commit-no-frame, multi-commit replay after unclean close,
and clock seeding past the last commit_ts.
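To make the codec bullet concrete, here is a hedged sketch of a "magic + commit_ts + record stream" body. Only the "MVCC0001" magic and the presence of commit_ts come from the description above; the little-endian integers, the record count, and the per-record layout (row_id, payload length, payload) are assumptions for illustration, and the type names are simplified stand-ins for MvccCommitBatch and MvccLogRecord.

```rust
const MAGIC: &[u8; 8] = b"MVCC0001";

#[derive(Debug, Clone, PartialEq, Eq)]
struct Record {
    row_id: u64,      // assumed field
    payload: Vec<u8>, // assumed field: encoded row version
}

#[derive(Debug, Clone, PartialEq, Eq)]
struct CommitBatch {
    commit_ts: u64,
    records: Vec<Record>,
}

/// Encode as: magic | commit_ts (u64 LE) | count (u32 LE) | records.
fn encode(batch: &CommitBatch) -> Vec<u8> {
    let mut out = Vec::new();
    out.extend_from_slice(MAGIC);
    out.extend_from_slice(&batch.commit_ts.to_le_bytes());
    out.extend_from_slice(&(batch.records.len() as u32).to_le_bytes());
    for r in &batch.records {
        out.extend_from_slice(&r.row_id.to_le_bytes());
        out.extend_from_slice(&(r.payload.len() as u32).to_le_bytes());
        out.extend_from_slice(&r.payload);
    }
    out
}

/// Split `n` bytes off the front of `buf`, or return None if it is too short.
fn take<'a>(buf: &mut &'a [u8], n: usize) -> Option<&'a [u8]> {
    let cur: &'a [u8] = *buf;
    if cur.len() < n {
        return None;
    }
    let (head, tail) = cur.split_at(n);
    *buf = tail;
    Some(head)
}

fn read_u64(buf: &mut &[u8]) -> Option<u64> {
    let mut a = [0u8; 8];
    a.copy_from_slice(take(buf, 8)?);
    Some(u64::from_le_bytes(a))
}

fn read_u32(buf: &mut &[u8]) -> Option<u32> {
    let mut a = [0u8; 4];
    a.copy_from_slice(take(buf, 4)?);
    Some(u32::from_le_bytes(a))
}

/// Decode a frame body. Returns None on a bad magic or a truncated body.
/// Trailing bytes are ignored, assuming the frame body may be padded.
fn decode(mut buf: &[u8]) -> Option<CommitBatch> {
    if take(&mut buf, 8)? != &MAGIC[..] {
        return None;
    }
    let commit_ts = read_u64(&mut buf)?;
    let count = read_u32(&mut buf)?;
    let mut records = Vec::new();
    for _ in 0..count {
        let row_id = read_u64(&mut buf)?;
        let len = read_u32(&mut buf)? as usize;
        let payload = take(&mut buf, len)?.to_vec();
        records.push(Record { row_id, payload });
    }
    Some(CommitBatch { commit_ts, records })
}
```

Checking the magic first lets replay reject a frame that carries the sentinel page number but not a well-formed MVCC body, rather than misreading arbitrary bytes as a commit.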
Docs:
- roadmap.md: Phase 11.9 promoted to shipped; remaining checkpoint-
drain half scoped as a follow-up.
- file-format.md: WAL header v3 row + MVCC log-record body diagram.
- concurrent-writes-plan.md: plan-doc §10.5 annotated with what
shipped vs. what's parked.
- design-decisions.md: new §12g — MVCC commits piggyback the legacy
fsync; sentinel choice; clock-seed correctness argument.
- _index.md: phase-summary bullet refreshed.
Workspace: 599/599 Rust tests pass (was 587, +12 = 6 codec + 6
durability). fmt + clippy + doc all clean.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
- Pre-11.9, BEGIN CONCURRENT commits durably persisted table state through the legacy save_database mirror, but MvStore's version index was reborn empty on every reopen. That is fine for single-session workloads, but it breaks the visibility rule once cross-process MVCC is in scope.
- 11.9 adds a typed MvccCommitBatch WAL frame (sentinel page_num = u32::MAX) appended before the legacy save's commit-frame fsync, so one fsync covers both writes: torn-write atomicity for the whole transaction.
- On reopen, WAL replay decodes every MVCC frame and re-pushes the row versions into MvStore, then seeds MvccClock past max(header.clock_high_water, max(commit_ts among replayed batches)) so post-restart transactions can never regress.
Architecture
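The replay behaviour described in the summary (MVCC frames collected in a pending list and promoted only at a commit barrier) can be sketched as follows. This is an illustration under stated assumptions, not the code in wal.rs: the Frame struct, its is_commit flag, and the function name are hypothetical, and only MVCC_FRAME_MARKER = u32::MAX comes from the PR.

```rust
/// Sentinel page number that marks a frame as an MVCC log-record frame.
const MVCC_FRAME_MARKER: u32 = u32::MAX;

/// Hypothetical, simplified WAL frame.
struct Frame {
    page_num: u32,
    is_commit: bool, // assumed: true on the frame that closes a transaction
    body: Vec<u8>,
}

/// Return the MVCC frame bodies that belong to committed transactions.
fn replay_mvcc(frames: &[Frame]) -> Vec<Vec<u8>> {
    let mut pending = Vec::new();
    let mut recovered = Vec::new();
    for f in frames {
        if f.page_num == MVCC_FRAME_MARKER {
            // Not durable yet: hold it until a commit barrier is seen.
            pending.push(f.body.clone());
        }
        if f.is_commit {
            // Commit barrier: everything pending is now part of a durable
            // transaction. `append` moves the items and empties `pending`.
            recovered.append(&mut pending);
        }
    }
    // Anything still pending was written after the last commit barrier,
    // for example by a crash before the commit-frame fsync, and is dropped.
    recovered
}
```

Because the MVCC frame is appended before the commit frame, a torn tail can only ever leave an MVCC frame without its barrier, never a barrier without its MVCC frame, which is why dropping the unpromoted tail is safe.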
Changes
- src/mvcc/log.rs: MvccCommitBatch + MvccLogRecord types and codec (MVCC0001 magic + commit_ts + record stream, fits one frame body). 6 codec tests.
- src/sql/pager/wal.rs: WAL_FORMAT_VERSION 2 → 3; MVCC_FRAME_MARKER = u32::MAX as the page-number discriminator; replay branches the frame stream into a pending_mvcc list that promotes onto recovered_mvcc on each commit barrier; Wal::append_mvcc_batch + Wal::recovered_mvcc_commits accessors.
- src/sql/pager/pager.rs: Pager proxies (append_mvcc_batch, recovered_mvcc_commits, clock_high_water, observe_clock_high_water).
- src/sql/pager/mod.rs: replay_mvcc_into_db drains recovered batches into Database::mv_store and seeds MvccClock at open time.
- src/connection.rs: commit_concurrent encodes the resolved write-set into an MvccCommitBatch, appends it pre-save_database, and bumps the in-memory WAL header's clock_high_water. 6 durability tests cover round-trip persistence, multi-row batches, ROLLBACK-no-frame, legacy-commit-no-frame, multi-commit replay after unclean close, and clock seeding past the last commit_ts.
Docs
- docs/roadmap.md: Phase 11.9 promoted to shipped; remaining checkpoint-drain half scoped as a follow-up.
- docs/file-format.md: WAL header v3 row + MVCC log-record body diagram + frame-marker convention.
- docs/concurrent-writes-plan.md: plan-doc §10.5 annotated with what shipped vs. what's parked.
- docs/design-decisions.md: new §12g covering the piggyback-fsync rationale, sentinel choice, and clock-seed correctness argument.
- docs/_index.md: phase-summary bullet refreshed.
What 11.9 deferred
The other half of plan-doc §10.5 stays parked: extending the checkpointer to drain MVCC log records into pager-level updates, and re-enabling the Mvcc → Wal set_journal_mode downgrade. Durability on the read path already works through the legacy save_database mirror, so this gap is foundation work for cross-process MVCC, not a correctness regression. The existing per-commit GC bounds in-memory chain growth.
Test plan
- cargo build --workspace --exclude sqlrite-desktop --exclude sqlrite-python --exclude sqlrite-nodejs --exclude sqlrite-benchmarks --all-targets
- cargo test --workspace --exclude sqlrite-desktop --exclude sqlrite-python --exclude sqlrite-nodejs --exclude sqlrite-benchmarks: 599 / 599 (was 587, +12 = 6 codec + 6 durability)
- cargo fmt --all -- --check
- cargo clippy --workspace --exclude sqlrite-desktop --exclude sqlrite-python --exclude sqlrite-nodejs --exclude sqlrite-benchmarks --all-targets
- cargo doc --workspace --exclude sqlrite-desktop --exclude sqlrite-python --exclude sqlrite-nodejs --exclude sqlrite-benchmarks --no-deps
🤖 Generated with Claude Code