fix(modes): enforce rollback-on-failure with an undo-stack guard#19
Merged
Conversation
bring_up cleaned up after failures with hand-rolled blocks repeated at each failure site, and the sites had drifted: spawn_services failures rolled back nothing, a failed compose group left earlier groups' containers running while deleting the overlay files teardown needs, pre_spawn failures rolled back in container mode but not hybrid/host, and post_up failures on a resumed session deleted the session's data volumes. Each provisioning step now registers its inverse with a Rollback guard immediately after it succeeds; any early return unwinds the stack in reverse order (stop containers before deleting their overlay, kill spawned services, remove a freshly created worktree). disarm() keeps everything once the session is fully up. Rollback only removes volumes for sessions it created (not on resume), and reused worktrees are never deleted by rollback. Fixes #2 https://claude.ai/code/session_017UcuvzMKHVfyBCcq8ipAko
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Fixes #2
Problem
"Rollback on failure" was enforced by hand-rolled cleanup blocks copy-pasted at each failure site in the three
bring_upimplementations, and the sites had drifted apart:spawn_servicesfailure: no rollback at all (bare?) — containers stayed up, worktree stayed, nothing in state to find them withpre_spawnhook failure: full rollback in container mode, none in hybrid/hostpost_upfailure on a resumed session: rollback rancompose downwithremove_volumes=true, deleting the session's data volumessymlink_env_files/write_env_file/ native port-allocation failures after containers were up: no rollbackFix
New
rollback::Rollbackdrop-guard: every provisioning step pushes its inverse immediately after succeeding; any early return (?) unwinds the stack in reverse order on drop.disarm()once the session is fully up.Ordering and data-safety details:
rollback_volumes = !reuse_worktree: rollback only removes volumes for sessions thisupcreated — a failed re-up of an existing session keeps its dataif/elsearoundspawn_services(both arms made the identical call; only logging differed) is collapsed in host/hybridAll ~6 manual cleanup blocks are deleted; every failure path now goes through the same guard.
Tests
rollback.rs: LIFO order, disarm semantics, push-after-disarm, empty guardhost.rs(end-to-end through realbring_upwith failing hooks):failed_post_up_hook_rolls_back_fresh_worktreefailed_pre_spawn_hook_rolls_back_fresh_worktree— this path previously had no cleanup in host modefailed_post_up_hook_keeps_reused_worktreesuccessful_bring_up_keeps_worktreeDocker-dependent rollback paths (container/hybrid compose groups) use the identical guard mechanism; end-to-end docker coverage is tracked separately in #16.
cargo fmt --check,cargo clippy -- -D warnings(CI invocation),cargo test(377 + 18) green. Note:clippy --all-targetsreports 5 pre-existingitems_after_test_moduleerrors that exist on main as well — untouched here, related to the test-layout note in #16.https://claude.ai/code/session_017UcuvzMKHVfyBCcq8ipAko
Generated by Claude Code