fix(ssh): SaveSSHKeyPair no longer removes the shared key directory#4988
Open
fix(ssh): SaveSSHKeyPair no longer removes the shared key directory#4988
Conversation
The previous implementation called os.RemoveAll on the parent directory (~/.databricks/ssh-tunnel-keys/) before writing new keys. This would silently delete keys belonging to other concurrent sessions on different clusters, causing their authentication to fail. Fix: drop the RemoveAll and let os.WriteFile create or overwrite the session-specific files directly. Add tests covering fresh creation, overwrite, directory auto-creation, and isolation between sessions. Co-authored-by: Isaac
pietern
approved these changes
Apr 16, 2026
anton-107
approved these changes
Apr 16, 2026
4 tasks
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
SaveSSHKeyPairwas callingos.RemoveAllon~/.databricks/ssh-tunnel-keys/(the entire shared directory) before writing new keys, which deleted keys belonging
to all other concurrent sessions and broke their SSH authentication.
RemoveAlland letos.WriteFilecreate or overwrite the twosession-specific files directly.
a missing directory, and isolation between concurrent sessions.
Root cause
Initially suspected to be a race between the ProxyCommand writing keys and OpenSSH
reading them. Investigation of the OpenSSH source (ssh.c, sshconnect.c) showed that
ssh_connect() returns immediately after fork+pipe, so load_public_identity_files()
can run concurrently with the proxy — but this is benign, since SSH falls back to
loading the private key during ssh_userauth2(), which only runs after actual I/O
flows on the pipe (by which point the proxy has finished writing). The real bug was
always the RemoveAll nuking the shared directory.
Test plan
go test ./experimental/ssh/internal/keys/...This pull request was AI-assisted by Isaac.