fix: preserve session config on multi-tunnel reuse#229
Conversation
Multi-tunnel reuse sessions return minimal responses without ApiUrl, OtlpUrl, HostMapping, etc. Previously these empty values overwrote the good config from the primary session, causing the hadron to lose its API URL and fail to download deployment code. Now only non-empty values update the stored config. Added debug logging for session config response and effective values.
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthrough
Changes
🚥 Pre-merge checks | ✅ 1✅ Passed checks (1 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
gravity/grpc_client.go (1)
2582-2583: Use debug level for session-config dumps.These logs are emitted at
Infoand include stable identifiers/URLs. Since this is diagnostic merge tracing,Debugis safer and less noisy in production.💡 Proposed change
-g.logger.Info("[session-config] response: apiUrl=%q otlpUrl=%q env=%d hostMappings=%d machineId=%s server=%s", +g.logger.Debug("[session-config] response: apiUrl=%q otlpUrl=%q env=%d hostMappings=%d machineId=%s server=%s", response.ApiUrl, response.OtlpUrl, len(response.Environment), len(response.HostMapping), response.MachineId, response.GravityServer) -g.logger.Info("[session-config] effective: apiUrl=%q otlpUrl=%q hostMappings=%d", g.apiURL, g.otlpURL, len(g.hostMapping)) +g.logger.Debug("[session-config] effective: apiUrl=%q otlpUrl=%q hostMappings=%d", g.apiURL, g.otlpURL, len(g.hostMapping))Also applies to: 2599-2599
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@gravity/grpc_client.go` around lines 2582 - 2583, The session-config diagnostic log is using g.logger.Info and should be lowered to Debug to avoid noisy production logs; change the calls that format the response (the g.logger.Info call that logs "response: apiUrl=%q otlpUrl=%q env=%d hostMappings=%d machineId=%s server=%s" with response.ApiUrl, response.OtlpUrl, len(response.Environment), len(response.HostMapping), response.MachineId, response.GravityServer) and the similar logger invocation around the same session-config handling (the other g.logger.Info at the later occurrence) to g.logger.Debug so these diagnostic URL/ID dumps are only emitted at debug level.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@gravity/grpc_client.go`:
- Around line 2587-2589: The code saves the OTLP token to g.otlpToken but
provider.Configure still receives response.OtlpKey, which can overwrite a
preserved token with an empty value; modify the call site in the Configure flow
(where provider.Configure is invoked) to pass the effective token (e.g., use
response.OtlpKey if non-empty, otherwise g.otlpToken) instead of raw
response.OtlpKey so preserved tokens are reused when responses omit the key.
---
Nitpick comments:
In `@gravity/grpc_client.go`:
- Around line 2582-2583: The session-config diagnostic log is using
g.logger.Info and should be lowered to Debug to avoid noisy production logs;
change the calls that format the response (the g.logger.Info call that logs
"response: apiUrl=%q otlpUrl=%q env=%d hostMappings=%d machineId=%s server=%s"
with response.ApiUrl, response.OtlpUrl, len(response.Environment),
len(response.HostMapping), response.MachineId, response.GravityServer) and the
similar logger invocation around the same session-config handling (the other
g.logger.Info at the later occurrence) to g.logger.Debug so these diagnostic
URL/ID dumps are only emitted at debug level.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 87e8dab3-b6a3-4604-a316-fdc28176809d
📒 Files selected for processing (1)
gravity/grpc_client.go
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: build
- GitHub Check: Analyze (go)
🔇 Additional comments (1)
gravity/grpc_client.go (1)
2584-2598: Good non-empty merge guards for reused sessions.Nice fix: these guards prevent minimal multi-tunnel hello responses from clobbering previously valid session config values.
…ponse The provider.Configure call was passing response.OtlpKey directly, bypassing the preservation logic that keeps good values from the primary session when multi-tunnel reuse sessions send empty responses.
Store SubnetRoutes, MachineSubnet, Hostname, OrgID, MachineToken, MachineCertBundle, SSHPublicKey, and SigningPublicKey on the gravity client. Pass preserved values to provider.Configure() instead of raw response values that may be empty from multi-tunnel reuse sessions.
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
gravity/grpc_client.go (2)
2585-2624:⚠️ Potential issue | 🟠 MajorPreserve
authorizationTokentogether withMachineToken.Line 2587 still overwrites
g.authorizationTokenfrom the raw response on every hello. On a minimal reuse response, that blanks the bearer token even though the primary token is now preserved ing.machineToken, soGetSecret(),GetDeploymentMetadata(), andGetSandboxMetadata()can start sending an empty token.💡 Suggested fix
g.machineID = response.MachineId - g.authorizationToken = response.MachineToken g.connectionIDs = append(g.connectionIDs, response.MachineId) ... if response.MachineToken != "" { g.machineToken = response.MachineToken + g.authorizationToken = response.MachineToken }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@gravity/grpc_client.go` around lines 2585 - 2624, The code unconditionally sets g.authorizationToken = response.MachineToken on every hello, which lets minimal reuse responses clear the bearer token; only update g.authorizationToken when the response actually contains a non-empty MachineToken (i.e., mirror the pattern used for g.machineToken and other fields). Modify the assignment so that g.authorizationToken is only overwritten if response.MachineToken != "" (otherwise leave the existing g.authorizationToken intact) and ensure g.machineToken continues to be updated as before so GetSecret(), GetDeploymentMetadata(), and GetSandboxMetadata() always have a valid token.
2590-2705:⚠️ Potential issue | 🔴 CriticalSnapshot the merged session config before releasing
g.muto prevent concurrent mutations.Lines 2636 unlocks
g.mu, but lines 2664–2705 then readg.machineSubnet,g.apiURL,g.hostMapping,g.hostEnvironment,g.subnetRoutes,g.sessionHostname,g.sessionOrgID,g.machineToken,g.machineID,g.machineCertBundle,g.sshPublicKey, andg.signingPublicKeywithout holding the lock. SincehandleSessionHelloResponseruns concurrently on multiple control-stream goroutines (lines 1864, 1947, 4076), another hello can mutate these fields while this one is validating/configuring/routing, creating a race that feedsprovider.ConfigureorRouteTraffica mixed config from two different responses.Additionally, line 2587 unconditionally assigns
g.authorizationToken = response.MachineToken, so multi-tunnel reuse sessions returning an empty token will blank the auth token used by lines 6034 and 6052 for bearer authentication. Whileg.machineTokenis preserved conditionally (line 2603),g.authorizationTokenshould also be preserved conditionally to maintain auth across reuse sessions.💡 Suggested fix
g.logger.Info("[session-config] effective: apiUrl=%q otlpUrl=%q hostMappings=%d subnetRoutes=%d machineSubnet=%q", g.apiURL, g.otlpURL, len(g.hostMapping), len(g.subnetRoutes), g.machineSubnet) + + effectiveAPIURL := g.apiURL + effectiveOTLPURL := g.otlpURL + effectiveOTLPToken := g.otlpToken + effectiveHostMapping := append([]*pb.HostMapping(nil), g.hostMapping...) + effectiveEnvironment := append([]string(nil), g.hostEnvironment...) + effectiveSubnetRoutes := append([]string(nil), g.subnetRoutes...) + effectiveMachineSubnet := g.machineSubnet + effectiveHostname := g.sessionHostname + effectiveOrgID := g.sessionOrgID + effectiveMachineToken := g.machineToken + effectiveMachineID := g.machineID + effectiveMachineCertBundle := g.machineCertBundle + effectiveSSHPublicKey := append([]byte(nil), g.sshPublicKey...) + effectiveSigningPublicKey := append([]byte(nil), g.signingPublicKey...) g.mu.Unlock() - if g.machineSubnet != "" { - if _, _, err := net.ParseCIDR(g.machineSubnet); err != nil { - g.logger.Error("session hello returned invalid MachineSubnet %q: %v", g.machineSubnet, err) + if effectiveMachineSubnet != "" { + if _, _, err := net.ParseCIDR(effectiveMachineSubnet); err != nil { + g.logger.Error("session hello returned invalid MachineSubnet %q: %v", effectiveMachineSubnet, err) signalConnectionID("") closeSessionReady() return } } if err := g.provider.Configure(provider.Configuration{ Server: g, Context: g.context, Logger: g.logger, - APIURL: g.apiURL, - SSHPublicKey: g.sshPublicKey, - TelemetryURL: g.otlpURL, - TelemetryAPIKey: g.otlpToken, + APIURL: effectiveAPIURL, + SSHPublicKey: effectiveSSHPublicKey, + TelemetryURL: effectiveOTLPURL, + TelemetryAPIKey: effectiveOTLPToken, GravityURL: g.url, AgentuityCACert: g.caCert, - HostMapping: g.hostMapping, - Environment: g.hostEnvironment, - SubnetRoutes: g.subnetRoutes, - Hostname: g.sessionHostname, - OrgID: g.sessionOrgID, - MachineToken: g.machineToken, - MachineID: g.machineID, - MachineCertBundle: g.machineCertBundle, - MachineSubnet: g.machineSubnet, - SigningPublicKey: g.signingPublicKey, + HostMapping: effectiveHostMapping, + Environment: effectiveEnvironment, + SubnetRoutes: effectiveSubnetRoutes, + Hostname: effectiveHostname, + OrgID: effectiveOrgID, + MachineToken: effectiveMachineToken, + MachineID: effectiveMachineID, + MachineCertBundle: effectiveMachineCertBundle, + MachineSubnet: effectiveMachineSubnet, + SigningPublicKey: effectiveSigningPublicKey, }); err != nil { ... } - if g.networkInterface != nil { - if err := g.networkInterface.RouteTraffic(g.subnetRoutes); err != nil { + if g.networkInterface != nil { + if err := g.networkInterface.RouteTraffic(effectiveSubnetRoutes); err != nil { ... } }For the auth token, change line 2587:
- g.authorizationToken = response.MachineToken + if response.MachineToken != "" { + g.authorizationToken = response.MachineToken + }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@gravity/grpc_client.go` around lines 2590 - 2705, The handler unlocks g.mu too early causing races: snapshot all session-derived fields (apiURL, otlpURL/otlpToken, hostMapping, hostEnvironment, subnetRoutes, machineSubnet, sessionHostname, sessionOrgID, machineToken, machineID, machineCertBundle, sshPublicKey, signingPublicKey, etc.) into local variables while holding g.mu, then release the lock and use those locals for validation, provider.Configure and networkInterface.RouteTraffic to avoid concurrent mutation; also change the unconditional assignment to g.authorizationToken = response.MachineToken so it is only updated when response.MachineToken != "" (preserve existing authorizationToken on reuse sessions).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@gravity/grpc_client.go`:
- Around line 2703-2705: The code currently calls
g.networkInterface.RouteTraffic(g.subnetRoutes) on every session hello even when
the incoming message omits SubnetRoutes, reusing the preserved g.subnetRoutes
and risking duplicate route application; update the handshake logic in the
function handling hellos to either 1) compare the new SubnetRoutes payload to
the preserved g.subnetRoutes and skip calling
g.networkInterface.RouteTraffic(...) when they are equal/unchanged, or 2) if
reapplication is required, call
g.networkInterface.UnrouteTraffic(g.subnetRoutes) (or the appropriate unroute
method) before invoking g.networkInterface.RouteTraffic(...) to clear prior
state; use the g.subnetRoutes field and the
networkInterface.RouteTraffic/UnrouteTraffic methods to locate and implement
this change.
---
Outside diff comments:
In `@gravity/grpc_client.go`:
- Around line 2585-2624: The code unconditionally sets g.authorizationToken =
response.MachineToken on every hello, which lets minimal reuse responses clear
the bearer token; only update g.authorizationToken when the response actually
contains a non-empty MachineToken (i.e., mirror the pattern used for
g.machineToken and other fields). Modify the assignment so that
g.authorizationToken is only overwritten if response.MachineToken != ""
(otherwise leave the existing g.authorizationToken intact) and ensure
g.machineToken continues to be updated as before so GetSecret(),
GetDeploymentMetadata(), and GetSandboxMetadata() always have a valid token.
- Around line 2590-2705: The handler unlocks g.mu too early causing races:
snapshot all session-derived fields (apiURL, otlpURL/otlpToken, hostMapping,
hostEnvironment, subnetRoutes, machineSubnet, sessionHostname, sessionOrgID,
machineToken, machineID, machineCertBundle, sshPublicKey, signingPublicKey,
etc.) into local variables while holding g.mu, then release the lock and use
those locals for validation, provider.Configure and
networkInterface.RouteTraffic to avoid concurrent mutation; also change the
unconditional assignment to g.authorizationToken = response.MachineToken so it
is only updated when response.MachineToken != "" (preserve existing
authorizationToken on reuse sessions).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: b2572332-689e-4d1c-9434-494fb192d8a2
📒 Files selected for processing (1)
gravity/grpc_client.go
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Analyze (go)
| g.logger.Debug("configuring subnet routing for routes %v", g.subnetRoutes) | ||
| if g.networkInterface != nil { | ||
| if err := g.networkInterface.RouteTraffic(response.SubnetRoutes); err != nil { | ||
| if err := g.networkInterface.RouteTraffic(g.subnetRoutes); err != nil { |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "== RouteTraffic declarations and call sites =="
rg -n -C3 '\bRouteTraffic\s*\('
echo
echo "== NetworkInterface definitions =="
rg -n -C3 'type .*NetworkInterface|interface .*RouteTraffic'Repository: agentuity/go-common
Length of output: 3764
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "== Context around RouteTraffic call in grpc_client.go (lines 2695-2715) =="
sed -n '2695,2715p' gravity/grpc_client.go
echo
echo "== Search for function containing line 2703 (context) =="
sed -n '2650,2750p' gravity/grpc_client.go | head -100
echo
echo "== Check for 'reuse hello' or similar logic =="
rg -n -i 'reuse.*hello|hello.*reuse' gravity/
echo
echo "== Look for actual implementations of NetworkInterface =="
rg -n 'func.*RouteTraffic|type.*NetworkInterface' --glob '!*test*' --glob '!*_test*'Repository: agentuity/go-common
Length of output: 4242
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "== Where is g.subnetRoutes assigned/updated =="
rg -n 'g\.subnetRoutes\s*=' gravity/
echo
echo "== Session hello response structure =="
rg -n -A5 'type.*SessionHello|struct.*SessionHello' gravity/ --type go
echo
echo "== Check what gets updated from response =="
rg -n 'response\.' gravity/grpc_client.go | grep -i subnet | head -20
echo
echo "== Search for any cleanup or UnrouteTraffic calls =="
rg -n 'UnrouteTraffic' gravity/Repository: agentuity/go-common
Length of output: 4975
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "== Context around UnrouteTraffic call at line 3461 =="
sed -n '3450,3475p' gravity/grpc_client.go
echo
echo "== Search for actual RouteTraffic implementations (non-test) =="
rg -n -C5 'func.*RouteTraffic' gravity/ --glob '!*test*' --glob '!*.pb.go'
echo
echo "== Search for comments about reuse hello or multi-tunnel =="
rg -n -B2 -A2 'reuse.*session|multi.*tunnel' gravity/ --type go
echo
echo "== Check SessionHelloResponse fields for SubnetRoutes =="
sed -n '790,850p' gravity/proto/gravity_session.pb.go | grep -i subnetRepository: agentuity/go-common
Length of output: 872
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "== Search for tests involving reuse hello or SessionHelloResponse =="
rg -n -C3 'SessionHelloResponse|reuse.*hello' gravity/ --type go
echo
echo "== Check network/README for RouteTraffic documentation =="
cat gravity/network/README.md 2>/dev/null || echo "No README found"
echo
echo "== Search for any comments about idempotency or route handling =="
rg -n -i 'idempotent|duplicate.*route|route.*duplicate' gravity/ --type go
echo
echo "== Check if there are integration tests for RouteTraffic =="
fd -e go -path '*/test*' | xargs grep -l 'RouteTraffic' 2>/dev/null | head -5Repository: agentuity/go-common
Length of output: 19740
RouteTraffic is reapplied unconditionally on every session hello, risking duplicate route operations.
On reuse hellos where SubnetRoutes is omitted from the response, the preserved g.subnetRoutes value (lines 2610–2611) is still passed to RouteTraffic at line 2705. If the implementation treats duplicate routes as errors rather than idempotently, the handshake will fail on the second or subsequent hello message (e.g., multi-stream scenarios where multiple endpoints acknowledge their hellos).
Consider either:
- Skipping
RouteTrafficif routes haven't changed since the last call, or - Calling
UnrouteTrafficbefore reapplying to clear prior state
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@gravity/grpc_client.go` around lines 2703 - 2705, The code currently calls
g.networkInterface.RouteTraffic(g.subnetRoutes) on every session hello even when
the incoming message omits SubnetRoutes, reusing the preserved g.subnetRoutes
and risking duplicate route application; update the handshake logic in the
function handling hellos to either 1) compare the new SubnetRoutes payload to
the preserved g.subnetRoutes and skip calling
g.networkInterface.RouteTraffic(...) when they are equal/unchanged, or 2) if
reapplication is required, call
g.networkInterface.UnrouteTraffic(g.subnetRoutes) (or the appropriate unroute
method) before invoking g.networkInterface.RouteTraffic(...) to clear prior
state; use the g.subnetRoutes field and the
networkInterface.RouteTraffic/UnrouteTraffic methods to locate and implement
this change.
The reconnection path could panic with 'index out of range [0]' when controlStreams was empty (not yet initialized). Add length checks before accessing index 0.
Propagate the deployment d-hash / sandbox hostname via gossip so peer ions can use it as the TLS SNI for backend connections.
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
gravity/grpc_client.go (2)
2592-2593:⚠️ Potential issue | 🟠 MajorDon't clear
authorizationTokenon minimal hello responses.
g.machineTokenis preserved below, butg.authorizationTokenis still overwritten unconditionally here. On reuse hellos with an emptyMachineToken, downstream authenticated calls in the same file still send an empty bearer token at Line [6040] and Line [6058].Suggested fix
// Store session response fields g.machineID = response.MachineId - g.authorizationToken = response.MachineToken g.connectionIDs = append(g.connectionIDs, response.MachineId) @@ - if response.MachineToken != "" { - g.machineToken = response.MachineToken - } + if response.MachineToken != "" { + g.authorizationToken = response.MachineToken + g.machineToken = response.MachineToken + }Also applies to: 2628-2630
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@gravity/grpc_client.go` around lines 2592 - 2593, The code unconditionally overwrites g.authorizationToken from response.MachineToken even when the response provides an empty MachineToken; change the assignment so g.authorizationToken is only updated if response.MachineToken is non-empty (i.e., check response.MachineToken != "" before setting g.authorizationToken) while still updating g.machineID and preserving g.machineToken behavior; apply the same conditional update in the other similar block (the assignments around the second occurrence covering the 2628-2630 region) so downstream authenticated calls use the previously valid bearer token instead of an empty string.
2589-2642:⚠️ Potential issue | 🟠 MajorTake a stable config snapshot before unlocking.
handleSessionHelloResponsemerges into sharedg.*state underg.mu, then unlocks and later validates/configures the provider from those same fields. In multi-endpoint mode, another hello can land in between and rewriteg.apiURL,g.machineSubnet,g.hostMapping, etc., soprovider.Configure()can observe a mixed config assembled from multiple responses.Suggested direction
g.mu.Lock() ... + effectiveAPIURL := g.apiURL + effectiveOTLPURL := g.otlpURL + effectiveOTLPToken := g.otlpToken + effectiveHostMapping := append([]*pb.HostMapping(nil), g.hostMapping...) + effectiveEnvironment := append([]string(nil), g.hostEnvironment...) + effectiveSubnetRoutes := append([]string(nil), g.subnetRoutes...) + effectiveMachineSubnet := g.machineSubnet + effectiveHostname := g.sessionHostname + effectiveOrgID := g.sessionOrgID + effectiveMachineToken := g.machineToken + effectiveMachineID := g.machineID + effectiveMachineCertBundle := g.machineCertBundle + effectiveSSHPublicKey := append([]byte(nil), g.sshPublicKey...) + effectiveSigningPublicKey := append([]byte(nil), g.signingPublicKey...) g.mu.Unlock() - if g.machineSubnet != "" { - if _, _, err := net.ParseCIDR(g.machineSubnet); err != nil { + if effectiveMachineSubnet != "" { + if _, _, err := net.ParseCIDR(effectiveMachineSubnet); err != nil { ... } } if err := g.provider.Configure(provider.Configuration{ - APIURL: g.apiURL, - SSHPublicKey: g.sshPublicKey, - TelemetryURL: g.otlpURL, - TelemetryAPIKey: g.otlpToken, - HostMapping: g.hostMapping, - Environment: g.hostEnvironment, - SubnetRoutes: g.subnetRoutes, - Hostname: g.sessionHostname, - OrgID: g.sessionOrgID, - MachineToken: g.machineToken, - MachineID: g.machineID, - MachineCertBundle: g.machineCertBundle, - MachineSubnet: g.machineSubnet, - SigningPublicKey: g.signingPublicKey, + APIURL: effectiveAPIURL, + SSHPublicKey: effectiveSSHPublicKey, + TelemetryURL: effectiveOTLPURL, + TelemetryAPIKey: effectiveOTLPToken, + HostMapping: effectiveHostMapping, + Environment: effectiveEnvironment, + SubnetRoutes: effectiveSubnetRoutes, + Hostname: effectiveHostname, + OrgID: effectiveOrgID, + MachineToken: effectiveMachineToken, + MachineID: effectiveMachineID, + MachineCertBundle: effectiveMachineCertBundle, + MachineSubnet: effectiveMachineSubnet, + SigningPublicKey: effectiveSigningPublicKey, }); err != nil {Also applies to: 2670-2700
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@gravity/grpc_client.go` around lines 2589 - 2642, The code currently writes multiple shared fields under g.mu then unlocks, allowing another hello to mutate g.* before provider.Configure() reads them; fix by taking a stable snapshot of all config values you will pass to provider.Configure (e.g., copy g.apiURL, g.otlpURL, g.hostMapping, g.subnetRoutes, g.machineSubnet, g.sessionHostname, g.sessionOrgID, g.machineToken, g.machineCertBundle, g.sshPublicKey, g.signingPublicKey, etc.) into local variables or a small struct while still holding g.mu, then unlock and call provider.Configure using that snapshot; apply the same snapshotting change to the other similar block around the 2670-2700 region so provider.Configure always sees a consistent config.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@gravity/grpc_client.go`:
- Around line 2601-2633: The code currently uses proto3 scalar/repeated fields
(e.g., response.Environment, response.HostMapping, response.SubnetRoutes and
string fields assigned to g.otlpURL, g.otlpToken, g.apiURL, g.machineSubnet,
g.sessionHostname, g.sessionOrgID, g.machineToken, g.machineCertBundle) so it
cannot distinguish “empty” vs “unset”; update the protobuf and assignment logic
to carry presence: either add an explicit boolean signal in the
SessionHelloResponse (e.g., apply_full_state or clear_state) and when true apply
fields even if empty, or mark the relevant fields as proto3 optional (or wrapper
types) to get presence semantics, regenerate the Go protos, and then replace the
current zero-value guards with presence checks (use the optional/pointer
presence or the new signal) so empty values will clear previous g.* state when
intended.
---
Outside diff comments:
In `@gravity/grpc_client.go`:
- Around line 2592-2593: The code unconditionally overwrites
g.authorizationToken from response.MachineToken even when the response provides
an empty MachineToken; change the assignment so g.authorizationToken is only
updated if response.MachineToken is non-empty (i.e., check response.MachineToken
!= "" before setting g.authorizationToken) while still updating g.machineID and
preserving g.machineToken behavior; apply the same conditional update in the
other similar block (the assignments around the second occurrence covering the
2628-2630 region) so downstream authenticated calls use the previously valid
bearer token instead of an empty string.
- Around line 2589-2642: The code currently writes multiple shared fields under
g.mu then unlocks, allowing another hello to mutate g.* before
provider.Configure() reads them; fix by taking a stable snapshot of all config
values you will pass to provider.Configure (e.g., copy g.apiURL, g.otlpURL,
g.hostMapping, g.subnetRoutes, g.machineSubnet, g.sessionHostname,
g.sessionOrgID, g.machineToken, g.machineCertBundle, g.sshPublicKey,
g.signingPublicKey, etc.) into local variables or a small struct while still
holding g.mu, then unlock and call provider.Configure using that snapshot; apply
the same snapshotting change to the other similar block around the 2670-2700
region so provider.Configure always sees a consistent config.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 48135857-66eb-4814-9206-73f724e495ce
📒 Files selected for processing (2)
gravity/gossip/types.gogravity/grpc_client.go
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Analyze (go)
🔇 Additional comments (3)
gravity/gossip/types.go (1)
44-53: Backward-compatibleEndpointMappingextension.
omitemptykeeps legacy payloads unchanged, and older JSON decoders will just ignoresni_hostname, so this is a safe wire-format addition.gravity/grpc_client.go (2)
2169-2174: Nice guard before indexing control-stream state.This turns the reconnection panic into a normal error path when startup hasn't populated the slices yet.
5887-5890: Nice empty/nil guard in the ping path.This closes the remaining
streams[0]panic window during reconnection.
| if response.OtlpUrl != "" { | ||
| g.otlpURL = response.OtlpUrl | ||
| } | ||
| if response.OtlpKey != "" { | ||
| g.otlpToken = response.OtlpKey | ||
| } | ||
| if response.ApiUrl != "" { | ||
| g.apiURL = response.ApiUrl | ||
| } | ||
| if len(response.Environment) > 0 { | ||
| g.hostEnvironment = response.Environment | ||
| } | ||
| if len(response.HostMapping) > 0 { | ||
| g.hostMapping = response.HostMapping | ||
| } | ||
| if len(response.SubnetRoutes) > 0 { | ||
| g.subnetRoutes = response.SubnetRoutes | ||
| } | ||
| if response.MachineSubnet != "" { | ||
| g.machineSubnet = response.MachineSubnet | ||
| } | ||
| if response.Hostname != "" { | ||
| g.sessionHostname = response.Hostname | ||
| } | ||
| if response.OrgId != "" { | ||
| g.sessionOrgID = response.OrgId | ||
| } | ||
| if response.MachineToken != "" { | ||
| g.machineToken = response.MachineToken | ||
| } | ||
| if response.MachineCertBundle != "" { | ||
| g.machineCertBundle = response.MachineCertBundle | ||
| } |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "== Generated SessionHelloResponse struct =="
sed -n '/type SessionHelloResponse struct {/,/^}/p' gravity/proto/gravity_session.pb.go
echo
echo "== Protobuf source, if present =="
fd 'gravity_session.proto' gravity -x sed -n '/message SessionHelloResponse {/,/^}/p' {}Repository: agentuity/go-common
Length of output: 4564
Cannot distinguish between empty and omitted fields in protobuf response.
The SessionHelloResponse fields being checked (Environment, HostMapping, SubnetRoutes, and all strings) are plain proto3 scalars and repeated fields with no presence metadata. In proto3, an empty value (empty slice, empty string) is indistinguishable from an unset field, collapsing both to zero values. This means the current logic cannot clear stale state from prior sessions when a field is legitimately sent as empty—the code will skip applying empty values, leaving old data intact.
To fix this, either:
- Add an explicit signal field indicating whether the full response should be applied or only non-empty fields preserved, or
- Mark fields as
optionalin the protobuf definition (proto3.12+ feature) to introduce presence semantics.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@gravity/grpc_client.go` around lines 2601 - 2633, The code currently uses
proto3 scalar/repeated fields (e.g., response.Environment, response.HostMapping,
response.SubnetRoutes and string fields assigned to g.otlpURL, g.otlpToken,
g.apiURL, g.machineSubnet, g.sessionHostname, g.sessionOrgID, g.machineToken,
g.machineCertBundle) so it cannot distinguish “empty” vs “unset”; update the
protobuf and assignment logic to carry presence: either add an explicit boolean
signal in the SessionHelloResponse (e.g., apply_full_state or clear_state) and
when true apply fields even if empty, or mark the relevant fields as proto3
optional (or wrapper types) to get presence semantics, regenerate the Go protos,
and then replace the current zero-value guards with presence checks (use the
optional/pointer presence or the new signal) so empty values will clear previous
g.* state when intended.
…lients Prevent index-out-of-range panics during reconnection when gRPC connections haven't been established yet. The contexts/cancels slices are allocated from g.connections length, but the loop iterates g.sessionClients which could be longer.
Guard ALL slice accesses (contexts, cancels, controlStreams, controlSendMu) against concurrent modification during reconnection. Log warning when index exceeds capacity to aid debugging.
…d to route requests and unprovision proto
Summary
Multi-tunnel reuse sessions return minimal responses without ApiUrl, OtlpUrl, HostMapping, etc. Previously these empty values overwrote the good config from the primary session, causing the hadron to lose its API URL and fail to download deployment code.
Now only non-empty values update the stored config. Added debug logging for session config response and effective values.
Summary by CodeRabbit
Bug Fixes
New Features