fix: improve macOS gateway configuration bootstrap #1372

@jmpolom

Labels to apply when creating the issue: state:review-ready, spike. If the repo has an existing packaging or docs area label, add it as well.

Problem Statement

Fresh macOS Homebrew installs can start the local gateway without all required gateway configuration items. Users first hit a missing compute driver error and then, after setting OPENSHELL_DRIVERS, a second error: "ssh_handshake_secret is required". The result is a poor first-run experience, and it shows that macOS packaging lacks a single authoritative bootstrap path for required gateway configuration.

Technical Context

Linux packages already have a packaging-level strategy for required runtime config: the RPM service runs init-pki.sh and init-gateway-env.sh before openshell-gateway, then reads ~/.config/openshell/gateway.env through a systemd EnvironmentFile. Debian reads the same gateway.env path and generates certs in an ExecStartPre, but does not currently run the env generator script. Homebrew is different again: the generated formula embeds launchd environment variables and a shell wrapper for Docker TLS mirroring, but historically did not read or generate gateway.env.

The gateway itself treats configuration as runtime input. It requires a database URL, a compute driver or successful auto-detection, and, for non-Docker/non-VM drivers such as Podman and Kubernetes, OPENSHELL_SSH_HANDSHAKE_SECRET. These requirements are currently discovered by users through startup failures rather than by a coherent macOS bootstrap flow.
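
The requirements above can be modeled in a few lines. This is a minimal Python sketch of the rule as this issue describes it, not the Rust code in crates/openshell-server. The names GatewayConfig and startup_errors are hypothetical, and the sketch assumes a single selected driver. It collects every missing requirement at once, which is the behavior a bootstrap diagnostic would want instead of one failure per restart.

```python
from dataclasses import dataclass
from typing import List, Optional

# Drivers the issue describes as exempt from the handshake-secret check.
SECRET_EXEMPT_DRIVERS = {"docker", "vm"}


@dataclass
class GatewayConfig:
    """Hypothetical stand-in for the gateway's parsed runtime settings."""

    database_url: Optional[str] = None
    driver: Optional[str] = None  # explicit choice or auto-detected result
    ssh_handshake_secret: Optional[str] = None


def startup_errors(cfg: GatewayConfig) -> List[str]:
    """Return all missing requirements together instead of failing one by one."""
    errors = []
    if not cfg.database_url:
        errors.append("database URL is required")
    if not cfg.driver:
        errors.append("no compute driver configured; set OPENSHELL_DRIVERS")
    elif cfg.driver not in SECRET_EXEMPT_DRIVERS and not cfg.ssh_handshake_secret:
        errors.append("ssh_handshake_secret is required")
    return errors
```

With driver="podman" and no secret, the function returns the single handshake-secret error, matching the second failure users report.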

Affected Components

| Component | Key Files | Role |
|---|---|---|
| Homebrew formula generator | tasks/scripts/release.py, python/openshell/release_formula_test.py | Generates the release formula, service wrapper, launchd environment, postinstall cert generation, and Homebrew packaging tests. |
| Install script patching | install.sh | Downloads release formulas, applies compatibility patches, stages the local tap, installs/reinstalls with Homebrew, and starts the service. |
| Linux package bootstrap | deploy/rpm/init-gateway-env.sh, openshell.spec, deploy/deb/openshell-gateway.service | Existing patterns for generating gateway.env, TLS material, and service environment. |
| Gateway validation | crates/openshell-server/src/lib.rs, crates/openshell-server/src/cli.rs, crates/openshell-core/src/config.rs | Defines required runtime settings and emits startup configuration errors. |
| Sandbox authentication | crates/openshell-server/src/auth/oidc.rs, crates/openshell-sandbox/src/grpc_client.rs, driver crates | Uses OPENSHELL_SSH_HANDSHAKE_SECRET for sandbox-to-gateway RPC authentication. |
| User docs | docs/about/installation.mdx, docs/get-started/quickstart.mdx, docs/reference/sandbox-compute-drivers.mdx, deploy/man/openshell-gateway.env.5.md | Explains install and configuration behavior to users and operators. |

Technical Investigation

Architecture Overview

OpenShell gateway configuration currently comes from CLI flags and OPENSHELL_* environment variables parsed by crates/openshell-server/src/cli.rs. Package managers are responsible for supplying service defaults and generated secrets. The server validates required settings during startup in run_server.

Linux package flows establish a clearer contract:

  • RPM installs an env generator script and runs it as a user service pre-start hook.
  • The generated gateway.env contains a 32-byte hex OPENSHELL_SSH_HANDSHAKE_SECRET plus commented optional settings.
  • The systemd unit reads gateway.env as an override file and separately sets package defaults such as OPENSHELL_DRIVERS=podman.
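
As a rough illustration of the RPM contract in the list above, the following Python sketch creates gateway.env on first start and leaves an existing file alone. It is not a port of init-gateway-env.sh. The function name init_gateway_env, the 0600 file mode, and the exact comment lines are assumptions, and "32-byte hex" is read here as 32 random bytes encoded as 64 hex characters.

```python
import os
import secrets
from pathlib import Path


def init_gateway_env(path: Path) -> bool:
    """Create gateway.env with a fresh secret; leave an existing file untouched.

    Returns True when the file was created, False when it already existed.
    """
    if path.exists():
        return False  # mirrors the early exit the issue describes for RPM
    path.parent.mkdir(parents=True, exist_ok=True)
    secret = secrets.token_hex(32)  # 32 random bytes -> 64 hex characters
    content = (
        "# Generated on first start. Edit to override package defaults.\n"
        f"OPENSHELL_SSH_HANDSHAKE_SECRET={secret}\n"
        "# OPENSHELL_DRIVERS=podman\n"
    )
    # Assumed mode 0600 so the secret is readable by the owner only;
    # the issue does not state the mode the RPM script uses.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    with os.fdopen(fd, "w") as handle:
        handle.write(content)
    return True
```

The early return is what makes this pattern unsuitable on its own for a file a user created by hand: a gateway.env that holds only OPENSHELL_DRIVERS is never given a secret.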

Homebrew does not have systemd ExecStartPre, so its formula generates a shell wrapper at install time and launchd runs that wrapper. This wrapper is already responsible for one macOS-specific bootstrap task: copying TLS materials into a user-home path Docker Desktop can mount. That makes it the current insertion point for gateway.env bootstrap, but this is still ad hoc and embedded directly in the formula generator.

Code References

| Location | Description |
|---|---|
| tasks/scripts/release.py:232 | render_homebrew_formula generates the Homebrew formula from release assets. |
| tasks/scripts/release.py:283 | Formula writes openshell-gateway-homebrew-service, the shell wrapper launchd runs. |
| tasks/scripts/release.py:292 | Current wrapper location for ~/.config/openshell/gateway.env sourcing and secret bootstrap work. |
| tasks/scripts/release.py:309 | Wrapper mirrors Docker TLS materials into $HOME/.local/state/openshell/homebrew/tls, showing existing macOS bootstrap behavior. |
| tasks/scripts/release.py:346 | Homebrew service do block sets launchd environment variables for gateway defaults. |
| install.sh:574 | patch_homebrew_formula mutates downloaded release formulas for compatibility fixes. |
| install.sh:846 | install_macos_homebrew downloads, patches, stages, installs, and restarts the Homebrew service. |
| deploy/rpm/init-gateway-env.sh:1 | RPM env generator script and model for first-start gateway.env creation. |
| deploy/rpm/init-gateway-env.sh:21 | Script exits when gateway.env already exists, which may not repair partially-created files. |
| deploy/rpm/init-gateway-env.sh:29 | Generates a 32-byte hex SSH handshake secret. |
| deploy/rpm/init-gateway-env.sh:42 | Writes OPENSHELL_SSH_HANDSHAKE_SECRET and commented optional config into gateway.env. |
| openshell.spec:152 | RPM unit runs PKI bootstrap before gateway start. |
| openshell.spec:156 | RPM unit runs init-gateway-env.sh before gateway start. |
| openshell.spec:160 | RPM unit reads generated gateway.env with EnvironmentFile. |
| deploy/deb/openshell-gateway.service:29 | Debian service reads ~/.config/openshell/gateway.env but has no env-generation pre-start hook. |
| deploy/deb/openshell-gateway.service:30 | Debian service generates TLS certs before start. |
| crates/openshell-server/src/lib.rs:162 | Gateway requires ssh_handshake_secret for non-Docker/non-VM drivers and errors if missing. |
| crates/openshell-server/src/lib.rs:701 | Driver auto-detection error points users to --drivers or OPENSHELL_DRIVERS. |
| crates/openshell-server/src/auth/oidc.rs:42 | Sandbox-to-server RPCs use the shared sandbox secret instead of OIDC bearer tokens. |
| crates/openshell-sandbox/src/grpc_client.rs:87 | Sandbox interceptor injects the shared secret into the x-sandbox-secret metadata header. |
| deploy/man/openshell-gateway.env.5.md:43 | Existing manpage documents the secret as a shared HMAC secret. |

Current Behavior

On macOS, the install script downloads openshell.rb, patches it for compatibility, installs it into a local Homebrew tap, and starts the service. The formula installs openshell, openshell-gateway, and the VM driver; postinstall generates TLS certificates and codesigns the VM driver. Launchd starts the generated wrapper script, which then executes openshell-gateway with launchd-provided environment variables.

If OPENSHELL_DRIVERS is unset and auto-detection cannot find Kubernetes, Podman, or Docker, the gateway exits with an error stating that no compute driver is configured. VM is intentionally never auto-detected. If the user sets OPENSHELL_DRIVERS=podman, the gateway proceeds far enough to validate Podman requirements, then fails if OPENSHELL_SSH_HANDSHAKE_SECRET is unset. Docker and VM are exempt from that startup check, which is why this failure appears only after selecting Podman or Kubernetes.
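
The driver resolution just described can be sketched as follows. This is a Python model under stated assumptions, not the gateway's Rust implementation: resolve_driver and the probes mapping are hypothetical names, and the probing order is a guess, since the issue only says which drivers are auto-detectable and that VM never is.

```python
from typing import Callable, Dict, Optional

# Detection order is an assumption; the issue lists these three drivers
# as auto-detectable and states that VM is never auto-detected.
AUTO_DETECT_ORDER = ("kubernetes", "podman", "docker")


def resolve_driver(
    configured: Optional[str],
    probes: Dict[str, Callable[[], bool]],
) -> str:
    """Use the explicit driver if set, else the first driver whose probe succeeds."""
    if configured:
        return configured  # an explicit choice always wins, including "vm"
    for name in AUTO_DETECT_ORDER:
        probe = probes.get(name)
        if probe is not None and probe():
            return name
    raise RuntimeError(
        "no compute driver configured; pass --drivers or set OPENSHELL_DRIVERS"
    )
```

Because "vm" is absent from AUTO_DETECT_ORDER, a machine where only the VM driver is usable always ends in the error path unless the driver is set explicitly. That is the situation a fresh Homebrew install without Docker or Podman is in.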

The SSH handshake secret is not an SSH key pair. It is a symmetric secret used by sandbox workloads when calling gateway RPCs. The sandbox client adds it as x-sandbox-secret; the gateway validates it for sandbox-only or dual-auth RPC methods.
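
To make the distinction from an SSH key pair concrete, here is a minimal Python sketch of a shared-secret check on request metadata. Only the header name x-sandbox-secret comes from this issue. The function names are hypothetical, and the sketch compares the raw value; whether the real gateway compares the raw secret or a value derived from it is not stated here.

```python
import hmac
from typing import Dict, Mapping

SANDBOX_SECRET_HEADER = "x-sandbox-secret"  # header name taken from the issue


def attach_sandbox_secret(metadata: Mapping[str, str], secret: str) -> Dict[str, str]:
    """Client side: return a copy of the request metadata with the secret added."""
    return {**metadata, SANDBOX_SECRET_HEADER: secret}


def is_sandbox_caller(metadata: Mapping[str, str], expected: str) -> bool:
    """Gateway side: compare the presented secret in constant time."""
    presented = metadata.get(SANDBOX_SECRET_HEADER, "")
    return hmac.compare_digest(presented.encode(), expected.encode())
```

Both sides hold the same value, so rotating it on the gateway alone would lock out sandboxes that were started with the old one.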

What Would Need to Change

The project needs a single macOS gateway configuration bootstrap strategy rather than incremental fixes for each missing env var.

Areas to address:

  • Define the canonical Homebrew config file contract: path, format, ownership, permissions, generated values, and override precedence.
  • Move bootstrap logic out of large inline shell snippets where possible. A reusable helper script or generated wrapper function would be easier to test than awk patch fragments in install.sh and a long heredoc in release.py.
  • Decide whether Homebrew should select a default driver, prompt/instruct users to select one, or provide a first-run diagnostic that explains viable local drivers and required services.
  • Ensure required generated values are repaired when gateway.env exists but is incomplete. The RPM generator currently exits if the file exists; Homebrew must handle the partial-file case because users may create gateway.env themselves to set OPENSHELL_DRIVERS.
  • Align Debian and Homebrew with the RPM env-generation pattern where applicable. Debian currently reads gateway.env but does not run init-gateway-env.sh.
  • Improve docs so users understand which settings are generated secrets versus operator choices. OPENSHELL_SSH_HANDSHAKE_SECRET should be documented as a shared sandbox RPC secret, not an SSH public/private key.
  • Add packaging tests that exercise a fresh Homebrew formula, an existing gateway.env containing only OPENSHELL_DRIVERS=podman, and preservation of existing secrets on reinstall.
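
The repair requirement in the list above is the main difference from the RPM generator, so a sketch may help. The Python below is one possible shape for an idempotent create-or-repair step, not an agreed design. The name ensure_handshake_secret, the 0600 mode, and the append-at-end placement are assumptions.

```python
import os
import re
import secrets
from pathlib import Path

SECRET_KEY = "OPENSHELL_SSH_HANDSHAKE_SECRET"
# Matches an active (uncommented) assignment with a non-empty value.
_ASSIGNMENT = re.compile(rf"^\s*(?:export\s+)?{SECRET_KEY}=\S")


def ensure_handshake_secret(path: Path) -> bool:
    """Create or repair gateway.env; return True if the file was modified."""
    existing = path.read_text() if path.exists() else ""
    if any(_ASSIGNMENT.match(line) for line in existing.splitlines()):
        return False  # secret already present: never rotate it implicitly
    if existing and not existing.endswith("\n"):
        existing += "\n"  # avoid gluing the new key onto the user's last line
    updated = existing + f"{SECRET_KEY}={secrets.token_hex(32)}\n"

    path.parent.mkdir(parents=True, exist_ok=True)
    tmp = path.with_name(path.name + ".tmp")
    fd = os.open(tmp, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as handle:
        handle.write(updated)
    os.replace(tmp, path)  # atomic swap so a crash cannot leave a partial file
    return True
```

Appending rather than rewriting keeps user comments and settings byte-for-byte intact, and a commented-out secret line is deliberately not treated as present. Running the function twice leaves the file unchanged the second time, which is the property the reinstall test case would assert.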

Alternative Approaches Considered

  1. Keep patching the Homebrew wrapper in place.

    • Pros: low disruption, fast to ship.
    • Cons: grows brittle shell in the generated formula and installer patcher; likely repeats the current whack-a-mole pattern.
  2. Add a package-shared openshell-gateway init-env or openshell-gateway bootstrap-config command.

    • Pros: centralizes required generated config in the gateway binary, usable by Homebrew, Debian, RPM, and docs; easier to test with Rust unit/integration tests.
    • Cons: adds a new user-facing or packaging-facing command and must carefully preserve existing files and secrets.
  3. Add a package-shared shell helper script used by Homebrew/Debian/RPM.

    • Pros: closer to existing RPM pattern and simpler than new Rust surface.
    • Cons: macOS and Linux portability details still live in shell; release formula still needs to carry or generate the script.
  4. Change the gateway to generate missing secrets itself at startup.

    • Pros: removes packaging burden for secrets.
    • Cons: risky because generated secrets need persistence, predictable rotation semantics, and clear separation from operator-supplied config. The gateway may not know the right config path in all deployment modes.

Patterns to Follow

  • RPM init-gateway-env.sh is the clearest existing pattern for generated env files and user-editable configuration.
  • Homebrew already uses a wrapper script for macOS-specific runtime setup, especially Docker TLS mirror paths.
  • Helm uses Kubernetes Secrets for the handshake secret and references them from the gateway pod environment.
  • Tests for the Homebrew formula already live in python/openshell/release_formula_test.py and should be extended rather than replaced.
  • Architecture docs should stay at the boundary/invariant level; detailed user instructions belong in docs/, while packaging details may belong in package READMEs or manpages.

Proposed Approach

Introduce a single gateway configuration bootstrap mechanism that Homebrew, Debian, and RPM can all use or mirror. Prefer a packaging-facing openshell-gateway subcommand or a shared helper script that can create or repair ~/.config/openshell/gateway.env idempotently without overwriting user choices. Homebrew should invoke that bootstrap before launching the gateway, then source gateway.env; Debian should run the same bootstrap before reading the file; RPM can either continue using its current script or migrate to the shared path. Documentation should separate generated secrets from operator-selected runtime choices like compute driver selection.
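
One detail the proposal leaves open is override precedence once gateway.env is sourced. The Python sketch below shows one plausible rule, where values from gateway.env override package defaults, in line with the RPM unit treating the file as an override. The names parse_env_file and effective_env are hypothetical, the parser handles only simple KEY=VALUE lines, and the precedence itself is an assumption still to be decided.

```python
from typing import Dict, Mapping


def parse_env_file(text: str) -> Dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop one layer of surrounding quotes, as a shell would when sourcing.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def effective_env(
    package_defaults: Mapping[str, str],
    env_file_text: str,
) -> Dict[str, str]:
    """Merge settings so gateway.env entries override package defaults."""
    merged = dict(package_defaults)
    merged.update(parse_env_file(env_file_text))
    return merged
```

Under this rule a user who writes OPENSHELL_DRIVERS=podman into gateway.env replaces whatever driver the formula ships as its default, while the generated secret is carried through untouched.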

Scope Assessment

  • Complexity: Medium
  • Confidence: High for diagnosis; Medium for final design because the repo needs a decision between Rust subcommand and shared shell helper.
  • Estimated files to change: 8-12
  • Issue type: fix

Risks & Open Questions

  • Should Homebrew default to vm, leave driver choice explicit, or improve diagnostics without setting a driver?
  • Should the gateway binary own config bootstrap, or should package scripts own it?
  • How should bootstrap repair partial files while preserving user edits and comments?
  • Should RPM init-gateway-env.sh also be updated to repair missing required keys in existing files?
  • Should Debian add an env-generation pre-start hook to match RPM and Homebrew?
  • How should secret rotation be documented for local gateways, especially when sandboxes may already exist?
  • The name ssh_handshake_secret is misleading; consider a follow-up rename or docs clarification to distinguish it from SSH key pairs.

Test Considerations

  • Unit-test the bootstrap function or script with: no file, file with only OPENSHELL_DRIVERS, file with existing secret, and file with user comments.
  • Extend python/openshell/release_formula_test.py to assert Homebrew formula behavior and preservation semantics.
  • Add shell tests for installer formula patching if this logic remains in install.sh.
  • Update release canary to exercise gateway.env with only OPENSHELL_DRIVERS=podman or vm and verify the service reaches openshell status.
  • Run mise run docs after docs updates.
  • Run package-specific release formula tests plus relevant pre-commit checks before PR.

Created by spike investigation. Use build-from-issue to plan and implement.
