Problem Statement
OpenShell can express generic GPU intent with openshell sandbox create --gpu, but users cannot request a specific GPU count through the public sandbox API.
For Kubernetes-backed gateways, generic GPU intent maps to a single nvidia.com/gpu resource request. This blocks workloads that need multiple GPUs, for example:
openshell sandbox create --gpu-count 4 -- claude
Users can work around this only by injecting Kubernetes-specific resource settings through sandbox templates. That makes a common scheduling requirement driver-specific and bypasses OpenShell's typed sandbox spec layer.
Proposed Design
Add first-class GPU count support across the public sandbox spec, compute-driver spec, CLI, server mapping, and Kubernetes driver.
Public API:
- Add gpu_count to SandboxSpec.
- Default gpu_count to 0, meaning unspecified.
- Use values >0 to request that many GPUs.
- Preserve existing gpu: true behavior.
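The shape of the extended public spec can be sketched as follows. This is a minimal Go sketch, not the actual contract: the field names (Gpu, GpuCount) and the wantsGPU helper are illustrative assumptions about how the typed spec layer might expose the new field.

```go
package main

// SandboxSpec is a hypothetical sketch of the extended public spec;
// field names and types are assumptions, not the real proto contract.
type SandboxSpec struct {
	Gpu      bool   // existing generic GPU intent, unchanged
	GpuCount uint32 // 0 = unspecified/default, >0 = request that many GPUs
}

// wantsGPU reports whether the spec expresses any GPU intent,
// from either the legacy boolean or the new count field.
func (s SandboxSpec) wantsGPU() bool {
	return s.Gpu || s.GpuCount > 0
}
```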
Compute driver API:
- Add gpu_count to DriverSandboxSpec.
- Copy SandboxSpec.gpu_count into DriverSandboxSpec.gpu_count in the server public-to-driver mapping.
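The public-to-driver mapping is a verbatim copy of the GPU fields. A minimal Go sketch, assuming illustrative struct and function names (the real mapping layer lives in the server and operates on the proto types):

```go
package main

// Hypothetical mirrors of the public and driver specs; names are assumptions.
type SandboxSpec struct {
	Gpu      bool
	GpuCount uint32
}

type DriverSandboxSpec struct {
	Gpu      bool
	GpuCount uint32
}

// mapToDriver copies GPU intent from the public spec into the driver spec.
// The count is passed through unchanged; each driver decides how to honor it.
func mapToDriver(pub SandboxSpec) DriverSandboxSpec {
	return DriverSandboxSpec{
		Gpu:      pub.Gpu,
		GpuCount: pub.GpuCount,
	}
}
```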
CLI:
- Add openshell sandbox create --gpu-count COUNT.
- Reject --gpu-count 0.
- Treat --gpu-count N as GPU intent, equivalent to setting gpu: true.
- Reject combining --gpu-count with --gpu-device, because count-based scheduling and device-specific selection are different allocation modes.
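The CLI validation rules above can be sketched as a single check. This is a hedged illustration: the function name and the way flag state is threaded in (a "was the flag set" boolean plus values) are assumptions about the flag-parsing layer, not the actual implementation.

```go
package main

import "fmt"

// validateGPUFlags enforces the proposed rules for sandbox create:
// --gpu-count must be > 0, and it conflicts with --gpu-device because
// count-based scheduling and device selection are different allocation modes.
func validateGPUFlags(gpuCountSet bool, gpuCount uint32, gpuDevice string) error {
	if gpuCountSet && gpuCount == 0 {
		return fmt.Errorf("--gpu-count must be greater than 0")
	}
	if gpuCountSet && gpuDevice != "" {
		return fmt.Errorf("--gpu-count cannot be combined with --gpu-device")
	}
	return nil
}
```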
Kubernetes driver:
- If gpu_count > 0, set the sandbox container resource limit:
  resources:
    limits:
      nvidia.com/gpu: "<count>"
- If gpu_count == 0 and gpu == true, preserve current behavior by requesting one GPU.
- Preserve existing CPU, memory, custom resource, and typed-resource overlay behavior.
- Require clusters to expose allocatable nvidia.com/gpu resources through the NVIDIA device plugin or equivalent.
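The driver's limit-rendering logic reduces to a small decision table. A minimal Go sketch under assumed names (the real driver builds a full pod spec; this only shows the nvidia.com/gpu limit value selection, including the preserved gpu: true fallback):

```go
package main

import "strconv"

// renderGPULimit returns the nvidia.com/gpu limit value for the sandbox
// container and whether a limit should be set at all.
func renderGPULimit(gpu bool, gpuCount uint32) (value string, set bool) {
	switch {
	case gpuCount > 0:
		// Explicit count wins.
		return strconv.FormatUint(uint64(gpuCount), 10), true
	case gpu:
		// Preserve current behavior: generic GPU intent means one GPU.
		return "1", true
	default:
		// No GPU intent: leave the limit unset.
		return "", false
	}
}
```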
Compatibility:
- Existing clients omit gpu_count, so it defaults to 0.
- Existing --gpu behavior remains unchanged.
- Docker, Podman, and VM drivers can safely receive the new field and ignore it unless they later add explicit count support.
Acceptance criteria:
- openshell sandbox create --gpu-count 4 -- claude sends SandboxSpec { gpu: true, gpu_count: 4 }.
- --gpu-count 0 is rejected with a clear error.
- --gpu-count cannot be combined with --gpu-device.
- Server mapping copies public gpu_count into the driver spec.
- Kubernetes pod rendering emits limits["nvidia.com/gpu"] == "4" for gpu_count: 4.
- Existing --gpu still emits limits["nvidia.com/gpu"] == "1".
- Docs explain --gpu-count, Kubernetes nvidia.com/gpu scheduling, and the --gpu-device conflict.
Alternatives Considered
- Continue injecting nvidia.com/gpu through raw template resources.
- This works only for users who know the Kubernetes resource model and bypasses OpenShell's typed sandbox API.
- Overload --gpu with an optional value.
- This is ambiguous and risks breaking existing boolean flag behavior.
- Reuse --gpu-device for counts.
- Device-specific selection and count-based scheduling are separate allocation modes, so combining them would make driver behavior unclear.
Agent Investigation
- Inspected the existing proto contracts, CLI sandbox-create path, server compute mapping, and Kubernetes driver rendering path.
- Found that OpenShell already has a public-to-driver sandbox spec mapping layer, so GPU count belongs in typed specs rather than template resource passthrough.
- Found existing Kubernetes GPU behavior maps generic gpu: true to one nvidia.com/gpu limit.
- Identified docs that need updates: sandbox management docs, Kubernetes setup prerequisites, Kubernetes driver README, and compute runtime architecture docs.
Checklist