Add /system/standby API for out-of-band scale-to-zero pin #237

Open
sjmiller609 wants to merge 5 commits into main from hypeship/scaletozero-pin-api

Contributor

@sjmiller609 sjmiller609 commented May 11, 2026

Summary

Adds two new endpoints to the kernel-images server:

  • POST /system/standby/disable — pins scale-to-zero off until released
  • POST /system/standby/enable — releases the pin

The pin lives alongside the existing request-driven middleware refcount inside DebouncedController:

  • scale-to-zero stays disabled while either request holders remain inflight or the pin is held
  • request-driven Enable (from middleware) does not release the pin
  • releasing the pin while requests are inflight defers the underlying enable until the last request completes
  • the configured re-enable cooldown applies on pin release exactly as it does on request release

DebouncedController and NoopController now also implement a new PinnedController sub-interface (Controller + DisablePin / EnablePin). The pin is a boolean — DisablePin / EnablePin are idempotent.

Why

This is the in-VM surface needed for a future control-plane integration: an external system (e.g. a hot-pool controller) needs to hold a VM out of standby while it sits idle in a pool, then release the hold when the VM is claimed.

The existing middleware refcount only works for inflight HTTP requests, so it can't hold a VM disabled across an idle period.

Notes for reviewers

  • All existing middleware-driven flows are byte-identical (the pin defaults to false; all current call sites take the same code paths).
  • The constructor return type for NewDebouncedController* widened from Controller to *DebouncedController so callers can access the pin methods. The only existing caller is cmd/api/main.go, which is unaffected because *DebouncedController still satisfies Controller for recorder.NewFFmpegRecorderFactory and scaletozero.Middleware.
  • Control-plane wiring (metro-api proxy + API server) is intentionally not in this PR.

Test plan

  • go test -race ./lib/scaletozero/... passes (6 new tests covering pin semantics)
  • go test -race ./cmd/api/... passes
  • go vet ./... clean
  • go build ./... clean
  • Manual review of openapi.yaml + regenerated lib/oapi/oapi.go

Note

Medium Risk
Updates core scale-to-zero coordination logic (new refcounted Acquire/Release plus persistent Disable/Enable) and threads it through request middleware and ffmpeg recording, so regressions could affect VM standby behavior and recording reliability.

Overview
Adds new out-of-band API endpoints POST /scaletozero/disable and POST /scaletozero/enable (OpenAPI + generated oapi client/server + ApiService handlers) to persistently pin scale-to-zero off/on.

Refactors scale-to-zero control to separate a low-level Toggler from a higher-level Controller that now supports both refcounted holds (Acquire/Release, used by HTTP middleware and ffmpeg) and an idempotent persistent override (Disable/Enable, used by the new API), including cooldown-aware re-enable logic.

Updates the ffmpeg recorder and scaletozero middleware to use Acquire/Release instead of directly toggling, and adds unit + e2e coverage to validate idempotency and that normal requests continue to work while pinned.

Reviewed by Cursor Bugbot for commit 7d26ce6.

sjmiller609 and others added 5 commits May 8, 2026 22:48
Adds two new endpoints to the kernel-images server:
- POST /system/standby/disable — pins scale-to-zero off until released
- POST /system/standby/enable  — releases the pin

The pin lives alongside the existing request-driven middleware refcount in
DebouncedController: scale-to-zero stays disabled while either request
holders are inflight OR the pin is held. Request-driven Enable calls do not
release the pin, so a pinned VM survives idle periods. Releasing the pin
honors any configured re-enable cooldown.

This is the in-VM surface for future control-plane integrations (e.g. a
hot-pool controller reserving a VM until it is claimed). Control-plane
wiring will follow in metro-api and the API server.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Spins up the headless image via testcontainers and exercises:
- Idempotent disable (two consecutive 204s)
- A normal request flows while pinned (middleware coexistence)
- Idempotent enable (two consecutive 204s)

The unikraft control file does not exist inside the docker test
container, so the underlying scale-to-zero write is a no-op. The
test validates HTTP wiring and handler/middleware coexistence; the
deep pin semantics are covered by unit tests against DebouncedController.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Address review feedback:
- Path /system/standby/* implied VM-state mutation; rename to
  /scaletozero/{pin,unpin} so the operation is specific to the
  scale-to-zero gate.
- Interface methods DisablePin/EnablePin read as inverted; rename to
  Pin/Unpin for clarity.
- Rewrite openapi summary/description to be caller-focused (what it
  does, when to call, what pairs with).

Match user-facing terminology to the action ("disable scale to zero")
rather than the internal pin mechanism. Internal PinnedController.Pin/Unpin
methods retain pin/unpin naming since they're distinct from the refcounted
Controller.Disable/Enable.

Rename refcounted hold methods to Acquire/Release so that Disable/Enable
can carry the idempotent persistent-toggle semantics defined by the
/scaletozero/{disable,enable} API. Split the low-level direct toggle out
into a separate Toggler interface (unikraftCloudToggler) wrapped by
DebouncedController.
@sjmiller609 sjmiller609 marked this pull request as ready for review May 11, 2026 19:02
@firetiger-agent

Monitoring Plan: /system/standby API for Out-of-Band Scale-to-Zero Pin

This PR adds two new HTTP endpoints (POST /system/standby/disable and POST /system/standby/enable) to the kernel-images browser VM server, enabling explicit lifecycle control of Unikraft/Hypeman scale-to-zero independently of the per-request middleware refcount. The core change introduces a PinnedController interface and a pinned boolean in DebouncedController that blocks re-enable calls from the HTTP middleware while the pin is held. DebouncedController's maybeReenableLocked helper is now shared between Enable and EnablePin. The change is well-tested with 6 new unit tests and a new e2e test.

Key risks to watch: (1) mutex contention or a logic error in the refactored maybeReenableLocked helper causing pool drain/fill regressions — watch for drops in kernel_browser_pool_pop_total or spikes in kernel_browser_pool_empty_pop_total vs. baseline (~50K–90K metric data points/hr pop rate, ~420–1,140/hr empty-pop count); (2) any ERROR logs mentioning "failed to disable standby" or "failed to release scale-to-zero pin" (expected baseline: 0); (3) pin state leaks if callers don't pair disable/enable. Blast radius is limited to individual VM processes — a bug would not affect the central API service.

Status updates will be posted automatically on this PR as monitoring progresses.

@sjmiller609 sjmiller609 requested review from Sayan- and hiroTamada and removed request for hiroTamada May 11, 2026 19:17
Comment thread server/openapi.yaml Outdated
Comment on lines +1337 to +1357
/system/standby/disable:
  post:
    summary: Pin scale-to-zero off until /system/standby/enable is called
    description: >
      Disables scale-to-zero out-of-band of the request-driven middleware,
      holding it disabled across idle periods. The pin is independent of the
      inflight-request refcount: request-driven Enable calls will not release
      it. Idempotent — repeated calls have no additional effect.
    operationId: disableStandby
    responses:
      "204":
        description: Standby pinned disabled
      "500":
        $ref: "#/components/responses/InternalError"
/system/standby/enable:
  post:
    summary: Release the standby pin set by /system/standby/disable
    description: >
      Releases the out-of-band scale-to-zero pin. If no request-driven
      holders remain, scale-to-zero re-enables (honoring any configured
      cooldown). Idempotent — calling without a held pin has no effect.
Contributor

@Sayan- Sayan- May 11, 2026

pls deslop descriptions. these read like internal implementation notes rather than information a caller needs: what does calling this endpoint do for me, when do I call it, and what pairs with it

Contributor Author

Rewrote both summary/description to be caller-focused — what the endpoint does, when to call it, and what pairs with it. See a41fed3.

Contributor Author

hold on, I'm manually reviewing these now

Comment thread server/openapi.yaml Outdated
text/event-stream:
schema:
$ref: "#/components/schemas/PublishedEnvelope"
/system/standby/disable:
Contributor

would recommend making the path/operation more specific to scaletozero since the current naming could imply it's actually mutating the state of the vm, which is not what we're doing

Contributor Author

Renamed paths to /scaletozero/pin and /scaletozero/unpin, operationIds to pinScaleToZero/unpinScaleToZero. a41fed3.

Comment thread server/lib/scaletozero/scaletozero.go Outdated
Comment on lines +33 to +38
// DisablePin pins scale-to-zero disabled until EnablePin is called. The
// pin is a boolean, not a counter: repeated calls are idempotent.
DisablePin(ctx context.Context) error
// EnablePin releases the pin. If no request-driven holders remain,
// scale-to-zero is re-enabled (honoring any configured cooldown).
EnablePin(ctx context.Context) error
Contributor

@Sayan- Sayan- May 11, 2026

these operations seem inverted. I had to re-read this a few times to wrap my head around it. would recommend revisiting semantics here to simplify. even Pin / Unpin would be clearer

Contributor Author

Renamed to Pin/Unpin on the PinnedController interface, DebouncedController, and NoopController. a41fed3.

Comment thread server/openapi.yaml Outdated
text/event-stream:
schema:
$ref: "#/components/schemas/PublishedEnvelope"
/scaletozero/pin:
Contributor Author

I think a path like enable / disable is clearer

Comment thread server/cmd/api/api/api.go Outdated
  factory recorder.FFmpegRecorderFactory,
  upstreamMgr *devtoolsproxy.UpstreamManager,
- stz scaletozero.Controller,
+ stz scaletozero.PinnedController,
Contributor Author

getting rid of "pin" terminology
