
fix(serverless-auth): fail closed on unexpected upstream status #2238

Open
rafel-roboflow wants to merge 1 commit into main from
feat/dg-462-check_authorization_serverless-unhandled-upstream-status

Contributor

@rafel-roboflow rafel-roboflow commented Apr 17, 2026

What does this PR do?

Fixes a silent-authorization bug in check_authorization_serverless. When get_serverless_usage_check_async returned any status other than 200/401/402 (e.g. 429, 500, 503), the if/elif chain fell through and the request was served with workspace_id = None, bypassing billing verification.
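The fall-through described above can be reproduced in isolation. The following is a minimal sketch, not the actual code from check_authorization_serverless: the function name `buggy_gate` and its return shape are hypothetical, chosen only to show how an if/elif chain without a final else lets an unhandled status through.

```python
from typing import Optional, Tuple


def buggy_gate(status: int, upstream_workspace_id: Optional[str]) -> Tuple[bool, Optional[str]]:
    """Hypothetical illustration: return (request_is_served, workspace_id)."""
    workspace_id = None
    if status == 200:
        workspace_id = upstream_workspace_id
    elif status == 401:
        return False, None  # rejected: unauthorized
    elif status == 402:
        return False, None  # rejected: billing problem
    # No else branch: 429, 500, 503, ... skip every branch above and
    # reach this line, so the request is served with workspace_id = None.
    return True, workspace_id


print(buggy_gate(503, None))
```

With a 503 from upstream, the sketch reports the request as served and carries no workspace ID, which is the bypass this PR closes.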

Now any unrecognized upstream status:

  • Returns HTTP 503 ("Authorization service temporarily unavailable. Please retry.")
  • Logs a warning with the unexpected status
  • Is deliberately not cached, so the next request retries the upstream once it recovers
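The three behaviors listed above could be sketched roughly as follows. This is an illustration under stated assumptions, not the PR's diff: `authorize`, `AuthDecision`, the module-level `_decision_cache`, and the 401/402 messages are hypothetical. The sketch also assumes that recognized statuses are cached, which the description implies but does not state.

```python
import asyncio
import logging
from dataclasses import dataclass
from typing import Awaitable, Callable, Dict, Optional, Tuple

logger = logging.getLogger("serverless_auth_sketch")

UNAVAILABLE_MESSAGE = "Authorization service temporarily unavailable. Please retry."


@dataclass
class AuthDecision:
    http_status: int  # status returned to the caller
    message: str
    workspace_id: Optional[str] = None


# Hypothetical cache of definitive decisions, keyed by API key.
_decision_cache: Dict[str, AuthDecision] = {}

UsageCheck = Callable[[str], Awaitable[Tuple[int, Optional[str]]]]


async def authorize(api_key: str, usage_check: UsageCheck) -> AuthDecision:
    cached = _decision_cache.get(api_key)
    if cached is not None:
        return cached

    status, workspace_id = await usage_check(api_key)
    if status == 200:
        decision = AuthDecision(200, "OK", workspace_id)
    elif status == 401:
        decision = AuthDecision(401, "Unauthorized")  # placeholder message
    elif status == 402:
        decision = AuthDecision(402, "Payment required")  # placeholder message
    else:
        # Fail closed: warn, answer 503, and return before the cache write
        # so the next request queries the upstream again.
        logger.warning("Unexpected upstream status: %s", status)
        return AuthDecision(503, UNAVAILABLE_MESSAGE)

    _decision_cache[api_key] = decision
    return decision
```

Returning early from the else branch keeps the failure out of the cache, so a recovered upstream is picked up on the very next request.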

Related Issue(s): N/A

Type of Change

  • Bug fix (non-breaking change that fixes an issue)

Testing

  • I have tested this change locally
  • I have added/updated tests for this change

Test details:
Added test_serverless_auth_middleware_fails_closed_on_unexpected_upstream_status in tests/inference/unit_tests/core/interfaces/http/test_http_api.py. It mocks the usage check to return status 503 and asserts the middleware:

  • Responds with 503 and the expected message
  • Does not set WORKSPACE_ID_HEADER
  • Does not invoke model_manager.add_model / infer_from_request_sync
  • Re-queries the upstream on every request (no caching of the failure)
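The shape of those four assertions can be sketched with the standard library alone. This is not the test added in the PR: `handle_request` is a hypothetical stand-in for the middleware plus route handler, the value of `WORKSPACE_ID_HEADER` is a placeholder, and the real test presumably drives the HTTP app rather than calling a function directly.

```python
import asyncio
from unittest.mock import AsyncMock, MagicMock

WORKSPACE_ID_HEADER = "x-workspace-id"  # placeholder value, not the real constant
UNAVAILABLE_MESSAGE = "Authorization service temporarily unavailable. Please retry."


async def handle_request(usage_check, model_manager, cache):
    """Hypothetical stand-in for the middleware and handler under test."""
    status = cache.get("status")
    if status is None:
        status = await usage_check()
        if status in (200, 401, 402):
            cache["status"] = status  # only recognized statuses are cached
    if status == 200:
        model_manager.add_model("some-model")
        return {"status_code": 200, "headers": {WORKSPACE_ID_HEADER: "ws"}, "message": "ok"}
    if status in (401, 402):
        return {"status_code": status, "headers": {}, "message": "denied"}
    return {"status_code": 503, "headers": {}, "message": UNAVAILABLE_MESSAGE}


def test_fails_closed_on_unexpected_upstream_status():
    usage_check = AsyncMock(return_value=503)
    model_manager = MagicMock()
    cache = {}

    for _ in range(2):
        response = asyncio.run(handle_request(usage_check, model_manager, cache))
        assert response["status_code"] == 503
        assert response["message"] == UNAVAILABLE_MESSAGE
        assert WORKSPACE_ID_HEADER not in response["headers"]

    # Two requests, two upstream queries: the failure was not cached.
    assert usage_check.await_count == 2
    model_manager.add_model.assert_not_called()
    model_manager.infer_from_request_sync.assert_not_called()
```

Sending two requests and checking the await count is one simple way to verify that the failure is not cached.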

Checklist

  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • I have commented my code where necessary, particularly in hard-to-understand areas
  • My changes generate no new warnings or errors
  • I have updated the documentation accordingly (if applicable)

Additional Context

Fails closed rather than open on transient upstream errors. The previous behavior also caused OTel spans to report auth.result="authorized" for these requests, masking upstream failures in traces.


Note

High Risk
Changes serverless authorization/billing gate behavior: unexpected upstream statuses now block requests with a 503, which could impact availability but closes a potential authorization bypass.

Overview
Prevents serverless requests from being treated as authorized when get_serverless_usage_check_async returns an unrecognized status code (e.g. 429/5xx).

check_authorization_serverless now logs a warning and returns 503 with a retryable message, and deliberately does not cache these failures so each request re-checks upstream; adds a unit test asserting this fail-closed, no-caching behavior.

Reviewed by Cursor Bugbot for commit fe521ce.

Unknown status codes from get_serverless_usage_check_async now return
503 without caching, instead of silently authorizing the request.