
feat: add AutoRouter, Responses API support, and provider detection#6

Merged
jhaynie merged 5 commits into main from feature/autorouter-and-responses-api
Apr 14, 2026
Conversation


@jhaynie jhaynie commented Apr 14, 2026

Summary

This PR adds automatic routing capabilities and OpenAI Responses API support:

  • AutoRouter: Single endpoint that auto-detects provider and API type
  • Responses API: Full support for OpenAI's new Responses API (/v1/responses)
  • Provider Detection: Automatic routing based on model name or X-Provider header
  • Provider Prefix Stripping: Models like openai/gpt-4 are stripped to gpt-4 before forwarding

Changes

New Features

  • AutoRouter - HTTP handler that routes to correct provider based on:
    • X-Provider header (explicit override)
    • Model name patterns (gpt-* → OpenAI, claude-* → Anthropic, etc.)
    • Provider prefix in model name (anthropic/claude-3-opus)
  • OpenAI Responses API support (/v1/responses) with parser, extractor, resolver
  • API type detection from request body or path
  • Provider prefix stripping (only for known providers)
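The routing precedence described above (explicit header, then a known provider prefix, then model-name patterns) can be sketched roughly as follows. The function name, the provider set, and the patterns here are illustrative assumptions, not the package's actual API:

```go
package main

import (
	"fmt"
	"strings"
)

// detectProvider mirrors the routing rules described above:
// 1. an explicit X-Provider header wins,
// 2. then a known "provider/" prefix on the model name (which is stripped),
// 3. then well-known model-name patterns.
func detectProvider(model, headerOverride string) (provider, strippedModel string) {
	if headerOverride != "" {
		return headerOverride, model
	}
	known := map[string]bool{"openai": true, "anthropic": true, "googleai": true}
	if prefix, rest, ok := strings.Cut(model, "/"); ok && known[prefix] {
		return prefix, rest // e.g. "openai/gpt-4" -> ("openai", "gpt-4")
	}
	switch {
	case strings.HasPrefix(model, "gpt-"):
		return "openai", model
	case strings.HasPrefix(model, "claude-"):
		return "anthropic", model
	}
	return "", model // unknown: fall through to a configured fallback provider
}

func main() {
	p, m := detectProvider("openai/gpt-4", "")
	fmt.Println(p, m) // openai gpt-4
}
```

Note that an unknown prefix (say `mycompany/gpt-4`) is deliberately left intact, matching the "only for known providers" rule above.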

Improvements

  • Simplified example using AutoRouter (~200 lines vs 411)
  • Billing result stored in response metadata for automatic header injection
  • Comprehensive test coverage (18+ new test cases)

Usage

# Auto-detect from model name
curl -X POST http://localhost:8080/v1/chat/completions \
  -H 'Content-Type: application/json' \
  -d '{"model":"gpt-4","messages":[{"role":"user","content":"Hello"}]}'

# With provider prefix
curl -X POST http://localhost:8080/v1/responses \
  -H 'Content-Type: application/json' \
  -d '{"model":"openai/gpt-4o","input":"Hello"}'

# Explicit provider override
curl -X POST http://localhost:8080/v1/messages \
  -H 'Content-Type: application/json' \
  -H 'X-Provider: anthropic' \
  -d '{"model":"claude-3-5-sonnet","max_tokens":1024,"messages":[{"role":"user","content":"Hello"}]}'

Test plan

  • All existing tests pass
  • New unit tests for provider detection
  • New unit tests for API type detection
  • New unit tests for provider prefix stripping
  • New unit tests for AutoRouter
  • New unit tests for Responses API parser/extractor
  • Build succeeds

Summary by CodeRabbit

  • New Features

    • Auto-routing proxy that auto-detects provider and API shape and forwards requests.
    • OpenAI multi-API handling (chat, responses, legacy completions) with adaptive parsing/extraction.
  • Improvements

    • Enhanced provider detection (model patterns, headers, path) and model-prefix stripping; optional model→provider lookup.
    • Billing metadata surfaced as response headers.
  • Documentation

    • Quick start and design docs updated with AutoRouter usage and detection rules.
  • Tests

    • Expanded unit tests for detection, routing, parsing, and multi-API behavior.

- Add AutoRouter for automatic provider/API routing from single endpoint
- Add OpenAI Responses API parser, extractor, and resolver
- Add API type detection (chat completions, responses, completions, messages)
- Add provider detection from model name patterns and X-Provider header
- Strip provider prefix from model names (e.g., openai/gpt-4 -> gpt-4)
- Update example to use AutoRouter with simplified setup
- Store billing result in response metadata for header injection
- Comprehensive test coverage for new functionality

coderabbitai bot commented Apr 14, 2026

Caution

Review failed

Pull request was closed or merged during review

📝 Walkthrough

Adds API-type detection and provider detection primitives, a new AutoRouter HTTP handler that auto-selects providers, strips model prefixes, parses requests, resolves upstream URLs, and forwards requests. Introduces OpenAI multi‑API parsing/extraction (Responses + chat/completions), billing metadata propagation, model→provider lookup, extensive tests, and updated docs/examples.

Changes

Cohort / File(s) Summary
API & Detection
apitype.go, detection.go, detection_test.go
Adds APIType and JSON/path/provider-aware API detection (DetectAPIType, DetectAPITypeFromPath, DetectAPITypeFromBodyAndProvider) and provider-detection primitives (ProviderHint, ProviderDetector, DefaultProviderDetector, DetectProviderFromModel) with unit tests.
AutoRouter & Routing
autorouter.go, autorouter_test.go
Implements AutoRouter HTTP handler with provider registry, detector, optional model→provider lookup, interceptor chain, request forwarding, model-prefix stripping, error type ProviderError/ErrNoProvider, functional options, and tests for routing, fallback, and prefix handling.
OpenAI-compatible multi-API
providers/openai_compatible/multiapi.go, providers/openai_compatible/provider.go, providers/openai_compatible/resolver.go
Adds MultiAPI parser/extractor and NewMultiAPI constructor; Resolver gains APIType-aware URL resolution to select /v1/responses, /v1/completions, or /v1/chat/completions.
OpenAI Responses parser & extractor
providers/openai_compatible/responses_parser.go, providers/openai_compatible/responses_extractor.go, providers/openai_compatible/responses_test.go
New Responses-style request parser and response extractor: parse input/max_output_tokens/instructions/tools into BodyMetadata (messages/custom), extract output/usage/cache into ResponseMetadata, and tests for parsing/extraction and multi-API routing.
OpenAI provider wrapper
providers/openai/provider.go
New(apiKey) now constructs a MultiAPI-backed provider via NewMultiAPI(...) to enable multi-API handling while preserving the same public signature.
Interceptors / Billing
interceptors/billing.go, interceptors/billing_test.go
Billing interceptor now prefers provider from meta.Custom["provider"], uses shared DetectProviderFromModel helper, and always stores computed billing_result into response metadata when pricing is found; tests updated for provider expectations.
Examples & Docs
examples/basic/main.go, README.md, DESIGN.md
Replaces per-endpoint proxies with a single AutoRouter mounted at /, registers providers and model lookup, centralizes interceptors at router level, updates quick start and design docs to describe auto-routing and API/provider detection.
Model-to-provider lookup
pricing/modelsdev/adapter.go
Adds FindProviderForModel(model string) to return a provider ID for a given model from the adapter cache, with auto-load on cache miss/expiry.
Misc tests & plumbing
providers/.../responses_test.go, autorouter_test.go, detection_test.go, other _test.go files
Adds comprehensive tests covering API/provider detection, AutoRouter forwarding/ServeHTTP, model-prefix stripping, multi-API parsing/extraction, and many edge cases (invalid JSON, unknown paths, provider-specific payloads).
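The cache-miss behaviour the table describes for FindProviderForModel (auto-load on miss or expiry) could be sketched like this; the struct shape and field names are illustrative assumptions, not the adapter's real internals:

```go
package main

import (
	"fmt"
	"time"
)

// adapter caches a model->provider map and reloads it when the cache
// is empty or past its expiry, as the change summary describes.
type adapter struct {
	cache   map[string]string
	expires time.Time
	load    func() map[string]string // fetches fresh model/pricing data
}

// FindProviderForModel returns the provider ID for a model, or ""
// when the model is unknown, refreshing the cache as needed.
func (a *adapter) FindProviderForModel(model string) string {
	if a.cache == nil || time.Now().After(a.expires) {
		a.cache = a.load() // auto-load on cache miss/expiry
		a.expires = time.Now().Add(time.Hour)
	}
	return a.cache[model]
}

func main() {
	a := &adapter{load: func() map[string]string {
		return map[string]string{"gpt-4": "openai"}
	}}
	fmt.Println(a.FindProviderForModel("gpt-4")) // openai
}
```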
🚥 Pre-merge checks: ✅ 1 passed

  • Description Check — ✅ Passed (check skipped: CodeRabbit’s high-level summary is enabled)




- Add DetectAPITypeFromBodyAndProvider for body+provider detection
- Path detection now returns empty for unknown paths (routes to body detection)
- AutoRouter now detects provider first, then uses body+provider for API type
- Update example to show POST to / capability
- Add tests for new detection function
@jhaynie added the `enhancement` label (New feature or request) on Apr 14, 2026

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
interceptors/billing.go (1)

28-53: ⚠️ Potential issue | 🟠 Major

Routing and billing now disagree on provider IDs.

provider still comes from the file-local detector, but the new router detector in llmproxy/detection.go uses a different canonical set (googleai vs google, plus xai/perplexity/azure/bedrock). Requests can route successfully and still miss pricing here, which means respMeta.Custom["billing_result"] never gets set and autorouter.go cannot emit the new X-Gateway-* headers. Please reuse one shared provider-normalization path for both routing and billing.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@interceptors/billing.go` around lines 28 - 53, The billing interceptor uses a
local detectProvider(meta.Model) that produces a different provider id set than
the router; replace the file-local detector call with the shared
normalizer/detector from llmproxy/detection.go (e.g. call
llmproxy.DetectProvider or llmproxy.NormalizeProvider) so both routing and
billing use the exact same canonical provider id before calling i.Lookup and
llmproxy.CalculateCost; ensure you update the import if necessary and propagate
the normalized provider variable into Lookup(provider, meta.Model) and
CalculateCost(provider, ...) so respMeta.Custom["billing_result"] is
consistently set.
🧹 Nitpick comments (6)
providers/openai_compatible/multiapi.go (2)

40-60: Replace custom byteReader with bytes.Reader from standard library.

The custom byteReader type duplicates bytes.Reader functionality. Using bytes.NewReader(data) directly would simplify the code and reduce maintenance surface.

♻️ Proposed simplification
-type byteReader struct {
-	data []byte
-	pos  int
-}
-
-func NewBytesReader(data []byte) *byteReader {
-	return &byteReader{data: data}
-}
-
-func (r *byteReader) Read(p []byte) (n int, err error) {
-	if r.pos >= len(r.data) {
-		return 0, io.EOF
-	}
-	n = copy(p, r.data[r.pos:])
-	r.pos += n
-	return n, nil
-}
-
-func (r *byteReader) Close() error {
-	return nil
-}

Then update the Parse method:

 	switch apiType {
 	case llmproxy.APITypeResponses:
-		return p.responsesParser.Parse(io.NopCloser(NewBytesReader(data)))
+		return p.responsesParser.Parse(io.NopCloser(bytes.NewReader(data)))
 	default:
-		return p.chatCompletionsParser.Parse(io.NopCloser(NewBytesReader(data)))
+		return p.chatCompletionsParser.Parse(io.NopCloser(bytes.NewReader(data)))
 	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@providers/openai_compatible/multiapi.go` around lines 40 - 60, Replace the
custom byteReader type and its constructor NewBytesReader with the standard
library bytes.Reader: remove the byteReader struct, Read and Close
implementations, and any calls to NewBytesReader; instead create readers with
bytes.NewReader(data). Update usages in the Parse method (and any other places
referencing byteReader or NewBytesReader) to accept or use io.Reader from
bytes.NewReader, ensuring signatures/types match (e.g., io.ReadCloser ->
io.Reader or wrap bytes.NewReader with ioutil.NopCloser if a Close is required).

74-92: Extractor detection logic differs from parser detection - potential mismatch.

MultiAPIParser.Parse uses llmproxy.DetectAPIType(data) based on the presence of input field, while MultiAPIExtractor.Extract uses a different heuristic (presence of output + absence of choices).

If a provider returns an unexpected response format, the extractor could route differently than expected. Consider aligning the detection logic or documenting this intentional asymmetry (request detection by input field, response detection by output field is reasonable given different payloads).

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@providers/openai_compatible/multiapi.go` around lines 74 - 92, The extractor
detection in MultiAPIExtractor.Extract currently routes by checking "output" and
absence of "choices", which can diverge from MultiAPIParser.Parse's use of
llmproxy.DetectAPIType (which inspects the presence of "input"); update Extract
to use the same detection function (llmproxy.DetectAPIType) on the unmarshaled
response body or explicitly document the intentional difference so both paths
align; specifically, modify MultiAPIExtractor.Extract to call
llmproxy.DetectAPIType(data) and branch to e.responsesExtractor.Extract or
e.chatCompletionsExtractor.Extract based on the returned API type, ensuring
resp.Body is restored (io.NopCloser(bytes.NewReader(body))) before delegating.
autorouter.go (1)

233-240: Consider using strings.Index instead of custom indexOfSlash.

The standard library strings.Index(s, "/") provides the same functionality:

♻️ Proposed simplification
+import "strings"
+
 func stripProviderPrefix(model string) (stripped string, hasPrefix bool) {
-	idx := indexOfSlash(model)
+	idx := strings.Index(model, "/")
 	if idx < 0 {
 		return model, false
 	}
 	prefix := model[:idx]
 	if knownProviderPrefixes[prefix] {
 		return model[idx+1:], true
 	}
 	return model, false
 }
-
-func indexOfSlash(s string) int {
-	for i := 0; i < len(s); i++ {
-		if s[i] == '/' {
-			return i
-		}
-	}
-	return -1
-}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@autorouter.go` around lines 233 - 240, Replace the custom indexOfSlash
function with the standard library call strings.Index(s, "/") wherever
indexOfSlash is used; remove the indexOfSlash implementation and add/ensure the
"strings" import is present so callers (e.g., any code invoking indexOfSlash)
call strings.Index(s, "/") instead.
examples/basic/main.go (1)

225-246: Consider making endpoint list dynamic based on registered providers.

The startup logging shows a static list of supported endpoints regardless of which providers are actually registered. For example, /v1/messages (Anthropic) is listed even if only OpenAI is configured.

This could confuse users who see endpoints advertised that won't work without the corresponding provider. Consider either removing this static list or generating it dynamically based on registered providers.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@examples/basic/main.go` around lines 225 - 246, The startup logging currently
prints a hard-coded list via repeated logr.Info calls (the block that prints
"Supported endpoints:" and the subsequent endpoint lines); change this to emit
endpoints based on which providers are actually registered at startup: query
your provider registry (e.g., whatever slice/map you populate when registering
providers such as providers, providerRegistry or during
setupProviders/RegisterProvider) and build the supported-endpoints strings
dynamically (iterate registered providers and append their supported routes like
/v1/messages for Anthropic or /v1/chat/completions for OpenAI) before logging,
replacing the static logr.Info lines with a loop that logs only the endpoints
for active providers.
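A dynamic variant of that startup log might look like the following; the route table and provider names are assumptions for illustration, not the example's actual configuration:

```go
package main

import "fmt"

// endpointsFor maps provider IDs to the routes they serve; this table
// is an illustrative assumption, not the example's real config.
var endpointsFor = map[string][]string{
	"openai":    {"/v1/chat/completions", "/v1/responses", "/v1/completions"},
	"anthropic": {"/v1/messages"},
}

// activeEndpoints returns only the routes for providers that were
// actually registered, so startup logs never advertise dead routes.
func activeEndpoints(registered []string) []string {
	var out []string
	for _, p := range registered {
		out = append(out, endpointsFor[p]...)
	}
	return out
}

func main() {
	fmt.Println("Supported endpoints:")
	for _, e := range activeEndpoints([]string{"openai"}) {
		fmt.Println("  " + e)
	}
}
```

With only OpenAI registered, `/v1/messages` is never advertised, which addresses the confusion the comment describes.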
providers/openai_compatible/responses_extractor.go (1)

62-73: Consider handling multiple content items and content types.

The function only returns the first output_text content from the first message output item. If a response contains multiple content items (e.g., text plus tool calls) or multiple message outputs, they are silently ignored.

This may be intentional for simplicity, but downstream consumers expecting full response content may miss data. Consider whether aggregating all text content or exposing additional content types is needed.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@providers/openai_compatible/responses_extractor.go` around lines 62 - 73, The
current extractResponsesContent function only returns the first c.Text for the
first ResponsesOutputItem with Type == "message" and c.Type == "output_text";
change it to aggregate all relevant text content by iterating all output items
and all item.Content entries, collecting every c.Text where c.Type ==
"output_text" (and optionally support other c.Type values you need), then join
them (e.g., with "\n" or a configurable separator) and return the combined
string; update extractResponsesContent to return the aggregated result instead
of exiting on the first match so multiple message outputs and multiple content
items are preserved.
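The aggregation the prompt asks for amounts to collecting every output_text part instead of returning the first one. A minimal sketch, using simplified stand-ins for the Responses output structures (the real types live in responses_extractor.go):

```go
package main

import (
	"fmt"
	"strings"
)

// Simplified stand-ins for the Responses output structures.
type contentItem struct{ Type, Text string }
type outputItem struct {
	Type    string
	Content []contentItem
}

// aggregateText collects every output_text part across all message
// items instead of stopping at the first match, joined by newlines.
func aggregateText(items []outputItem) string {
	var parts []string
	for _, item := range items {
		if item.Type != "message" {
			continue
		}
		for _, c := range item.Content {
			if c.Type == "output_text" {
				parts = append(parts, c.Text)
			}
		}
	}
	return strings.Join(parts, "\n")
}

func main() {
	out := []outputItem{
		{Type: "message", Content: []contentItem{{Type: "output_text", Text: "Hello"}}},
		{Type: "message", Content: []contentItem{{Type: "output_text", Text: "World"}}},
	}
	fmt.Println(aggregateText(out))
}
```

Non-text content types (tool calls, refusals) are still skipped here; exposing those would need an extension beyond simple string joining.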
providers/openai_compatible/responses_parser.go (1)

36-51: Empty messages may be appended when input array items lack role/content.

When input is an array, items that are map[string]interface{} but lack role or content fields will result in llmproxy.Message{} with empty strings being appended. This could cause issues downstream if consumers expect non-empty messages.

Consider either skipping items that don't have valid content or adding validation:

♻️ Proposed validation
 			for _, item := range v {
 				if m, ok := item.(map[string]interface{}); ok {
 					msg := llmproxy.Message{}
 					if role, ok := m["role"].(string); ok {
 						msg.Role = role
 					}
 					if content, ok := m["content"].(string); ok {
 						msg.Content = content
 					}
+					if msg.Role != "" || msg.Content != "" {
-					msgs = append(msgs, msg)
+						msgs = append(msgs, msg)
+					}
 				}
 			}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@providers/openai_compatible/responses_parser.go` around lines 36 - 51, The
current loop that builds msgs can append empty llmproxy.Message when an item
lacks role or content; update the []interface{} case to validate each item (the
variables item, m, role, content and the constructed llmproxy.Message) and only
append to msgs if both role and content are present and non-empty (optionally
trim whitespace) before assigning meta.Messages; skip (or record) items missing
required fields to avoid injecting empty messages downstream.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@autorouter.go`:
- Around line 100-106: The json.Marshal error is being ignored when re-encoding
raw after stripProviderPrefix; update the logic around
stripProviderPrefix/raw/body so you capture and handle the error from
json.Marshal(raw) instead of discarding it — call json.Marshal(raw), check the
returned error, and either return that error or log it and abort the request so
we don't send the old body with the provider prefix; modify the block in
autorouter.go where stripProviderPrefix is used (the code that sets raw["model"]
and assigns body) to handle the marshal error and propagate or log it
appropriately.
- Around line 189-196: Move the body read before committing response headers:
call io.ReadAll(resp.Body) first (checking err) and if it fails use
http.Error(w, err.Error(), http.StatusInternalServerError) while headers are
still writable; only after successfully reading the body copy headers from resp
to w and call w.WriteHeader(resp.StatusCode). Also handle the error returned by
w.Write(body) (e.g., log it or return) instead of ignoring it. Ensure you
reference the existing resp.Body, io.ReadAll, w.WriteHeader, and w.Write call
sites when making the change.

In `@detection.go`:
- Around line 27-58: Currently DefaultProviderDetector calls
detectProviderFromHeaders first so heuristics like the generic "api-key" ->
"azure" can override an explicit model; update the logic so X-Provider remains
an immediate override but all other header heuristics run only after model-based
detection (i.e., check hint.Model and call DetectProviderFromModel(hint.Model)
before running non-X-Provider header heuristics), or alternatively restrict
detectProviderFromHeaders to only return a provider for the explicit X-Provider
header and move the other header checks into a secondary function that is
invoked only if DetectProviderFromModel returns empty; touch
DefaultProviderDetector, detectProviderFromHeaders, ProviderHint usage and the
call site to enforce the new precedence.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 44c78c73-a542-40af-bc5f-5a012f13ffc1

📥 Commits

Reviewing files that changed from the base of the PR and between e1eebce and 3fdc6b3.

📒 Files selected for processing (14)
  • apitype.go
  • autorouter.go
  • autorouter_test.go
  • detection.go
  • detection_test.go
  • examples/basic/main.go
  • interceptors/billing.go
  • providers/openai/provider.go
  • providers/openai_compatible/multiapi.go
  • providers/openai_compatible/provider.go
  • providers/openai_compatible/resolver.go
  • providers/openai_compatible/responses_extractor.go
  • providers/openai_compatible/responses_parser.go
  • providers/openai_compatible/responses_test.go
📜 Review details
🔇 Additional comments (11)
providers/openai/provider.go (1)

22-23: Nice drop-in upgrade path for OpenAI users.

This keeps the public openai.New(...) API stable while enabling /v1/responses support through the shared OpenAI-compatible stack.

providers/openai_compatible/responses_extractor.go (2)

13-60: LGTM - Extract method is well-structured.

The extraction logic correctly maps OpenAI Responses API fields to the common ResponseMetadata structure. Good use of conditional cache usage handling and proper error propagation.


75-119: LGTM - Struct definitions align with OpenAI Responses API.

The response structures are well-defined with appropriate JSON tags and optional field handling.

examples/basic/main.go (2)

49-153: LGTM - Provider registration is clean and well-organized.

Each provider is conditionally created based on environment variables, with appropriate error handling via log.Fatalf. The empty providers check at line 151 ensures at least one provider is available before proceeding.


200-221: LGTM - AutoRouter configuration is well-structured.

The interceptor chain is properly ordered (retry first, then tracing/logging, then metrics), and the fallback provider setup with providers[0] is safe since the empty check occurs earlier.

providers/openai_compatible/responses_parser.go (1)

69-114: LGTM - Request struct and custom unmarshaling.

The ResponsesRequest struct properly captures both known fields and extra/custom fields via UnmarshalJSON. The known fields list is comprehensive for the Responses API.

providers/openai_compatible/responses_test.go (3)

12-58: LGTM - Comprehensive parser test coverage.

Tests cover the primary ResponsesParser functionality including model extraction, max_tokens mapping, message conversion from string input, custom fields (instructions), and api_type tagging.


87-174: LGTM - Thorough extractor test coverage.

The test validates all key extraction paths: ID, model, token usage (prompt/completion/total), choices with message content, cache_usage in Custom, and status field. Good edge case testing with cached tokens.


176-262: LGTM - Multi-API and resolver integration tests.

Tests verify that MultiAPIParser correctly routes to chat-completions vs responses parsers, and that Resolver generates correct URLs based on api_type. Good coverage of the routing logic.

autorouter.go (2)

12-52: LGTM - AutoRouter struct and initialization.

The functional options pattern is well-implemented, with sensible defaults (new registry, default detector, default HTTP client). The design allows flexible configuration while providing good defaults.


209-219: No action needed—prefixes are already aligned.

The knownProviderPrefixes list matches the explicit provider prefixes checked in DetectProviderFromModel. Both recognize the same nine providers: openai, anthropic, googleai, groq, fireworks, xai, perplexity, bedrock, and azure. Models with explicit prefixes (e.g., openai/gpt-4) are correctly validated and stripped, while pattern-based detection handles models without explicit prefixes independently.


@coderabbitai coderabbitai bot left a comment


♻️ Duplicate comments (2)
autorouter.go (2)

95-102: ⚠️ Potential issue | 🟡 Minor

Silently ignoring json.Marshal error could hide data corruption.

If json.Marshal(raw) fails on line 100, the error is discarded and the original body (with the provider prefix still in the model name) is forwarded upstream. This could cause unexpected behavior.


🛡️ Proposed fix
 	if raw != nil {
 		if strippedModel, hasPrefix := stripProviderPrefix(model); hasPrefix {
 			raw["model"] = strippedModel
 			model = strippedModel
-			body, _ = json.Marshal(raw)
+			if newBody, err := json.Marshal(raw); err == nil {
+				body = newBody
+			}
+			// If marshal fails, continue with original body (prefix intact)
 		}
 	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@autorouter.go` around lines 95 - 102, The code currently calls
json.Marshal(raw) and discards the error, which can let a failed marshal result
in forwarding the original body with the unstripped provider prefix; change this
so you check the error return from json.Marshal(raw), only assign body and
update model when err == nil, and propagate or return the marshal error (or log
and return an upstream error) instead of ignoring it; update the block around
stripProviderPrefix/raw/model/body (the json.Marshal call) to handle the error
(e.g., if err != nil { return fmt.Errorf("marshal modified request body: %w",
err) } ) so callers won't receive a stale body.

173-199: ⚠️ Potential issue | 🟠 Major

Error handling after WriteHeader is ineffective.

After w.WriteHeader(resp.StatusCode) on line 191, the response headers are committed. If io.ReadAll fails on line 193, calling http.Error on line 195 won't change the status code—it will append error text to the response body, potentially corrupting output. Additionally, w.Write(body) error is ignored on line 198.


🐛 Proposed fix - read body before writing headers
 func (a *AutoRouter) ServeHTTP(w http.ResponseWriter, r *http.Request) {
 	resp, meta, err := a.Forward(r.Context(), r)
 	if err != nil {
 		http.Error(w, err.Error(), http.StatusInternalServerError)
 		return
 	}
 	defer resp.Body.Close()

+	body, err := io.ReadAll(resp.Body)
+	if err != nil {
+		http.Error(w, err.Error(), http.StatusInternalServerError)
+		return
+	}
+
 	for k, v := range resp.Header {
 		w.Header()[k] = v
 	}

 	if billing, ok := meta.Custom["billing_result"].(BillingResult); ok {
 		w.Header().Set("X-Gateway-Cost", fmt.Sprintf("%.6f", billing.TotalCost))
 		w.Header().Set("X-Gateway-Prompt-Tokens", fmt.Sprintf("%d", billing.PromptTokens))
 		w.Header().Set("X-Gateway-Completion-Tokens", fmt.Sprintf("%d", billing.CompletionTokens))
 	}

 	w.WriteHeader(resp.StatusCode)
-
-	body, err := io.ReadAll(resp.Body)
-	if err != nil {
-		http.Error(w, err.Error(), http.StatusInternalServerError)
-		return
-	}
-	w.Write(body)
+	_, _ = w.Write(body)
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@autorouter.go` around lines 173 - 199, AutoRouter.ServeHTTP commits headers
too early: move the io.ReadAll(resp.Body) call (and its error handling) to
before calling w.WriteHeader(resp.StatusCode), and only call
w.WriteHeader(resp.StatusCode) after successfully reading the body and setting
any meta headers (e.g., X-Gateway-Cost from BillingResult). If io.ReadAll fails,
write an error status (e.g., http.StatusBadGateway or
http.StatusInternalServerError) before any headers are committed and return;
also check and handle the error from w.Write(body) (log or return) so write
failures aren't ignored. Ensure resp.Body is still closed via the existing defer
and that header copying from resp.Header remains intact.
🧹 Nitpick comments (1)
autorouter.go (1)

211-242: Consider using strings.Index instead of custom indexOfSlash.

The indexOfSlash function reimplements strings.Index(s, "/"). Using the standard library is more idiomatic.

♻️ Proposed refactor
+import "strings"
+
 func stripProviderPrefix(model string) (stripped string, hasPrefix bool) {
-	idx := indexOfSlash(model)
+	idx := strings.Index(model, "/")
 	if idx < 0 {
 		return model, false
 	}
 	prefix := model[:idx]
 	if knownProviderPrefixes[prefix] {
 		return model[idx+1:], true
 	}
 	return model, false
 }
-
-func indexOfSlash(s string) int {
-	for i := 0; i < len(s); i++ {
-		if s[i] == '/' {
-			return i
-		}
-	}
-	return -1
-}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@autorouter.go` around lines 211 - 242, Replace the custom indexOfSlash
implementation with the standard library call strings.Index in
stripProviderPrefix: remove indexOfSlash and call strings.Index(model, "/")
instead, update imports to include "strings", and ensure stripProviderPrefix
still returns the same (stripped, hasPrefix) semantics using the returned index;
keep knownProviderPrefixes and the rest unchanged.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 62ae91fd-5bf0-4d3c-8de8-f9ec9c2a5993

📥 Commits

Reviewing files that changed from the base of the PR and between 3fdc6b3 and 1a6673a.

📒 Files selected for processing (4)
  • apitype.go
  • autorouter.go
  • detection_test.go
  • examples/basic/main.go
✅ Files skipped from review due to trivial changes (1)
  • detection_test.go
🚧 Files skipped from review as they are similar to previous changes (1)
  • apitype.go
📜 Review details
🔇 Additional comments (11)
examples/basic/main.go (4)

49-56: LGTM!

Provider collection pattern is clean. The slice approach simplifies the subsequent registration with AutoRouter.


60-149: LGTM!

Provider initialization follows a consistent pattern with appropriate error handling via log.Fatalf for creation failures.


200-221: LGTM!

AutoRouter configuration is well-structured. The interceptor ordering (retry → tracing → logging → metrics → headers) is sensible, and the fallback provider access is safe due to the earlier empty-providers check at line 151.


223-258: LGTM!

The single handler at "/" cleanly demonstrates the auto-routing capability, and the startup logging provides helpful guidance for users.

autorouter.go (7)

12-52: LGTM!

The AutoRouter struct and functional options pattern follow Go conventions. Default initialization with NewRegistry(), DefaultProviderDetector, and http.DefaultClient provides sensible defaults while allowing customization.


63-77: LGTM!

Reading the full body and extracting the model field is appropriate. Silently continuing on JSON unmarshal failure is acceptable since the body might not be JSON, and provider detection can fall back to header-based detection.


78-93: LGTM!

Provider detection logic correctly prioritizes detected provider, then fallback, with a clear sentinel error when no provider is available.


104-118: LGTM!

API type detection with path precedence over body detection is logical. The defensive nil check for meta.Custom before assignment prevents potential panics.


120-157: LGTM!

Upstream request construction is correct: URL resolution, header propagation, request enrichment, and context value attachment are all properly handled. The interceptor chain wrapping pattern is clean.


159-171: LGTM!

The roundTrip method correctly delegates to the HTTP client and extracts response metadata. Per the ResponseExtractor interface contract, the body is consumed and raw bytes returned for re-attachment.


201-209: LGTM!

The ProviderError type and ErrNoProvider sentinel provide clear error semantics for callers to identify when no provider is available.

- Handle json.Marshal error when stripping provider prefix
- Read response body before writing headers in ServeHTTP
- Fix provider detection precedence (X-Provider > model > headers)
- Use shared DetectProviderFromModel in billing interceptor
- Replace custom indexOfSlash with strings.Index
- Replace custom byteReader with bytes.NewReader
- Use DetectAPIType in MultiAPIExtractor for consistency
- Aggregate all output_text content in Responses extractor
- Validate role/content before appending messages in Responses parser
- Add FindProviderForModel to models.dev adapter
- Add ModelProviderLookup option to AutoRouter
- Update example to use models.dev for provider detection

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 6

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
interceptors/billing.go (1)

27-35: ⚠️ Potential issue | 🟠 Major

Billing should use the resolved provider, not the rewritten model.

AutoRouter.Forward strips provider prefixes before parsing the request body, so meta.Model loses explicit routing info. Requests like azure/gpt-4o, fireworks/llama-3-70b, or X-Provider: azure will be billed under the heuristic provider from the stripped model instead of the provider that actually handled the request.

💡 Suggested direction
-	// Use shared provider detection for consistency with routing
-	provider := llmproxy.DetectProviderFromModel(meta.Model)
+	provider, _ := meta.Custom["provider"].(string)
+	if provider == "" {
+		provider = llmproxy.DetectProviderFromModel(meta.Model)
+	}

AutoRouter should populate meta.Custom["provider"] with the resolved provider before the interceptor chain runs.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@interceptors/billing.go` around lines 27 - 35, The billing interceptor is
using llmproxy.DetectProviderFromModel(meta.Model) which uses the rewritten
model string and loses the actual resolved router; change provider resolution to
prefer the router-populated value in meta.Custom["provider"] (use it if present
and non-empty), falling back to llmproxy.DetectProviderFromModel(meta.Model)
only if meta.Custom["provider"] is absent, and then call i.Lookup(provider,
meta.Model) (with the same fallback to i.Lookup("", meta.Model) if not found) so
billing uses the resolved provider; update references to provider in the
billing.go interceptor accordingly.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@autorouter.go`:
- Around line 95-104: The code currently falls back to a.fallbackProvider when
a.registry.Get(providerName) returns nil even though providerName was explicitly
provided; change the logic so that when providerName != "" you try to get the
provider via a.registry.Get(providerName) and if that lookup fails (nil/false)
you immediately return ErrNoProvider instead of using a.fallbackProvider; only
use a.fallbackProvider when providerName is empty (i.e., detection failed).
Reference: providerName, a.registry.Get, provider, a.fallbackProvider,
ErrNoProvider.

In `@DESIGN.md`:
- Around line 86-89: The fenced code block containing the signature sketch for
Forward(ctx, req) -> (resp, meta, err) and ServeHTTP(w, r) should include a
language tag to satisfy markdownlint MD040; update the triple-backtick fence to
use "text" (i.e., ```text) so the snippet is treated as plain text rather than
code.


In `@providers/openai_compatible/multiapi.go`:
- Around line 51-65: In MultiAPIExtractor.Extract, after io.ReadAll(resp.Body)
returns and before you overwrite resp.Body with
io.NopCloser(bytes.NewReader(body)), close the original resp.Body to avoid
leaking the upstream HTTP connection; call resp.Body.Close() (and optionally
handle/log its error) right after reading the body and before branching to
responsesExtractor.Extract or chatCompletionsExtractor.Extract so the original
transport is released.
- Around line 57-65: DetectAPIType currently inspects request-only fields and
will mislabel /v1/responses replies so responsesExtractor.Extract gets bypassed;
fix by changing the routing in this block to detect response payloads (inspect
response body for response-specific keys like "output", "status", or "usage") or
use a separate response-detection helper and branch on that before falling back
to DetectAPIType — update the apiType logic (variable apiType, function
llmproxy.DetectAPIType) so that when response-specific fields are present you
call e.responsesExtractor.Extract(resp) and otherwise call
e.chatCompletionsExtractor.Extract(resp).


ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: d376346b-1808-4bc3-a894-90a91edcad2c

📥 Commits

Reviewing files that changed from the base of the PR and between 1a6673a and 60e799e.

📒 Files selected for processing (11)
  • DESIGN.md
  • README.md
  • autorouter.go
  • detection.go
  • examples/basic/main.go
  • interceptors/billing.go
  • interceptors/billing_test.go
  • pricing/modelsdev/adapter.go
  • providers/openai_compatible/multiapi.go
  • providers/openai_compatible/responses_extractor.go
  • providers/openai_compatible/responses_parser.go
✅ Files skipped from review due to trivial changes (1)
  • README.md
🚧 Files skipped from review as they are similar to previous changes (1)
  • examples/basic/main.go
📜 Review details
🧰 Additional context used
🪛 markdownlint-cli2 (0.22.0)
DESIGN.md

[warning] 86-86: Fenced code blocks should have a language specified

(MD040, fenced-code-language)


[warning] 335-335: Tables should be surrounded by blank lines

(MD058, blanks-around-tables)

Comment on lines +99 to +106
**Configuration options:**

- `WithAutoRouterRegistry(r)` — Use custom registry
- `WithAutoRouterDetector(d)` — Custom provider detection logic
- `WithAutoRouterInterceptor(i)` — Add interceptor to chain
- `WithAutoRouterHTTPClient(c)` — Custom HTTP client
- `WithAutoRouterFallbackProvider(p)` — Provider when detection fails


⚠️ Potential issue | 🟡 Minor

Document WithAutoRouterModelProviderLookup.

The AutoRouter option list omits the new model-provider lookup hook, so the models.dev-backed detection path added in this PR is not discoverable from the public design doc.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@DESIGN.md` around lines 99 - 106, Add documentation for the new AutoRouter
option WithAutoRouterModelProviderLookup to the Configuration options list:
describe its purpose (hook used to lookup model -> provider mapping for
models.dev-backed detection), its expected signature/type (a function that
accepts model identifier/context and returns a Provider or nil/error), and how
it interacts with existing options like WithAutoRouterDetector and
WithAutoRouterFallbackProvider; place this entry alongside the other options and
include a short usage note showing that supplying
WithAutoRouterModelProviderLookup enables models.dev-backed detection path for
AutoRouter.

Comment on lines +192 to +206
func (a *Adapter) FindProviderForModel(model string) string {
	a.mu.RLock()
	defer a.mu.RUnlock()

	if a.data == nil {
		return ""
	}

	for providerID, provider := range a.data {
		if _, exists := provider.Models[model]; exists {
			return providerID
		}
	}
	return ""
}

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Scripts executed against agentuity/llmproxy (verification outputs collapsed):

	# Adapter struct definition and the Lookup lazy-load pattern
	cat -n pricing/modelsdev/adapter.go | head -100
	cat -n pricing/modelsdev/adapter.go | sed -n '158,187p'

	# Usages of FindProviderForModel, including the AutoRouter integration
	rg -n "FindProviderForModel" --type go
	rg -n -B 5 -A 5 "FindProviderForModel" examples/basic/main.go
	rg -n "type AutoRouter" --type go
	cat -n autorouter.go | head -150
	rg -n "modelProviderLookup" autorouter.go -A 3 -B 3
	rg -n "type ModelProviderLookup" --type go

	# Expiry handling and test coverage around Lookup
	rg -n "\.Lookup\(" --type go -B 2 -A 2
	rg -n "Lookup" pricing/modelsdev/adapter_test.go -B 3 -A 3
	rg -n "deterministic|order|sort" pricing/modelsdev/ --type go

Add lazy-load mechanism to avoid silent failures on cold start.

FindProviderForModel returns "" immediately when the cache is not loaded (line 196), unlike Lookup which checks for expiry and calls Load() before searching. This causes AutoRouter's model lookup to silently fail on cold start unless Load() was called separately. Follow the same lazy-load pattern as Lookup (lines 158–165).

Suggested fix
 func (a *Adapter) FindProviderForModel(model string) string {
+	if model == "" {
+		return ""
+	}
+
 	a.mu.RLock()
-	defer a.mu.RUnlock()
-
-	if a.data == nil {
+	needLoad := len(a.data) == 0 || (!a.expires.IsZero() && time.Now().After(a.expires))
+	a.mu.RUnlock()
+	if needLoad {
+		_ = a.Load(context.Background())
+	}
+
+	a.mu.RLock()
+	defer a.mu.RUnlock()
+
+	for providerID, provider := range a.data {
+		if _, exists := provider.Models[model]; exists {
+			return providerID
+		}
 	}
-
-	for providerID, provider := range a.data {
-		if _, exists := provider.Models[model]; exists {
-			return providerID
-		}
-	}
 	return ""
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pricing/modelsdev/adapter.go` around lines 192 - 206, FindProviderForModel
currently returns "" when the cache is nil, causing silent failures on cold
start; mirror the lazy-load pattern in Lookup by checking the same expiry
condition (a.data == nil || time.Since(a.lastLoad) > a.ttl), releasing the read
lock, calling a.Load(), then re-acquiring the read lock before searching. Ensure
you use the same error handling/return behavior as Lookup when Load() fails, and
preserve correct locking (unlock before calling Load and re-lock afterward).
Locate the implementation in Adapter.FindProviderForModel and use the same
fields/methods referenced in Lookup (a.data, a.lastLoad, a.ttl, Load()).

- Return ErrNoProvider when explicit provider lookup fails (don't use fallback)
- Store resolved provider in meta.Custom for billing interceptor
- Add language tag to code fence in DESIGN.md (MD040)
- Document WithAutoRouterModelProviderLookup in DESIGN.md
- Add lazy-load support to FindProviderForModel (matching Lookup behavior)
- Close original response body before restoring in MultiAPIExtractor
- Detect Responses API by response-specific fields (output/status) not request fields
- Prefer router-resolved provider in billing interceptor over model detection
@jhaynie jhaynie merged commit b1ae20e into main Apr 14, 2026
1 of 2 checks passed
@jhaynie jhaynie deleted the feature/autorouter-and-responses-api branch April 14, 2026 03:43

Labels

enhancement (New feature or request)


1 participant