feat: add AutoRouter, Responses API support, and provider detection #6
Conversation
- Add AutoRouter for automatic provider/API routing from a single endpoint
- Add OpenAI Responses API parser, extractor, and resolver
- Add API type detection (chat completions, responses, completions, messages)
- Add provider detection from model name patterns and the X-Provider header
- Strip provider prefix from model names (e.g., openai/gpt-4 -> gpt-4)
- Update example to use AutoRouter with simplified setup
- Store billing result in response metadata for header injection
- Comprehensive test coverage for new functionality
Caution: Review failed - pull request was closed or merged during review.

📝 Walkthrough

Adds API-type detection and provider detection primitives, a new AutoRouter HTTP handler that auto-selects providers, strips model prefixes, parses requests, resolves upstream URLs, and forwards requests. Introduces OpenAI multi-API parsing/extraction (Responses + chat/completions), billing metadata propagation, model→provider lookup, extensive tests, and updated docs/examples.
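The routing idea in the walkthrough can be sketched end to end: detect the provider from the model prefix, strip it, and choose an upstream. This is a minimal, self-contained illustration; the names, prefix table, and URLs are stand-ins, not the library's actual API.

```go
package main

import (
	"fmt"
	"strings"
)

// knownPrefixes maps a provider prefix to an upstream base URL.
// The table contents are illustrative, not the router's real config.
var knownPrefixes = map[string]string{
	"openai":    "https://api.openai.com/v1",
	"anthropic": "https://api.anthropic.com/v1",
	"groq":      "https://api.groq.com/openai/v1",
}

// routeModel strips a known provider prefix from the model name and
// returns the stripped model plus the upstream base URL for that provider.
func routeModel(model string) (stripped, upstream string, ok bool) {
	if idx := strings.Index(model, "/"); idx >= 0 {
		if base, found := knownPrefixes[model[:idx]]; found {
			return model[idx+1:], base, true
		}
	}
	// No recognized prefix: leave the model untouched.
	return model, "", false
}

func main() {
	m, u, _ := routeModel("openai/gpt-4")
	fmt.Println(m, u) // gpt-4 https://api.openai.com/v1
}
```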
🚥 Pre-merge checks: ✅ Passed checks (1 passed)
- Add DetectAPITypeFromBodyAndProvider for body+provider detection
- Path detection now returns empty for unknown paths (routes to body detection)
- AutoRouter now detects provider first, then uses body+provider for API type
- Update example to show POST to / capability
- Add tests for the new detection function
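The detection order described in this commit (path first; for unknown paths, fall back to the body plus the already-detected provider) can be sketched as follows. `detectAPITypeFromPath` and `detectAPITypeFromBodyAndProvider` are simplified stand-ins for the library's real functions, with an assumed "input implies Responses API" heuristic.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// detectAPITypeFromPath returns "" for unknown paths so the caller can
// fall back to body-based detection, matching the commit's description.
func detectAPITypeFromPath(path string) string {
	switch path {
	case "/v1/chat/completions":
		return "chat_completions"
	case "/v1/responses":
		return "responses"
	case "/v1/messages":
		return "messages"
	}
	return "" // unknown path: defer to body detection
}

// detectAPITypeFromBodyAndProvider inspects the body together with the
// already-detected provider. The heuristics here are illustrative.
func detectAPITypeFromBodyAndProvider(body []byte, provider string) string {
	var m map[string]json.RawMessage
	if err := json.Unmarshal(body, &m); err != nil {
		return "chat_completions" // safe default for non-JSON bodies
	}
	if _, hasInput := m["input"]; hasInput {
		return "responses" // Responses API requests carry an "input" field
	}
	if provider == "anthropic" {
		return "messages"
	}
	return "chat_completions"
}

func main() {
	fmt.Println(detectAPITypeFromPath("/unknown") == "") // true
	fmt.Println(detectAPITypeFromBodyAndProvider([]byte(`{"input":"hi"}`), "openai"))
}
```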
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
interceptors/billing.go (1)
28-53: ⚠️ Potential issue | 🟠 Major - Routing and billing now disagree on provider IDs.

`provider` still comes from the file-local detector, but the new router detector in `llmproxy/detection.go` uses a different canonical set (`googleai` vs `xai`/`perplexity`/`azure`/`bedrock`). Requests can route successfully and still miss pricing here, which means `respMeta.Custom["billing_result"]` never gets set and `autorouter.go` cannot emit the new `X-Gateway-*` headers. Please reuse one shared provider-normalization path for both routing and billing.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@interceptors/billing.go` around lines 28 - 53, The billing interceptor uses a local detectProvider(meta.Model) that produces a different provider id set than the router; replace the file-local detector call with the shared normalizer/detector from llmproxy/detection.go (e.g. call llmproxy.DetectProvider or llmproxy.NormalizeProvider) so both routing and billing use the exact same canonical provider id before calling i.Lookup and llmproxy.CalculateCost; ensure you update the import if necessary and propagate the normalized provider variable into Lookup(provider, meta.Model) and CalculateCost(provider, ...) so respMeta.Custom["billing_result"] is consistently set.
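A minimal sketch of the suggested fix: routing and billing both call one shared detector, so the canonical provider IDs cannot drift apart. `detectProviderFromModel` is a simplified stand-in for `llmproxy.DetectProviderFromModel`, with an invented pattern table.

```go
package main

import (
	"fmt"
	"strings"
)

// detectProviderFromModel is the single shared normalizer. The pattern
// table below is illustrative, not the library's real heuristics.
func detectProviderFromModel(model string) string {
	switch {
	case strings.HasPrefix(model, "gpt-"), strings.HasPrefix(model, "o1"):
		return "openai"
	case strings.HasPrefix(model, "claude-"):
		return "anthropic"
	case strings.HasPrefix(model, "gemini-"):
		return "googleai"
	}
	return ""
}

func main() {
	// Because routing and billing derive the provider from the same
	// function, a model can never route under one ID and bill under another.
	routed := detectProviderFromModel("claude-3-opus")
	billed := detectProviderFromModel("claude-3-opus")
	fmt.Println(routed == billed, routed)
}
```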
🧹 Nitpick comments (6)
providers/openai_compatible/multiapi.go (2)
40-60: Replace custom `byteReader` with `bytes.Reader` from the standard library.

The custom `byteReader` type duplicates `bytes.Reader` functionality. Using `bytes.NewReader(data)` directly would simplify the code and reduce the maintenance surface.

♻️ Proposed simplification
```diff
-type byteReader struct {
-	data []byte
-	pos  int
-}
-
-func NewBytesReader(data []byte) *byteReader {
-	return &byteReader{data: data}
-}
-
-func (r *byteReader) Read(p []byte) (n int, err error) {
-	if r.pos >= len(r.data) {
-		return 0, io.EOF
-	}
-	n = copy(p, r.data[r.pos:])
-	r.pos += n
-	return n, nil
-}
-
-func (r *byteReader) Close() error {
-	return nil
-}
```

Then update the Parse method:

```diff
 switch apiType {
 case llmproxy.APITypeResponses:
-	return p.responsesParser.Parse(io.NopCloser(NewBytesReader(data)))
+	return p.responsesParser.Parse(io.NopCloser(bytes.NewReader(data)))
 default:
-	return p.chatCompletionsParser.Parse(io.NopCloser(NewBytesReader(data)))
+	return p.chatCompletionsParser.Parse(io.NopCloser(bytes.NewReader(data)))
 }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@providers/openai_compatible/multiapi.go` around lines 40 - 60, Replace the custom byteReader type and its constructor NewBytesReader with the standard library bytes.Reader: remove the byteReader struct, Read and Close implementations, and any calls to NewBytesReader; instead create readers with bytes.NewReader(data). Update usages in the Parse method (and any other places referencing byteReader or NewBytesReader) to accept or use io.Reader from bytes.NewReader, ensuring signatures/types match (e.g., io.ReadCloser -> io.Reader or wrap bytes.NewReader with ioutil.NopCloser if a Close is required).
74-92: Extractor detection logic differs from parser detection - potential mismatch.

`MultiAPIParser.Parse` uses `llmproxy.DetectAPIType(data)` based on the presence of the `input` field, while `MultiAPIExtractor.Extract` uses a different heuristic (presence of `output` plus absence of `choices`). If a provider returns an unexpected response format, the extractor could route differently than expected. Consider aligning the detection logic or documenting this intentional asymmetry (request detection by the `input` field, response detection by the `output` field is reasonable given the different payloads).

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@providers/openai_compatible/multiapi.go` around lines 74 - 92, The extractor detection in MultiAPIExtractor.Extract currently routes by checking "output" and absence of "choices", which can diverge from MultiAPIParser.Parse's use of llmproxy.DetectAPIType (which inspects the presence of "input"); update Extract to use the same detection function (llmproxy.DetectAPIType) on the unmarshaled response body or explicitly document the intentional difference so both paths align; specifically, modify MultiAPIExtractor.Extract to call llmproxy.DetectAPIType(data) and branch to e.responsesExtractor.Extract or e.chatCompletionsExtractor.Extract based on the returned API type, ensuring resp.Body is restored (io.NopCloser(bytes.NewReader(body))) before delegating.

autorouter.go (1)
233-240: Consider using `strings.Index` instead of custom `indexOfSlash`.

The standard library `strings.Index(s, "/")` provides the same functionality:

♻️ Proposed simplification
```diff
+import "strings"
+
 func stripProviderPrefix(model string) (stripped string, hasPrefix bool) {
-	idx := indexOfSlash(model)
+	idx := strings.Index(model, "/")
 	if idx < 0 {
 		return model, false
 	}
 	prefix := model[:idx]
 	if knownProviderPrefixes[prefix] {
 		return model[idx+1:], true
 	}
 	return model, false
 }
-
-func indexOfSlash(s string) int {
-	for i := 0; i < len(s); i++ {
-		if s[i] == '/' {
-			return i
-		}
-	}
-	return -1
-}
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@autorouter.go` around lines 233 - 240, Replace the custom indexOfSlash function with the standard library call strings.Index(s, "/") wherever indexOfSlash is used; remove the indexOfSlash implementation and add/ensure the "strings" import is present so callers (e.g., any code invoking indexOfSlash) call strings.Index(s, "/") instead.

examples/basic/main.go (1)
225-246: Consider making the endpoint list dynamic based on registered providers.

The startup logging shows a static list of supported endpoints regardless of which providers are actually registered. For example, `/v1/messages` (Anthropic) is listed even if only OpenAI is configured. This could confuse users who see endpoints advertised that won't work without the corresponding provider. Consider either removing this static list or generating it dynamically based on registered providers.
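A sketch of the dynamic approach, assuming a simple provider→routes table; the table contents and helper name are illustrative, not the example's real structure:

```go
package main

import "fmt"

// providerRoutes maps a provider name to the endpoints it serves.
// The contents here are illustrative.
var providerRoutes = map[string][]string{
	"openai":    {"/v1/chat/completions", "/v1/responses"},
	"anthropic": {"/v1/messages"},
}

// supportedEndpoints returns only the routes of providers that were
// actually registered, so startup logging never advertises dead endpoints.
func supportedEndpoints(registered []string) []string {
	var out []string
	for _, p := range registered {
		out = append(out, providerRoutes[p]...)
	}
	return out
}

func main() {
	// With only OpenAI registered, /v1/messages is never advertised.
	for _, ep := range supportedEndpoints([]string{"openai"}) {
		fmt.Println(ep)
	}
}
```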
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@examples/basic/main.go` around lines 225 - 246, The startup logging currently prints a hard-coded list via repeated logr.Info calls (the block that prints "Supported endpoints:" and the subsequent endpoint lines); change this to emit endpoints based on which providers are actually registered at startup: query your provider registry (e.g., whatever slice/map you populate when registering providers such as providers, providerRegistry or during setupProviders/RegisterProvider) and build the supported-endpoints strings dynamically (iterate registered providers and append their supported routes like /v1/messages for Anthropic or /v1/chat/completions for OpenAI) before logging, replacing the static logr.Info lines with a loop that logs only the endpoints for active providers.

providers/openai_compatible/responses_extractor.go (1)
62-73: Consider handling multiple content items and content types.

The function only returns the first `output_text` content from the first `message` output item. If a response contains multiple content items (e.g., text plus tool calls) or multiple message outputs, they are silently ignored. This may be intentional for simplicity, but downstream consumers expecting full response content may miss data. Consider whether aggregating all text content or exposing additional content types is needed.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@providers/openai_compatible/responses_extractor.go` around lines 62 - 73, The current extractResponsesContent function only returns the first c.Text for the first ResponsesOutputItem with Type == "message" and c.Type == "output_text"; change it to aggregate all relevant text content by iterating all output items and all item.Content entries, collecting every c.Text where c.Type == "output_text" (and optionally support other c.Type values you need), then join them (e.g., with "\n" or a configurable separator) and return the combined string; update extractResponsesContent to return the aggregated result instead of exiting on the first match so multiple message outputs and multiple content items are preserved.

providers/openai_compatible/responses_parser.go (1)
36-51: Empty messages may be appended when input array items lack role/content.

When `input` is an array, items that are `map[string]interface{}` but lack `role` or `content` fields will result in `llmproxy.Message{}` with empty strings being appended. This could cause issues downstream if consumers expect non-empty messages. Consider either skipping items that don't have valid content or adding validation:
♻️ Proposed validation
```diff
 for _, item := range v {
 	if m, ok := item.(map[string]interface{}); ok {
 		msg := llmproxy.Message{}
 		if role, ok := m["role"].(string); ok {
 			msg.Role = role
 		}
 		if content, ok := m["content"].(string); ok {
 			msg.Content = content
 		}
+		if msg.Role != "" || msg.Content != "" {
-		msgs = append(msgs, msg)
+			msgs = append(msgs, msg)
+		}
 	}
 }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@providers/openai_compatible/responses_parser.go` around lines 36 - 51, The current loop that builds msgs can append empty llmproxy.Message when an item lacks role or content; update the []interface{} case to validate each item (the variables item, m, role, content and the constructed llmproxy.Message) and only append to msgs if both role and content are present and non-empty (optionally trim whitespace) before assigning meta.Messages; skip (or record) items missing required fields to avoid injecting empty messages downstream.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@autorouter.go`:
- Around line 100-106: The json.Marshal error is being ignored when re-encoding
raw after stripProviderPrefix; update the logic around
stripProviderPrefix/raw/body so you capture and handle the error from
json.Marshal(raw) instead of discarding it — call json.Marshal(raw), check the
returned error, and either return that error or log it and abort the request so
we don't send the old body with the provider prefix; modify the block in
autorouter.go where stripProviderPrefix is used (the code that sets raw["model"]
and assigns body) to handle the marshal error and propagate or log it
appropriately.
- Around line 189-196: Move the body read before committing response headers:
call io.ReadAll(resp.Body) first (checking err) and if it fails use
http.Error(w, err.Error(), http.StatusInternalServerError) while headers are
still writable; only after successfully reading the body copy headers from resp
to w and call w.WriteHeader(resp.StatusCode). Also handle the error returned by
w.Write(body) (e.g., log it or return) instead of ignoring it. Ensure you
reference the existing resp.Body, io.ReadAll, w.WriteHeader, and w.Write call
sites when making the change.
In `@detection.go`:
- Around line 27-58: Currently DefaultProviderDetector calls
detectProviderFromHeaders first so heuristics like the generic "api-key" ->
"azure" can override an explicit model; update the logic so X-Provider remains
an immediate override but all other header heuristics run only after model-based
detection (i.e., check hint.Model and call DetectProviderFromModel(hint.Model)
before running non-X-Provider header heuristics), or alternatively restrict
detectProviderFromHeaders to only return a provider for the explicit X-Provider
header and move the other header checks into a secondary function that is
invoked only if DetectProviderFromModel returns empty; touch
DefaultProviderDetector, detectProviderFromHeaders, ProviderHint usage and the
call site to enforce the new precedence.
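The precedence described above (explicit X-Provider first, then model-based detection, then the weaker header heuristics) can be sketched as follows; the helper names and patterns are assumptions, not the package's real identifiers:

```go
package main

import (
	"fmt"
	"strings"
)

// detectProvider enforces the precedence: explicit X-Provider wins,
// model-based detection runs next, and generic header heuristics
// (e.g. a bare api-key header implying Azure) run only as a last resort.
func detectProvider(xProvider, model string, headers map[string]string) string {
	if xProvider != "" {
		return xProvider // explicit override always wins
	}
	if p := detectFromModel(model); p != "" {
		return p
	}
	return headerHeuristics(headers)
}

// detectFromModel is a stand-in for model-name pattern detection.
func detectFromModel(model string) string {
	if strings.HasPrefix(model, "gpt-") {
		return "openai"
	}
	return ""
}

// headerHeuristics holds the weak fallbacks that must not override a model.
func headerHeuristics(h map[string]string) string {
	if _, ok := h["api-key"]; ok {
		return "azure"
	}
	return ""
}

func main() {
	// An explicit model now beats the generic api-key -> azure heuristic.
	fmt.Println(detectProvider("", "gpt-4o", map[string]string{"api-key": "x"}))
}
```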
---
Outside diff comments:
In `@interceptors/billing.go`:
- Around line 28-53: The billing interceptor uses a local
detectProvider(meta.Model) that produces a different provider id set than the
router; replace the file-local detector call with the shared normalizer/detector
from llmproxy/detection.go (e.g. call llmproxy.DetectProvider or
llmproxy.NormalizeProvider) so both routing and billing use the exact same
canonical provider id before calling i.Lookup and llmproxy.CalculateCost; ensure
you update the import if necessary and propagate the normalized provider
variable into Lookup(provider, meta.Model) and CalculateCost(provider, ...) so
respMeta.Custom["billing_result"] is consistently set.
---
Nitpick comments:
In `@autorouter.go`:
- Around line 233-240: Replace the custom indexOfSlash function with the
standard library call strings.Index(s, "/") wherever indexOfSlash is used;
remove the indexOfSlash implementation and add/ensure the "strings" import is
present so callers (e.g., any code invoking indexOfSlash) call strings.Index(s,
"/") instead.
In `@examples/basic/main.go`:
- Around line 225-246: The startup logging currently prints a hard-coded list
via repeated logr.Info calls (the block that prints "Supported endpoints:" and
the subsequent endpoint lines); change this to emit endpoints based on which
providers are actually registered at startup: query your provider registry
(e.g., whatever slice/map you populate when registering providers such as
providers, providerRegistry or during setupProviders/RegisterProvider) and build
the supported-endpoints strings dynamically (iterate registered providers and
append their supported routes like /v1/messages for Anthropic or
/v1/chat/completions for OpenAI) before logging, replacing the static logr.Info
lines with a loop that logs only the endpoints for active providers.
In `@providers/openai_compatible/multiapi.go`:
- Around line 40-60: Replace the custom byteReader type and its constructor
NewBytesReader with the standard library bytes.Reader: remove the byteReader
struct, Read and Close implementations, and any calls to NewBytesReader; instead
create readers with bytes.NewReader(data). Update usages in the Parse method
(and any other places referencing byteReader or NewBytesReader) to accept or use
io.Reader from bytes.NewReader, ensuring signatures/types match (e.g.,
io.ReadCloser -> io.Reader or wrap bytes.NewReader with ioutil.NopCloser if a
Close is required).
- Around line 74-92: The extractor detection in MultiAPIExtractor.Extract
currently routes by checking "output" and absence of "choices", which can
diverge from MultiAPIParser.Parse's use of llmproxy.DetectAPIType (which
inspects the presence of "input"); update Extract to use the same detection
function (llmproxy.DetectAPIType) on the unmarshaled response body or explicitly
document the intentional difference so both paths align; specifically, modify
MultiAPIExtractor.Extract to call llmproxy.DetectAPIType(data) and branch to
e.responsesExtractor.Extract or e.chatCompletionsExtractor.Extract based on the
returned API type, ensuring resp.Body is restored
(io.NopCloser(bytes.NewReader(body))) before delegating.
In `@providers/openai_compatible/responses_extractor.go`:
- Around line 62-73: The current extractResponsesContent function only returns
the first c.Text for the first ResponsesOutputItem with Type == "message" and
c.Type == "output_text"; change it to aggregate all relevant text content by
iterating all output items and all item.Content entries, collecting every c.Text
where c.Type == "output_text" (and optionally support other c.Type values you
need), then join them (e.g., with "\n" or a configurable separator) and return
the combined string; update extractResponsesContent to return the aggregated
result instead of exiting on the first match so multiple message outputs and
multiple content items are preserved.
In `@providers/openai_compatible/responses_parser.go`:
- Around line 36-51: The current loop that builds msgs can append empty
llmproxy.Message when an item lacks role or content; update the []interface{}
case to validate each item (the variables item, m, role, content and the
constructed llmproxy.Message) and only append to msgs if both role and content
are present and non-empty (optionally trim whitespace) before assigning
meta.Messages; skip (or record) items missing required fields to avoid injecting
empty messages downstream.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 44c78c73-a542-40af-bc5f-5a012f13ffc1
📒 Files selected for processing (14)
apitype.go, autorouter.go, autorouter_test.go, detection.go, detection_test.go, examples/basic/main.go, interceptors/billing.go, providers/openai/provider.go, providers/openai_compatible/multiapi.go, providers/openai_compatible/provider.go, providers/openai_compatible/resolver.go, providers/openai_compatible/responses_extractor.go, providers/openai_compatible/responses_parser.go, providers/openai_compatible/responses_test.go
📜 Review details
🔇 Additional comments (11)
providers/openai/provider.go (1)
22-23: Nice drop-in upgrade path for OpenAI users.

This keeps the public `openai.New(...)` API stable while enabling `/v1/responses` support through the shared OpenAI-compatible stack.

providers/openai_compatible/responses_extractor.go (2)
13-60: LGTM - Extract method is well-structured.

The extraction logic correctly maps OpenAI Responses API fields to the common `ResponseMetadata` structure. Good use of conditional cache usage handling and proper error propagation.
75-119: LGTM - Struct definitions align with OpenAI Responses API.

The response structures are well-defined with appropriate JSON tags and optional field handling.
examples/basic/main.go (2)
49-153: LGTM - Provider registration is clean and well-organized.

Each provider is conditionally created based on environment variables, with appropriate error handling via `log.Fatalf`. The empty providers check at line 151 ensures at least one provider is available before proceeding.
200-221: LGTM - AutoRouter configuration is well-structured.

The interceptor chain is properly ordered (retry first, then tracing/logging, then metrics), and the fallback provider setup with `providers[0]` is safe since the empty check occurs earlier.

providers/openai_compatible/responses_parser.go (1)
69-114: LGTM - Request struct and custom unmarshaling.

The `ResponsesRequest` struct properly captures both known fields and extra/custom fields via `UnmarshalJSON`. The known fields list is comprehensive for the Responses API.

providers/openai_compatible/responses_test.go (3)
12-58: LGTM - Comprehensive parser test coverage.

Tests cover the primary `ResponsesParser` functionality including model extraction, max_tokens mapping, message conversion from string input, custom fields (instructions), and api_type tagging.
87-174: LGTM - Thorough extractor test coverage.

The test validates all key extraction paths: ID, model, token usage (prompt/completion/total), choices with message content, cache_usage in Custom, and status field. Good edge case testing with cached tokens.
176-262: LGTM - Multi-API and resolver integration tests.

Tests verify that `MultiAPIParser` correctly routes to chat-completions vs responses parsers, and that `Resolver` generates correct URLs based on `api_type`. Good coverage of the routing logic.

autorouter.go (2)
12-52: LGTM - AutoRouter struct and initialization.

The functional options pattern is well-implemented, with sensible defaults (new registry, default detector, default HTTP client). The design allows flexible configuration while providing good defaults.
209-219: No action needed - prefixes are already aligned.

The `knownProviderPrefixes` list matches the explicit provider prefixes checked in `DetectProviderFromModel`. Both recognize the same nine providers: openai, anthropic, googleai, groq, fireworks, xai, perplexity, bedrock, and azure. Models with explicit prefixes (e.g., `openai/gpt-4`) are correctly validated and stripped, while pattern-based detection handles models without explicit prefixes independently.
♻️ Duplicate comments (2)
autorouter.go (2)
95-102: ⚠️ Potential issue | 🟡 Minor - Silently ignoring `json.Marshal` error could hide data corruption.

If `json.Marshal(raw)` fails on line 100, the error is discarded and the original `body` (with the provider prefix still in the model name) is forwarded upstream. This could cause unexpected behavior.
🛡️ Proposed fix
```diff
 if raw != nil {
 	if strippedModel, hasPrefix := stripProviderPrefix(model); hasPrefix {
 		raw["model"] = strippedModel
 		model = strippedModel
-		body, _ = json.Marshal(raw)
+		if newBody, err := json.Marshal(raw); err == nil {
+			body = newBody
+		}
+		// If marshal fails, continue with original body (prefix intact)
 	}
 }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@autorouter.go` around lines 95 - 102, The code currently calls json.Marshal(raw) and discards the error, which can let a failed marshal result in forwarding the original body with the unstripped provider prefix; change this so you check the error return from json.Marshal(raw), only assign body and update model when err == nil, and propagate or return the marshal error (or log and return an upstream error) instead of ignoring it; update the block around stripProviderPrefix/raw/model/body (the json.Marshal call) to handle the error (e.g., if err != nil { return fmt.Errorf("marshal modified request body: %w", err) } ) so callers won't receive a stale body.
173-199: ⚠️ Potential issue | 🟠 Major - Error handling after `WriteHeader` is ineffective.

After `w.WriteHeader(resp.StatusCode)` on line 191, the response headers are committed. If `io.ReadAll` fails on line 193, calling `http.Error` on line 195 won't change the status code; it will append error text to the response body, potentially corrupting output. Additionally, the `w.Write(body)` error is ignored on line 198.
🐛 Proposed fix - read body before writing headers
```diff
 func (a *AutoRouter) ServeHTTP(w http.ResponseWriter, r *http.Request) {
 	resp, meta, err := a.Forward(r.Context(), r)
 	if err != nil {
 		http.Error(w, err.Error(), http.StatusInternalServerError)
 		return
 	}
 	defer resp.Body.Close()

+	body, err := io.ReadAll(resp.Body)
+	if err != nil {
+		http.Error(w, err.Error(), http.StatusInternalServerError)
+		return
+	}
+
 	for k, v := range resp.Header {
 		w.Header()[k] = v
 	}

 	if billing, ok := meta.Custom["billing_result"].(BillingResult); ok {
 		w.Header().Set("X-Gateway-Cost", fmt.Sprintf("%.6f", billing.TotalCost))
 		w.Header().Set("X-Gateway-Prompt-Tokens", fmt.Sprintf("%d", billing.PromptTokens))
 		w.Header().Set("X-Gateway-Completion-Tokens", fmt.Sprintf("%d", billing.CompletionTokens))
 	}

 	w.WriteHeader(resp.StatusCode)
-
-	body, err := io.ReadAll(resp.Body)
-	if err != nil {
-		http.Error(w, err.Error(), http.StatusInternalServerError)
-		return
-	}
-	w.Write(body)
+	_, _ = w.Write(body)
 }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@autorouter.go` around lines 173 - 199, AutoRouter.ServeHTTP commits headers too early: move the io.ReadAll(resp.Body) call (and its error handling) to before calling w.WriteHeader(resp.StatusCode), and only call w.WriteHeader(resp.StatusCode) after successfully reading the body and setting any meta headers (e.g., X-Gateway-Cost from BillingResult). If io.ReadAll fails, write an error status (e.g., http.StatusBadGateway or http.StatusInternalServerError) before any headers are committed and return; also check and handle the error from w.Write(body) (log or return) so write failures aren't ignored. Ensure resp.Body is still closed via the existing defer and that header copying from resp.Header remains intact.
🧹 Nitpick comments (1)
autorouter.go (1)
211-242: Consider using `strings.Index` instead of custom `indexOfSlash`.

The `indexOfSlash` function reimplements `strings.Index(s, "/")`. Using the standard library is more idiomatic.

♻️ Proposed refactor
```diff
+import "strings"
+
 func stripProviderPrefix(model string) (stripped string, hasPrefix bool) {
-	idx := indexOfSlash(model)
+	idx := strings.Index(model, "/")
 	if idx < 0 {
 		return model, false
 	}
 	prefix := model[:idx]
 	if knownProviderPrefixes[prefix] {
 		return model[idx+1:], true
 	}
 	return model, false
 }
-
-func indexOfSlash(s string) int {
-	for i := 0; i < len(s); i++ {
-		if s[i] == '/' {
-			return i
-		}
-	}
-	return -1
-}
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@autorouter.go` around lines 211 - 242, Replace the custom indexOfSlash implementation with the standard library call strings.Index in stripProviderPrefix: remove indexOfSlash and call strings.Index(model, "/") instead, update imports to include "strings", and ensure stripProviderPrefix still returns the same (stripped, hasPrefix) semantics using the returned index; keep knownProviderPrefixes and the rest unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@autorouter.go`:
- Around line 95-102: The code currently calls json.Marshal(raw) and discards
the error, which can let a failed marshal result in forwarding the original body
with the unstripped provider prefix; change this so you check the error return
from json.Marshal(raw), only assign body and update model when err == nil, and
propagate or return the marshal error (or log and return an upstream error)
instead of ignoring it; update the block around
stripProviderPrefix/raw/model/body (the json.Marshal call) to handle the error
(e.g., if err != nil { return fmt.Errorf("marshal modified request body: %w",
err) } ) so callers won't receive a stale body.
- Around line 173-199: AutoRouter.ServeHTTP commits headers too early: move the
io.ReadAll(resp.Body) call (and its error handling) to before calling
w.WriteHeader(resp.StatusCode), and only call w.WriteHeader(resp.StatusCode)
after successfully reading the body and setting any meta headers (e.g.,
X-Gateway-Cost from BillingResult). If io.ReadAll fails, write an error status
(e.g., http.StatusBadGateway or http.StatusInternalServerError) before any
headers are committed and return; also check and handle the error from
w.Write(body) (log or return) so write failures aren't ignored. Ensure resp.Body
is still closed via the existing defer and that header copying from resp.Header
remains intact.
---
Nitpick comments:
In `@autorouter.go`:
- Around line 211-242: Replace the custom indexOfSlash implementation with the
standard library call strings.Index in stripProviderPrefix: remove indexOfSlash
and call strings.Index(model, "/") instead, update imports to include "strings",
and ensure stripProviderPrefix still returns the same (stripped, hasPrefix)
semantics using the returned index; keep knownProviderPrefixes and the rest
unchanged.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 62ae91fd-5bf0-4d3c-8de8-f9ec9c2a5993
📒 Files selected for processing (4)
apitype.go, autorouter.go, detection_test.go, examples/basic/main.go
✅ Files skipped from review due to trivial changes (1)
- detection_test.go
🚧 Files skipped from review as they are similar to previous changes (1)
- apitype.go
📜 Review details
🔇 Additional comments (11)
examples/basic/main.go (4)
49-56: LGTM!

Provider collection pattern is clean. The slice approach simplifies the subsequent registration with AutoRouter.
60-149: LGTM!

Provider initialization follows a consistent pattern with appropriate error handling via `log.Fatalf` for creation failures.
200-221: LGTM!

AutoRouter configuration is well-structured. The interceptor ordering (retry → tracing → logging → metrics → headers) is sensible, and the fallback provider access is safe due to the earlier empty-providers check at line 151.
223-258: LGTM!

The single handler at "/" cleanly demonstrates the auto-routing capability, and the startup logging provides helpful guidance for users.
autorouter.go (7)
12-52: LGTM!

The `AutoRouter` struct and functional options pattern follow Go conventions. Default initialization with `NewRegistry()`, `DefaultProviderDetector`, and `http.DefaultClient` provides sensible defaults while allowing customization.
63-77: LGTM!

Reading the full body and extracting the model field is appropriate. Silently continuing on JSON unmarshal failure is acceptable since the body might not be JSON, and provider detection can fall back to header-based detection.
78-93: LGTM!

Provider detection logic correctly prioritizes the detected provider, then the fallback, with a clear sentinel error when no provider is available.
104-118: LGTM!

API type detection with path precedence over body detection is logical. The defensive `nil` check for `meta.Custom` before assignment prevents potential panics.
120-157: LGTM!

Upstream request construction is correct: URL resolution, header propagation, request enrichment, and context value attachment are all properly handled. The interceptor chain wrapping pattern is clean.
159-171: LGTM!

The `roundTrip` method correctly delegates to the HTTP client and extracts response metadata. Per the `ResponseExtractor` interface contract, the body is consumed and raw bytes returned for re-attachment.
201-209: LGTM!

The `ProviderError` type and `ErrNoProvider` sentinel provide clear error semantics for callers to identify when no provider is available.
- Handle json.Marshal error when stripping provider prefix
- Read response body before writing headers in ServeHTTP
- Fix provider detection precedence (X-Provider > model > headers)
- Use shared DetectProviderFromModel in billing interceptor
- Replace custom indexOfSlash with strings.Index
- Replace custom byteReader with bytes.NewReader
- Use DetectAPIType in MultiAPIExtractor for consistency
- Aggregate all output_text content in Responses extractor
- Validate role/content before appending messages in Responses parser
- Add FindProviderForModel to models.dev adapter
- Add ModelProviderLookup option to AutoRouter
- Update example to use models.dev for provider detection
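The ModelProviderLookup option mentioned here presumably follows the functional-options pattern the router already uses: an external model→provider table (e.g. models.dev) is consulted before falling back to name heuristics. A hypothetical sketch, with all names invented for illustration:

```go
package main

import "fmt"

// autoRouter is a minimal stand-in for the real AutoRouter.
type autoRouter struct {
	lookup func(model string) (provider string, ok bool)
}

type option func(*autoRouter)

// withModelProviderLookup installs an external model->provider lookup,
// mirroring the ModelProviderLookup option described in the commit.
func withModelProviderLookup(fn func(string) (string, bool)) option {
	return func(a *autoRouter) { a.lookup = fn }
}

func newAutoRouter(opts ...option) *autoRouter {
	a := &autoRouter{}
	for _, o := range opts {
		o(a)
	}
	return a
}

// provider consults the lookup first; the real router would fall back
// to name-pattern heuristics when the lookup misses.
func (a *autoRouter) provider(model string) string {
	if a.lookup != nil {
		if p, ok := a.lookup(model); ok {
			return p
		}
	}
	return "" // heuristic fallback elided in this sketch
}

func main() {
	table := map[string]string{"grok-2": "xai"} // stand-in for models.dev data
	r := newAutoRouter(withModelProviderLookup(func(m string) (string, bool) {
		p, ok := table[m]
		return p, ok
	}))
	fmt.Println(r.provider("grok-2"))
}
```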
Actionable comments posted: 6
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
interceptors/billing.go (1)
27-35: ⚠️ Potential issue | 🟠 Major
Billing should use the resolved provider, not the rewritten model.
`AutoRouter.Forward` strips provider prefixes before parsing the request body, so `meta.Model` loses explicit routing info. Requests like `azure/gpt-4o`, `fireworks/llama-3-70b`, or `X-Provider: azure` will be billed under the heuristic provider from the stripped model instead of the provider that actually handled the request.

💡 Suggested direction
```diff
-	// Use shared provider detection for consistency with routing
-	provider := llmproxy.DetectProviderFromModel(meta.Model)
+	provider, _ := meta.Custom["provider"].(string)
+	if provider == "" {
+		provider = llmproxy.DetectProviderFromModel(meta.Model)
+	}
```
`AutoRouter` should populate `meta.Custom["provider"]` with the resolved provider before the interceptor chain runs.

🤖 Prompt for AI Agents
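The suggested flow (router stamps the resolved provider; billing prefers it over re-deriving one) can be sketched with stand-in types; the real `RequestMetadata` lives in the llmproxy package:

```go
package main

import "fmt"

// RequestMetadata is a minimal stand-in for the metadata type referenced
// in the review thread.
type RequestMetadata struct {
	Model  string
	Custom map[string]any
}

// stampProvider records the provider the router actually resolved, so
// downstream interceptors can prefer it over a model-name heuristic.
func stampProvider(meta *RequestMetadata, provider string) {
	if meta.Custom == nil { // defensive init, as the review notes
		meta.Custom = map[string]any{}
	}
	meta.Custom["provider"] = provider
}

// billingProvider prefers the router-resolved provider and falls back to
// heuristic detection only when none was stamped.
func billingProvider(meta *RequestMetadata, detectFromModel func(string) string) string {
	if p, _ := meta.Custom["provider"].(string); p != "" {
		return p
	}
	return detectFromModel(meta.Model)
}

func main() {
	meta := &RequestMetadata{Model: "gpt-4o"} // prefix already stripped
	stampProvider(meta, "azure")
	got := billingProvider(meta, func(string) string { return "openai" })
	fmt.Println(got) // azure, not the heuristic "openai"
}
```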
Verify each finding against the current code and only fix it if needed. In `@interceptors/billing.go` around lines 27 - 35, The billing interceptor is using llmproxy.DetectProviderFromModel(meta.Model) which uses the rewritten model string and loses the actual resolved router; change provider resolution to prefer the router-populated value in meta.Custom["provider"] (use it if present and non-empty), falling back to llmproxy.DetectProviderFromModel(meta.Model) only if meta.Custom["provider"] is absent, and then call i.Lookup(provider, meta.Model) (with the same fallback to i.Lookup("", meta.Model) if not found) so billing uses the resolved provider; update references to provider in the billing.go interceptor accordingly.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@autorouter.go`:
- Around line 95-104: The code currently falls back to a.fallbackProvider when
a.registry.Get(providerName) returns nil even though providerName was explicitly
provided; change the logic so that when providerName != "" you try to get the
provider via a.registry.Get(providerName) and if that lookup fails (nil/false)
you immediately return ErrNoProvider instead of using a.fallbackProvider; only
use a.fallbackProvider when providerName is empty (i.e., detection failed).
Reference: providerName, a.registry.Get, provider, a.fallbackProvider,
ErrNoProvider.
In `@DESIGN.md`:
- Around line 99-106: Add documentation for the new AutoRouter option
WithAutoRouterModelProviderLookup to the Configuration options list: describe
its purpose (hook used to lookup model -> provider mapping for models.dev-backed
detection), its expected signature/type (a function that accepts model
identifier/context and returns a Provider or nil/error), and how it interacts
with existing options like WithAutoRouterDetector and
WithAutoRouterFallbackProvider; place this entry alongside the other options and
include a short usage note showing that supplying
WithAutoRouterModelProviderLookup enables models.dev-backed detection path for
AutoRouter.
- Around line 86-89: The fenced code block containing the signature sketch for
Forward(ctx, req) -> (resp, meta, err) and ServeHTTP(w, r) should include a
language tag to satisfy markdownlint MD040; update the triple-backtick fence to
use "text" (i.e., ```text) so the snippet is treated as plain text rather than
code.
In `@pricing/modelsdev/adapter.go`:
- Around line 192-206: FindProviderForModel currently returns "" when the cache
is nil, causing silent failures on cold start; mirror the lazy-load pattern in
Lookup by checking the same expiry condition (a.data == nil ||
time.Since(a.lastLoad) > a.ttl), releasing the read lock, calling a.Load(), then
re-acquiring the read lock before searching. Ensure you use the same error
handling/return behavior as Lookup when Load() fails, and preserve correct
locking (unlock before calling Load and re-lock afterward). Locate the
implementation in Adapter.FindProviderForModel and use the same fields/methods
referenced in Lookup (a.data, a.lastLoad, a.ttl, Load()).
In `@providers/openai_compatible/multiapi.go`:
- Around line 51-65: In MultiAPIExtractor.Extract, after io.ReadAll(resp.Body)
returns and before you overwrite resp.Body with
io.NopCloser(bytes.NewReader(body)), close the original resp.Body to avoid
leaking the upstream HTTP connection; call resp.Body.Close() (and optionally
handle/log its error) right after reading the body and before branching to
responsesExtractor.Extract or chatCompletionsExtractor.Extract so the original
transport is released.
- Around line 57-65: DetectAPIType currently inspects request-only fields and
will mislabel /v1/responses replies so responsesExtractor.Extract gets bypassed;
fix by changing the routing in this block to detect response payloads (inspect
response body for response-specific keys like "output", "status", or "usage") or
use a separate response-detection helper and branch on that before falling back
to DetectAPIType — update the apiType logic (variable apiType, function
llmproxy.DetectAPIType) so that when response-specific fields are present you
call e.responsesExtractor.Extract(resp) and otherwise call
e.chatCompletionsExtractor.Extract(resp).
---
Outside diff comments:
In `@interceptors/billing.go`:
- Around line 27-35: The billing interceptor is using
llmproxy.DetectProviderFromModel(meta.Model) which uses the rewritten model
string and loses the actual resolved router; change provider resolution to
prefer the router-populated value in meta.Custom["provider"] (use it if present
and non-empty), falling back to llmproxy.DetectProviderFromModel(meta.Model)
only if meta.Custom["provider"] is absent, and then call i.Lookup(provider,
meta.Model) (with the same fallback to i.Lookup("", meta.Model) if not found) so
billing uses the resolved provider; update references to provider in the
billing.go interceptor accordingly.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: d376346b-1808-4bc3-a894-90a91edcad2c
📒 Files selected for processing (11)
- DESIGN.md
- README.md
- autorouter.go
- detection.go
- examples/basic/main.go
- interceptors/billing.go
- interceptors/billing_test.go
- pricing/modelsdev/adapter.go
- providers/openai_compatible/multiapi.go
- providers/openai_compatible/responses_extractor.go
- providers/openai_compatible/responses_parser.go
✅ Files skipped from review due to trivial changes (1)
- README.md
🚧 Files skipped from review as they are similar to previous changes (1)
- examples/basic/main.go
📜 Review details
🧰 Additional context used
🪛 markdownlint-cli2 (0.22.0)
DESIGN.md
[warning] 86-86: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
[warning] 335-335: Tables should be surrounded by blank lines
(MD058, blanks-around-tables)
**Configuration options:**

- `WithAutoRouterRegistry(r)` — Use custom registry
- `WithAutoRouterDetector(d)` — Custom provider detection logic
- `WithAutoRouterInterceptor(i)` — Add interceptor to chain
- `WithAutoRouterHTTPClient(c)` — Custom HTTP client
- `WithAutoRouterFallbackProvider(p)` — Provider when detection fails
Document WithAutoRouterModelProviderLookup.
The AutoRouter option list omits the new model-provider lookup hook, so the models.dev-backed detection path added in this PR is not discoverable from the public design doc.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@DESIGN.md` around lines 99 - 106, Add documentation for the new AutoRouter
option WithAutoRouterModelProviderLookup to the Configuration options list:
describe its purpose (hook used to lookup model -> provider mapping for
models.dev-backed detection), its expected signature/type (a function that
accepts model identifier/context and returns a Provider or nil/error), and how
it interacts with existing options like WithAutoRouterDetector and
WithAutoRouterFallbackProvider; place this entry alongside the other options and
include a short usage note showing that supplying
WithAutoRouterModelProviderLookup enables models.dev-backed detection path for
AutoRouter.
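The lookup hook being documented is essentially a model → provider function; a map-backed sketch of its likely shape (the exact signature in the library may differ) stands in for the models.dev adapter's `FindProviderForModel`:

```go
package main

import "fmt"

// ModelProviderLookup mirrors the hook's likely shape: given a model ID,
// return the owning provider, or "" when unknown.
type ModelProviderLookup func(model string) string

// mapLookup builds a lookup from a static index; the real adapter consults
// the models.dev catalog instead.
func mapLookup(index map[string]string) ModelProviderLookup {
	return func(model string) string { return index[model] }
}

func main() {
	lookup := mapLookup(map[string]string{
		"gpt-4o":        "openai",
		"claude-3-opus": "anthropic",
	})
	fmt.Println(lookup("claude-3-opus")) // anthropic
	fmt.Println(lookup("unknown-model")) // empty: detection falls back
}
```

Returning the empty string for unknown models lets the router fall through to pattern-based detection or the configured fallback provider.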
```go
func (a *Adapter) FindProviderForModel(model string) string {
	a.mu.RLock()
	defer a.mu.RUnlock()

	if a.data == nil {
		return ""
	}

	for providerID, provider := range a.data {
		if _, exists := provider.Models[model]; exists {
			return providerID
		}
	}
	return ""
}
```
🧩 Analysis chain
🏁 Script executed:
# First, check the Adapter struct definition and Lookup method
cd pricing/modelsdev
cat -n adapter.go | head -100
Repository: agentuity/llmproxy
Length of output: 3349
🏁 Script executed:
# Find the Lookup method to understand the lazy-load pattern
cat -n pricing/modelsdev/adapter.go | sed -n '158,187p'
Repository: agentuity/llmproxy
Length of output: 960
🏁 Script executed:
# Search for how FindProviderForModel is used
rg -n "FindProviderForModel" --type go
Repository: agentuity/llmproxy
Length of output: 518
🏁 Script executed:
# Check the AutoRouter integration mentioned in the review
rg -n -A 5 "FindProviderForModel" pricing/autorouter.go
Repository: agentuity/llmproxy
Length of output: 123
🏁 Script executed:
find . -name "*autorouter*" -o -name "*auto_router*"
Repository: agentuity/llmproxy
Length of output: 98
🏁 Script executed:
# Check the examples/basic/main.go to understand how FindProviderForModel is used
rg -n -B 5 -A 5 "FindProviderForModel" examples/basic/main.go
Repository: agentuity/llmproxy
Length of output: 1213
🏁 Script executed:
# Search for AutoRouter type/interface to understand the contract
rg -n "type AutoRouter" --type go
Repository: agentuity/llmproxy
Length of output: 160
🏁 Script executed:
# Check if there are any tests for FindProviderForModel
rg -n "FindProviderForModel" --type go
Repository: agentuity/llmproxy
Length of output: 518
🏁 Script executed:
# Check the autorouter.go file to see how modelProviderLookup is used
cat -n autorouter.go | head -150
Repository: agentuity/llmproxy
Length of output: 4895
🏁 Script executed:
# Look for where modelProviderLookup is called
rg -n "modelProviderLookup" autorouter.go -A 3 -B 3
Repository: agentuity/llmproxy
Length of output: 844
🏁 Script executed:
# Check the actual function signature and see what the callback signature is
rg -n "modelProviderLookup\s*=" autorouter.go -B 10
Repository: agentuity/llmproxy
Length of output: 482
🏁 Script executed:
# Check the ModelProviderLookup type definition
rg -n "type ModelProviderLookup" --type go
Repository: agentuity/llmproxy
Length of output: 128
🏁 Script executed:
# Look for any tests of FindProviderForModel
find . -name "*test*.go" -exec rg -l "FindProviderForModel" {} \;
Repository: agentuity/llmproxy
Length of output: 44
🏁 Script executed:
# Check if Lookup is called anywhere and how it handles expiry
rg -n "\.Lookup\(" --type go -B 2 -A 2
Repository: agentuity/llmproxy
Length of output: 480
🏁 Script executed:
# Double-check: does Lookup have any tests showing sorting expectations?
rg -n "Lookup" pricing/modelsdev/adapter_test.go -B 3 -A 3
Repository: agentuity/llmproxy
Length of output: 135
🏁 Script executed:
# Check if there are any comments or docs about deterministic provider selection
rg -n "deterministic|order|sort" pricing/modelsdev/ --type go
Repository: agentuity/llmproxy
Length of output: 44
Add lazy-load mechanism to avoid silent failures on cold start.
FindProviderForModel returns "" immediately when the cache is not loaded (line 196), unlike Lookup which checks for expiry and calls Load() before searching. This causes AutoRouter's model lookup to silently fail on cold start unless Load() was called separately. Follow the same lazy-load pattern as Lookup (lines 158–165).
Suggested fix
```diff
 func (a *Adapter) FindProviderForModel(model string) string {
+	if model == "" {
+		return ""
+	}
+
 	a.mu.RLock()
-	defer a.mu.RUnlock()
-
-	if a.data == nil {
+	needLoad := len(a.data) == 0 || (!a.expires.IsZero() && time.Now().After(a.expires))
+	a.mu.RUnlock()
+	if needLoad {
+		_ = a.Load(context.Background())
+	}
+
+	a.mu.RLock()
+	defer a.mu.RUnlock()
+
+	for providerID, provider := range a.data {
+		if _, exists := provider.Models[model]; exists {
+			return providerID
+		}
 	}
-
-	for providerID, provider := range a.data {
-		if _, exists := provider.Models[model]; exists {
-			return providerID
-		}
-	}
 	return ""
 }
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
Before:

```go
func (a *Adapter) FindProviderForModel(model string) string {
	a.mu.RLock()
	defer a.mu.RUnlock()
	if a.data == nil {
		return ""
	}
	for providerID, provider := range a.data {
		if _, exists := provider.Models[model]; exists {
			return providerID
		}
	}
	return ""
}
```

After:

```go
func (a *Adapter) FindProviderForModel(model string) string {
	if model == "" {
		return ""
	}
	a.mu.RLock()
	needLoad := len(a.data) == 0 || (!a.expires.IsZero() && time.Now().After(a.expires))
	a.mu.RUnlock()
	if needLoad {
		_ = a.Load(context.Background())
	}
	a.mu.RLock()
	defer a.mu.RUnlock()
	for providerID, provider := range a.data {
		if _, exists := provider.Models[model]; exists {
			return providerID
		}
	}
	return ""
}
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@pricing/modelsdev/adapter.go` around lines 192 - 206, FindProviderForModel
currently returns "" when the cache is nil, causing silent failures on cold
start; mirror the lazy-load pattern in Lookup by checking the same expiry
condition (a.data == nil || time.Since(a.lastLoad) > a.ttl), releasing the read
lock, calling a.Load(), then re-acquiring the read lock before searching. Ensure
you use the same error handling/return behavior as Lookup when Load() fails, and
preserve correct locking (unlock before calling Load and re-lock afterward).
Locate the implementation in Adapter.FindProviderForModel and use the same
fields/methods referenced in Lookup (a.data, a.lastLoad, a.ttl, Load()).
- Return ErrNoProvider when explicit provider lookup fails (don't use fallback)
- Store resolved provider in meta.Custom for billing interceptor
- Add language tag to code fence in DESIGN.md (MD040)
- Document WithAutoRouterModelProviderLookup in DESIGN.md
- Add lazy-load support to FindProviderForModel (matching Lookup behavior)
- Close original response body before restoring in MultiAPIExtractor
- Detect Responses API by response-specific fields (output/status) not request fields
- Prefer router-resolved provider in billing interceptor over model detection
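The "detect by response-specific fields" change in this commit can be sketched as a key probe; `isResponsesPayload` is an illustrative name and the key set here follows the review's suggestion (`output`/`status`), not a confirmed library function:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// isResponsesPayload checks for keys that Responses API replies carry but
// chat-completions replies do not, instead of inspecting request-only fields.
func isResponsesPayload(body []byte) bool {
	var m map[string]json.RawMessage
	if err := json.Unmarshal(body, &m); err != nil {
		return false
	}
	_, hasOutput := m["output"]
	_, hasStatus := m["status"]
	return hasOutput || hasStatus
}

func main() {
	responses := []byte(`{"id":"resp_1","status":"completed","output":[]}`)
	chat := []byte(`{"id":"chatcmpl-1","choices":[{"message":{"content":"hi"}}]}`)
	fmt.Println(isResponsesPayload(responses)) // true
	fmt.Println(isResponsesPayload(chat))      // false
}
```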
Summary
This PR adds automatic routing capabilities and OpenAI Responses API support:
- OpenAI Responses API support (`/v1/responses`)
- Provider detection from model name patterns and the `X-Provider` header
- Model names like `openai/gpt-4` are stripped to `gpt-4` before forwarding

Changes
New Features
- `AutoRouter` - HTTP handler that routes to the correct provider based on:
  - `X-Provider` header (explicit override)
  - Model name patterns (`gpt-*` → OpenAI, `claude-*` → Anthropic, etc.)
  - Provider prefixes in model names (e.g. `anthropic/claude-3-opus`)
- OpenAI Responses API support (`/v1/responses`) with parser, extractor, resolver

Improvements
Usage
Test plan
Summary by CodeRabbit
New Features
Improvements
Documentation
Tests