feat: [OpenAI] PoC - Responses API support with OpenAI SDK Adapter#794
Conversation
```xml
</dependency>
<dependency>
  <groupId>com.openai</groupId>
  <artifactId>openai-java-core</artifactId>
```
(Major)
I would argue for either a new module, or setting this dependency as optional.
Why? We are going to deprecate the 2024 generated API.
I have marked it optional
I want to highlight a current API limitation:

```java
import com.sap.ai.sdk.foundationmodels.openai.OpenAiModel;
import com.openai.models.ChatModel;

// Get the client for a deployment by model name and version
OpenAiModel ourAiModel = OpenAiModel.GPT_5;
OpenAiClient client = AiCoreOpenAiClient.forModel(ourAiModel);

// Supply the model again for the request payload. Throws without a model.
ChatModel openAiModel = ChatModel.GPT_5;
var request = ResponseCreateParams.builder().input(input).model(openAiModel).build();
```

Two sources of truth.
With the current API behaviour, the model of the selected deployment takes precedence over the one in the payload. But this behaviour is not apparent to the user.
```java
 * @throws DeploymentResolutionException If no running deployment is found for the model.
 */
@Nonnull
public static OpenAIClient forModel(@Nonnull final AiModel model) {
```
You could modify the client to not be instantiated but instead be created at the request level and cached.
No strong preference, but it is a better API.
Yes. This is one of the options, along with a few others, each with important caveats:
- **Sniffing request**: As Charles mentioned, we can parse the body in `HttpRequest` back to a `JsonNode` or request type to infer the model to fetch a deployment for, at request time.
  - As you can imagine, this means deserializing an already serialized request.
  - You will only find the model in `create()` calls, but not in `retrieve()`, `delete()` or any other operation. How should we fetch a deployment then? Just any deployment under the `foundation-models` scenario?
  - At request time, we can't reliably infer the version out of values like "gpt-5-nano", "gpt-5.2", "o3-2025-04-16". AI Core expects distinct fields for model name and model version to match with a deployment. We would have to rework our deployment resolution logic.

- **Wrapper API**: We draft our own wrapper instead of directly returning an object of `com.openai.client.OpenAIClient`:

  ```java
  // Our wrapper client
  AiCoreBoundOpenAiClient client = AiCoreOpenAiClient.forModel(OpenAiModel.GPT_41);

  ResponseCreateParams params =
      ResponseCreateParams.builder()
          .input("Hello")
          // .model(...) is optional. We inject or validate for match
          .build();

  Response response = client.responses().create(params);
  ```

  ```java
  public interface AiCoreBoundOpenAiClient {
    AiCoreResponsesService responses();
    AiCoreChatCompletionsService chatCompletions();
    OpenAIClient raw(); // escape hatch
  }
  ```

  Basically, we inject the model into `params`, or validate an existing model for a match with the one in the deployment within our wrapper API. The maintenance burden is much higher, but we would be able to actively choose the UX.
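For illustration only, the body-sniffing idea from the first option could look roughly like the sketch below. `ModelSniffer` and its regex-based parsing are hypothetical simplifications; a real implementation would deserialize the body with Jackson, and would still hit the caveats above for bodyless operations like `retrieve()`:

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch of the "sniffing" option: peek into the already
// serialized request body to recover the model name at request time.
class ModelSniffer {
  // A real implementation would parse the JSON with Jackson instead of a regex.
  private static final Pattern MODEL_FIELD =
      Pattern.compile("\"model\"\\s*:\\s*\"([^\"]+)\"");

  static Optional<String> extractModel(String serializedBody) {
    Matcher m = MODEL_FIELD.matcher(serializedBody);
    return m.find() ? Optional.of(m.group(1)) : Optional.empty();
  }
}
```

A `create()` body like `{"model":"gpt-5","input":"Hi"}` yields the model name, while an empty body yields an empty `Optional`, which is exactly the second caveat above.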
```java
final ClientOptions clientOptions =
    ClientOptions.builder().baseUrl(baseUrl).httpClient(httpClient).apiKey("unused").build();
```
I found a way to propagate the model information to the request body.
But it's super ugly :( and you would need to find a way to pass on model information.
(View code suggestion)

```java
final var m = new SimpleModule() {{
  setSerializerModifier(new BeanSerializerModifier() {
    @Override
    @SuppressWarnings("unchecked")
    public JsonSerializer<?> modifySerializer(
        SerializationConfig config, BeanDescription desc, JsonSerializer<?> serializer) {
      if (!ResponseCreateParams.Body.class.isAssignableFrom(desc.getBeanClass()))
        return serializer;
      final var typed = (JsonSerializer<ResponseCreateParams.Body>) serializer;
      return new StdSerializer<>(ResponseCreateParams.Body.class) {
        @Override
        public void serialize(
            ResponseCreateParams.Body value, JsonGenerator gen, SerializerProvider provider)
            throws IOException {
          final var buf = new TokenBuffer(gen.getCodec(), false);
          typed.serialize(value, buf, provider);
          final ObjectNode node = gen.getCodec().readTree(buf.asParser());
          if (!node.has("model")) node.put("model", "gpt-5");
          gen.writeTree(node);
        }
      };
    }
  });
}};

final ClientOptions clientOptions =
    ClientOptions.builder().baseUrl(baseUrl).httpClient(httpClient).apiKey("unused")
        .jsonMapper((JsonMapper) jsonMapper().registerModule(m))
        .build();
```
I will try this out and get back to you.
I tried to make it work with a mixin, without success.
I am suggesting we accept the limitation. We could, of course, get clever with propagating the model info into the request body for the Responses API, since the model is optional in its request builder API. But for Chat Completions, the request builder throws right away if a model is not provided, so we can't really provide any convenience there.

Assuming we will deprecate the old OpenAI client we have (as per @CharlesDuboisSAP's suggestion), we can keep the current state for consistency, and also add documentation clarifying which model will be authoritative.
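To make the "validate for match" alternative concrete, here is a minimal sketch; `ModelValidator` and `resolveModel` are hypothetical names, and a real implementation would work with the SDK's deployment and params types rather than plain strings:

```java
import java.util.Optional;

// Hypothetical sketch of the "validate for match" idea: compare the model in
// the request payload (if any) against the model of the resolved deployment.
class ModelValidator {
  static String resolveModel(String deploymentModel, Optional<String> payloadModel) {
    if (payloadModel.isPresent() && !payloadModel.get().equals(deploymentModel)) {
      throw new IllegalArgumentException(
          "Payload model '" + payloadModel.get()
              + "' does not match deployment model '" + deploymentModel + "'");
    }
    // Missing payload model: fall back to the deployment's model as the
    // single source of truth.
    return deploymentModel;
  }
}
```

This would make the precedence explicit: a matching or absent payload model resolves to the deployment's model, while a mismatch fails fast instead of being silently overridden.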
Force-pushed ccca300 to 127bc45
```java
 * @return A configured OpenAI client instance.
 */
@Nonnull
@SuppressWarnings("PMD.CloseResource")
```
There is no test that makes sure we don't leave resources open and don't have a memory leak.
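As a rough illustration of the kind of test being asked for, the sketch below tracks whether a closeable client is actually released; `FakeHttpClient` and `LeakCheck` are hypothetical stand-ins, and a real test would exercise the actual HTTP client the adapter wraps:

```java
// Hypothetical sketch: track whether a client's underlying resource is closed.
class FakeHttpClient implements AutoCloseable {
  boolean closed = false;

  @Override
  public void close() {
    closed = true;
  }
}

class LeakCheck {
  // Simulates "use the client, then verify it released its resources".
  static boolean usedAndClosed() {
    FakeHttpClient client = new FakeHttpClient();
    try (client) {
      // ... perform a request with the client ...
    }
    return client.closed;
  }
}
```

A real version of this would likely also assert that repeated client creation does not accumulate open connections.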
```java
val params =
    ResponseCreateParams.builder().input(input).model(ChatModel.GPT_5).store(false).build();
return CLIENT.responses().create(params);
}

/**
 * Create a response and immediately retrieve it using the Responses API. This demonstrates the
 * two-step process of creating and then fetching a response.
 *
 * @param input the input text to send to the model
 * @return the retrieved response object from the Responses API
 */
@Nonnull
public Response retrieveResponse(@Nonnull final String input) {
  // Create a non-persistent response with store=false
```
Suggested change:

```java
// Create a non-persistent response with store=false
val params =
    ResponseCreateParams.builder().input(input).model(ChatModel.GPT_5).store(false).build();
return CLIENT.responses().create(params);
}

/**
 * Create a response and immediately retrieve it using the Responses API. This demonstrates the
 * two-step process of creating and then fetching a response.
 *
 * @param input the input text to send to the model
 * @return the retrieved response object from the Responses API
 */
@Nonnull
public Response retrieveResponse(@Nonnull final String input) {
```
The comment only makes sense next to the function that it refers to.
```java
  assertThat(response.output().get(1).message().get().content().get(0).asOutputText().text())
      .contains("Paris");
}

@Test
@Disabled("Flaky test")
void testGetResponse() {
  final var response = service.retrieveResponse("What is the capital of France?");
  assertThat(response).isNotNull();
  assertThat(response.output()).isNotNull();
  assertThat(response.output().get(1).message().get().content().get(0).asOutputText().text())
      .contains("Paris");
```
Suggested change:

```java
  val message = response.output().get(1).message();
  assertThat(message.isPresent()).isTrue();
  assertThat(message.get().content().get(0).asOutputText().text()).contains("Paris");
}

@Test
@Disabled("Flaky test")
void testGetResponse() {
  final var response = service.retrieveResponse("What is the capital of France?");
  assertThat(response).isNotNull();
  assertThat(response.output()).isNotNull();
  val message = response.output().get(1).message();
  assertThat(message.isPresent()).isTrue();
  assertThat(message.get().content().get(0).asOutputText().text()).contains("Paris");
}
```
removes 2 warnings
Commits:

- Minimal new module setup including spec
- Generation partial-success
- Remove examples
- Successfully filter by path
- Attach spec filter command
- Initial setup
- Successful PoC with OpenAI Models
- Version 1
- Stable api
- Change class name
- Add tests
- fix dependency analyse issues
- Initial draft untested
- Second draft - Streaming no fully enabled
- Successful E2E
- Streaming initial draft
- Streaming E2E with chat completion
- isStreaming check simplified
- Cleanup PoC and rename module
- Reduce Javadoc verbosity
- Restrict to `/responses` api
- Cleanup comments
- Charles review suggestions
- Charles review - round 2 suggestions
- Add dependency
- Mark openai dependency optional and new client `@Beta`
- Cleanup and no throw on missing model
- pmd
- Responses API complete
- ChatCompletionCreateParams throws without model. Needs rethink client API creation forModel
- Cleanup and close with test documenting limitation
- First draft responses only
- jacoco limits
- Remove ResponseService wrapper since remote API behaviour changed
Context

AI/ai-sdk-java-backlog#364.

This PoC provides an adapter for official OpenAI SDK integration with our SDK by implementing `com.openai.core.http.HttpClient`. You can find out more about the OpenAI-recommended approach in their docs.

Feature scope:

- `AiCoreOpenAiClient` that can work as an adapter for any OpenAI endpoints
- Easy adoption of any OpenAI endpoints (e.g. `/realtime`) that are/will be supported by AI Core, via `AiCoreHttpClientImpl`
- `/response` (for now)

Usage

Pros

- Works as an adapter for any OpenAI endpoint (`/chat/completion`, `/realtime`, etc.)

Cons

- Two sources of truth for the model: `com.sap.ai.sdk.foundationmodels.openai.OpenAiModel` used to select available models in AI Core, and `com.openai.models.ChatModel` for request payload configuration in the OpenAI SDK

Definition of Done

- Aligned changes with the JavaScript SDK