
openai response API #3336

Open

kinjalh wants to merge 1 commit into master from openai-responses

Conversation

@kinjalh
Member

@kinjalh kinjalh commented Mar 17, 2026

Test Plan

Checklist

  • If required, I have updated the Plural documentation accordingly.
  • I have added tests to cover my changes.
  • I have added a meaningful title and summary to convey the impact of this PR to a user.

Plural Flow: console

@kinjalh kinjalh added the enhancement New feature or request label Mar 17, 2026
@kinjalh kinjalh marked this pull request as ready for review March 17, 2026 15:43
@greptile-apps
Contributor

greptile-apps bot commented Mar 17, 2026

Greptile Summary

This PR adds support for OpenAI's Responses API as an alternative backend for newer models (gpt-5, codex, o3, o4), routing them through /v1/responses instead of /v1/chat/completions. It introduces new response structs, a use_responses_api?/1 dispatch predicate, streaming event handling for the Responses API SSE format, and improves the tool-call accumulation layer to work with both API formats.

Key changes:

  • completion/3 now dispatches to completion_via_chat or completion_via_responses based on the model name
  • New handle_openai_responses/1 streaming handler processes response.output_text.delta and response.function_call_arguments.delta SSE events
  • Result.Tool gains an update/2 function to cleanly accumulate partial tool-call data across streaming chunks
  • Two P1 bugs found: extract_responses_output/1 will raise FunctionClauseError on a nil output (no fallback clause), and tool_call/4 bypasses the new routing entirely, meaning newer models that require the Responses API will receive an error when invoked via tool_call
  • Unused variables n and args in extract_instructions_and_input will produce compiler warnings
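To make the accumulation point concrete, here is a hypothetical sketch of what a Tool.update/2 accumulator could look like — the struct fields and delta-map keys are assumptions for illustration, not the PR's actual definitions:

```elixir
defmodule Result.Tool do
  # Hypothetical field names; the real struct in this PR may differ.
  defstruct id: nil, name: nil, arguments: ""

  # Merge one streaming delta into the accumulated tool call:
  # keep the first non-nil id/name seen, append argument fragments.
  def update(%__MODULE__{} = tool, delta) do
    %__MODULE__{tool |
      id: tool.id || delta[:id],
      name: tool.name || delta[:name],
      arguments: tool.arguments <> (delta[:arguments] || "")
    }
  end
end
```

The appeal of this shape is that both the Chat Completions and Responses API streaming formats can be normalized into the same delta map before calling update/2.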

Confidence Score: 2/5

  • Not safe to merge — two P1 logic bugs can cause runtime crashes for newer OpenAI models.
  • The core streaming path and Result accumulator look solid, but extract_responses_output lacks a nil-guard that will crash on any API response that omits the output field, and tool_call completely bypasses the new routing so tool-use requests for o3/o4/gpt-5/codex will hit the wrong endpoint.
  • lib/console/ai/provider/openai.ex — both P1 bugs are in this file; pay close attention to extract_responses_output/1 and tool_call/4.

Important Files Changed

Filename Overview
lib/console/ai/provider/openai.ex Adds Responses API support for newer OpenAI models (gpt-5, codex, o3, o4) via new structs, routing logic, and request/response helpers. Has three issues: extract_responses_output can raise FunctionClauseError when output is nil, tool_call/4 doesn't route newer models through the Responses API (only completion/3 does), and unused variables n/args will produce compiler warnings.
lib/console/ai/stream/exec.ex Adds handle_openai_responses/1 for streaming the Responses API SSE events; also adds an empty-string guard to handle_openai. Handler functions changed from defp to def to support direct unit testing, which exposes implementation details unnecessarily.
lib/console/ai/stream/result.ex Adds Tool.update/2 and guards to Tool.args/2; refactors Result.tool/3 to use update for accumulating streaming chunks from both Chat Completions and Responses API formats. Changes are clean and correct.
test/console/ai/stream/exec_test.exs New test file providing good unit coverage of all three handler functions. One test ("accumulates function call arguments across multiple deltas") is misleadingly named — it only verifies that each individual delta chunk returns the right value; actual accumulation is handled by Result.tool/3 elsewhere.

Comments Outside Diff (1)

  1. lib/console/ai/provider/openai.ex, line 204-215 (link)

    P1 tool_call/4 bypasses the Responses API routing for newer models

    tool_call hardcodes chat/3 (Chat Completions endpoint) for every model, including o3, o4, gpt-5, and codex — the same models that completion/3 routes through the Responses API. If those models have dropped support for the Chat Completions endpoint, tool_call will receive an API error at runtime.

    The routing logic added to completion/3 (use_responses_api?/1) should also be applied inside tool_call so newer models use responses/3 with tool_choice: "required" and require_tools: true.
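A minimal sketch of that fix, assuming the use_responses_api?/1, responses/3, and chat/3 signatures implied by this review (the exact arities and option names are assumptions):

```elixir
# Hedged sketch: mirror completion/3's dispatch inside tool_call so
# newer models (o3, o4, gpt-5, codex) hit the Responses endpoint.
defp tool_call(model, msgs, tools) do
  if use_responses_api?(model),
    do: responses(msgs, tools, tool_choice: "required", require_tools: true),
    else: chat(msgs, tools)
end
```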

Last reviewed commit: 888448d

Comment on lines +165 to +190
defp extract_responses_output(output) when is_list(output) do
{text_parts, tool_calls} = Enum.reduce(output, {[], []}, fn item, {texts, tools} ->
case item do
%ResponsesOutputItem{type: "message", content: content} when is_list(content) ->
text = Enum.map_join(content, "", fn
%ResponsesContentPart{type: "output_text", text: t} when is_binary(t) -> t
_ -> ""
end)
{[text | texts], tools}

%ResponsesOutputItem{type: "function_call", id: id, name: name, arguments: args} ->
tool = %Console.AI.Tool{id: id, name: name, arguments: safe_decode_args(args)}
{texts, [tool | tools]}

_ ->
{texts, tools}
end
end)

content = text_parts |> Enum.reverse() |> Enum.join("")

case Enum.reverse(tool_calls) do
[] -> {:ok, content}
tools -> {:ok, content, tools}
end
end

P1 FunctionClauseError when output is nil

extract_responses_output/1 has an is_list(output) guard but ResponsesResponse.output defaults to nil (no default in the struct). When Poison.decode! deserializes a response with a missing output field, the struct will have output: nil. The pattern match in completion_via_responses will still bind output (it's not a guard), and then extract_responses_output(nil) will raise a FunctionClauseError because no clause matches a non-list value.

A defensive fallback clause should be added:

defp extract_responses_output(nil), do: {:error, "could not generate an ai completion for this context"}
defp extract_responses_output(output) when is_list(output) do
  # existing body
end

end

input = Enum.flat_map(other_msgs, fn
{:tool, msg, %{call_id: id, name: n, arguments: args}} when is_binary(id) ->

P2 Unused variables will generate compiler warnings

n and args are bound from the pattern match but never referenced in the function body. Elixir will emit variable "n" is unused and variable "args" is unused warnings at compile time. Prefix them with _ to silence the warnings:

Suggested change
{:tool, msg, %{call_id: id, name: n, arguments: args}} when is_binary(id) ->
{:tool, msg, %{call_id: id, name: _n, arguments: _args}} when is_binary(id) ->

Comment on lines +13 to +19
def handle_openai(%{"choices" => [%{"delta" => %{"content" => c}} | _]}) when is_binary(c) and c != "", do: c
def handle_openai(%{"choices" => [%{"delta" => %{"tool_calls" => [_ | _] = calls}}]}) do
{:tools, Enum.map(calls, fn %{"index" => ind, "function" => call} = func ->
{ind, Map.put(call, "id", func["id"])}
end)}
end
-defp handle_openai(_), do: :pass
+def handle_openai(_), do: :pass

P2 Private implementation details exposed as public API to accommodate tests

handle_openai/1 and handle_anthropic/1 were previously defp and are now promoted to def solely so that the unit tests in exec_test.exs can call them directly. The test file itself acknowledges this trade-off in a comment. Exposing internal handler functions as public API makes them part of the module's contract and harder to change later.

Consider using @doc false or testing these indirectly through the higher-level openai/2, anthropic/2, and openai_responses/2 functions instead, or extracting the handlers into a dedicated testable sub-module.
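One of those options sketched, assuming the handler heads shown in the hunk above: keep the functions public for the tests but mark them as internal so they are excluded from generated docs and clearly signaled as non-contract.

```elixir
# @doc false keeps the function callable from exec_test.exs while
# excluding it from ExDoc output and flagging it as internal-only.
@doc false
def handle_openai(%{"choices" => [%{"delta" => %{"content" => c}} | _]})
    when is_binary(c) and c != "", do: c
```

The sub-module alternative (e.g. a dedicated handlers module with its own test file) is the cleaner long-term shape, but @doc false is the smallest diff.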

@michaeljguarino
Member

This actually shouldn't be needed with the ReqLLM pr, which is going a lot faster than I anticipated
