Begin ReqLLM migration#3338

Merged
michaeljguarino merged 4 commits into master from implement-req-llm
Mar 20, 2026
Conversation

@michaeljguarino
Member

This reimplements our AI provider behaviors using ReqLLM to offload response parsing and other annoyances. It will require a decent amount of e2e testing before merge, but it is ultimately the better solution here.

Test Plan

unit tests updated

Checklist

  • If required, I have updated the Plural documentation accordingly.
  • I have added tests to cover my changes.
  • I have added a meaningful title and summary to convey the impact of this PR to a user.

Plural Flow: console

@michaeljguarino michaeljguarino requested a review from a team on March 17, 2026 at 19:26
@michaeljguarino michaeljguarino added the enhancement (New feature or request) label on Mar 17, 2026
@greptile-apps
Contributor

greptile-apps bot commented Mar 17, 2026

Greptile Summary

This PR begins a migration of all AI provider implementations (OpenAI, Anthropic, Azure, Bedrock, Vertex) from hand-rolled HTTPoison/Poison HTTP handling to the req_llm ~> 1.7 library. All five providers are rewritten to use a shared Console.AI.Provider.Base adapter, and context window sizes are now dynamically resolved from LLMDB rather than hardcoded per model. A new OpenAiMethod enum (CHAT / RESPONSES / AUTO) is added to the OpenAI settings to control which wire protocol ReqLLM uses, and the Kubernetes CRD / Go types are updated accordingly.

Key items to verify before merging:

  • Logic bug in openai.ex: guess_protocol(model, method) (line 91) receives the atom field key (:model, :tool_model) rather than the resolved model string. The gpt-5 auto-detection clause ("gpt-5" <> _) can therefore never match when method is :auto and a custom base_url is set, so gpt-5 models with custom endpoints will silently use the chat completions API instead of the Responses API.
  • Commented-out Nexus code in deploymentsettings_types.go (lines 719–730): The attribute-mapping block was commented out but NexusSettings struct, its CRD schema entry, and the DeepCopyInto generated code all remain active. The CRD will accept nexus configuration that is never forwarded to the API.
  • Typo in azure.ex: noramlize_url should be normalize_url.
  • "chart completions" typo propagated through descriptions in settings.ex, schema.graphql, graphql.ts, models_gen.go, and the CRD YAML — should read "chat completions".
  • Breaking default for Vertex: Default model changed from gemini-2.5-flash to claude-sonnet-4-5@20250929; users relying on the Vertex default without explicit model configuration will silently switch LLM families.
  • The PR description notes that significant e2e testing is still needed before merge, which is reflected in the confidence score.

Confidence Score: 2/5

  • Not safe to merge yet — confirmed logic bug in OpenAI protocol detection plus incomplete Nexus removal and e2e testing still pending per the author's own checklist.
  • Score reflects: (1) a confirmed logic bug where guess_protocol never receives the model string for custom base URL setups, causing wrong API protocol selection for gpt-5; (2) commented-out Nexus code leaving the controller in an inconsistent state; (3) a potentially breaking default model change for Vertex; and (4) the author explicitly acknowledges e2e testing is incomplete.
  • lib/console/ai/provider/openai.ex (protocol detection bug), go/controller/api/v1alpha1/deploymentsettings_types.go (half-removed Nexus), and lib/console/ai/provider/vertex.ex (silent default model change).

Important Files Changed

Filename Overview
lib/console/ai/provider/openai.ex Core OpenAI provider rewritten to use ReqLLM; contains a logic bug where guess_protocol receives an atom field key instead of the actual model string, breaking auto-detection of gpt-5 → Responses API for custom base URLs.
lib/console/ai/provider/azure.ex Azure provider rewritten to use ReqLLM directly; proxy now returns an error (breaking change), stream support added, and a typo exists in the private noramlize_url function name.
lib/console/ai/provider/base.ex New shared ReqLLM adapter layer; handles message conversion, tool wrapping, result parsing, and streaming callback. Logic is straightforward and consistent across providers.
lib/console/ai/provider/anthropic.ex Anthropic provider dramatically simplified via ReqLLM; default model bumped to claude-4-5-sonnet-latest (from claude-3-5-sonnet-latest) and context window now dynamically looked up from LLMDB instead of being hardcoded.
lib/console/ai/provider/vertex.ex Vertex provider rewritten to use ReqLLM with google-vertex: prefix; default model changed from gemini-2.5-flash to claude-sonnet-4-5@20250929, which is a potentially breaking default for existing Vertex users.
lib/console/ai/provider/bedrock.ex Bedrock provider rewritten to use ReqLLM with bedrock: prefix; AWS credentials now handled explicitly or via ExAws fallback; proxy removed.
go/controller/api/v1alpha1/deploymentsettings_types.go New OpenAISettings type added with Method field; Nexus attribute-mapping code is commented out but NexusSettings struct and its CRD schema entry remain active, creating an inconsistent half-removed state.
lib/console/ai/stream.ex New publish/1 callback added for ReqLLM streaming; index field added to track position within a message. Logic is correct assuming the stream process state is always initialized before use.
lib/console/ai/provider.ex context_window/1 now accepts a client argument (:default or :tool) and keys the cache accordingly, allowing different window sizes for different client types.
lib/console/schema/deployment_settings.ex Adds OpenAIMethod enum and method field to the OpenAI settings embedded schema; dedicated openai_changeset replaces the generic ai_api_changeset.
mix.exs Adds req_llm ~> 1.7 dependency; rest of the file unchanged.
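The context_window/1 change described for lib/console/ai/provider.ex above can be pictured with a small sketch. The module, Agent-based cache, and lookup function here are assumptions for illustration, not the actual Console code; the point is only that the cache key includes the client type so :default and :tool clients can resolve different window sizes.

```elixir
# Hypothetical sketch: a context-window cache keyed by client type
# (:default vs :tool), mirroring the review's description of provider.ex.
defmodule WindowCache do
  def start_link, do: Agent.start_link(fn -> %{} end, name: __MODULE__)

  # `lookup` stands in for the dynamic LLMDB resolution; it is only
  # invoked on a cache miss for the given client type
  def context_window(client \\ :default, lookup) when client in [:default, :tool] do
    Agent.get_and_update(__MODULE__, fn cache ->
      case Map.fetch(cache, client) do
        {:ok, window} ->
          {window, cache}

        :error ->
          window = lookup.(client)
          {window, Map.put(cache, client, window)}
      end
    end)
  end
end
```

Keying the cache on the client type (rather than a single global entry) is what allows the tool model and the default model to report different window sizes without clobbering each other.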

Comments Outside Diff (4)

  1. lib/console/ai/provider/openai.ex, line 88-95 (link)

    guess_protocol receives atom key instead of model string

    guess_protocol(model, method) is called with model as an atom field key (e.g. :model, :tool_model) passed from the callers — not the actual model string. The third clause defp guess_protocol("gpt-5" <> _, _) can therefore never match (an atom won't match a binary pattern), so auto-detection of gpt-5 models to use the Responses API will silently fall through to "openai_chat" even when method is :auto. The fix is to pass the resolved model string instead:
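A minimal self-contained sketch of the auto-detection logic being described (clause bodies and the :chat / :responses / :auto atoms are assumptions based on the review text, not the exact openai.ex source). It shows why the third clause can never match when the caller passes the atom field key: an atom never matches a binary prefix pattern, so the resolved model string must be passed instead.

```elixir
defmodule GuessProtocol do
  # explicit method settings always win over auto-detection
  def guess_protocol(_model, :chat), do: "openai_chat"
  def guess_protocol(_model, :responses), do: "openai_responses"
  # auto-detection: only matches when `model` is the resolved model
  # *string* -- an atom like :model or :tool_model falls through
  def guess_protocol("gpt-5" <> _, _), do: "openai_responses"
  def guess_protocol(_, _), do: "openai_chat"
end
```

With the bug, `guess_protocol(:model, :auto)` falls through to `"openai_chat"`; with the fix, `guess_protocol("gpt-5-mini", :auto)` correctly selects `"openai_responses"`.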

  2. lib/console/ai/provider/azure.ex, line 85-87 (link)

    Typo in private function name

    noramlize_url is misspelled — the letters r and m are transposed.

    The call site on line 74 (noramlize_url(base_url)) would need to be updated as well.

  3. go/controller/api/v1alpha1/deploymentsettings_types.go, line 719-733 (link)

    Dead commented-out code should be removed

    A large block of Nexus attribute-mapping code has been commented out rather than deleted. NexusSettings still has a live struct definition (line 1027), a Token() method, a DeepCopyInto in the generated code, and a field on AISettings (line 468), yet the attributes path is silently skipped. This creates a confusing half-removed state: the CRD schema accepts nexus config but it will never be forwarded to the API.

    Consider either removing the dead code and the still-unused NexusSettings struct entirely, or leaving a clear // TODO with a tracking issue if this is intentionally deferred.

  4. lib/console/graphql/deployments/settings.ex, line 119 (link)

    "chart completions" should be "chat completions"

    The description text says "chart completions" which is a typo. The same typo is present in multiple generated/schema files: schema/schema.graphql, assets/src/generated/graphql.ts, go/client/models_gen.go, and the CRD YAML files. Fixing it here in the source definition will naturally propagate to the generated files on the next codegen run.

Last reviewed commit: "drop handwritten sse..."

@michaeljguarino michaeljguarino force-pushed the implement-req-llm branch 3 times, most recently from 8640438 to e94cd91 on March 17, 2026 at 20:21
@michaeljguarino
Member Author

@greptileai

@michaeljguarino michaeljguarino force-pushed the implement-req-llm branch 15 times, most recently from 6856612 to b87b9e4 on March 18, 2026 at 03:41
@michaeljguarino
Member Author

@greptileai (also ignore the nexus deprecation, that's going to need work separately)

@greptile-apps
Contributor

greptile-apps bot commented Mar 18, 2026

Got it, that makes sense. The nexus deprecation is a separate concern from the ReqLLM migration work in this PR, so it's fine to leave that for later.

@michaeljguarino
Member Author

@greptileai

@michaeljguarino michaeljguarino force-pushed the implement-req-llm branch 2 times, most recently from b3c61a8 to d826288 on March 18, 2026 at 14:46
@michaeljguarino michaeljguarino force-pushed the implement-req-llm branch 19 times, most recently from 1fe519b to 66c34da on March 20, 2026 at 02:06
@michaeljguarino michaeljguarino force-pushed the implement-req-llm branch 3 times, most recently from dc865e9 to fb8abfa on March 20, 2026 at 04:37
@michaeljguarino michaeljguarino merged commit 55e59d7 into master Mar 20, 2026
53 of 56 checks passed
@michaeljguarino michaeljguarino deleted the implement-req-llm branch on March 20, 2026 at 16:55