
Conversation

@avinash2692 (Contributor) commented Dec 24, 2025

@avinash2692 changed the title from "Fix OpenAI reasoning effort" to "fix: OpenAI reasoning effort" Dec 24, 2025
mergify bot commented Dec 24, 2025

Merge Protections

Your pull request matches the following merge protections and will not be merged until they are valid.

🟢 Enforce conventional commit

Wonderful, this rule succeeded.

Make sure that we follow https://www.conventionalcommits.org/en/v1.0.0/

  • title ~= ^(fix|feat|docs|style|refactor|perf|test|build|ci|chore|revert|release)(?:\(.+\))?:

@nrfulton nrfulton self-requested a review December 26, 2025 20:16
@nrfulton (Contributor) left a comment


LGTM. Requested a comment.

@avinash2692 : after adding the comment please feel free to break glass and merge without a code review.

@avinash2692: FYI, NVM. I'm rolling the fix to #274 into this PR. That fix requires discussion with the rest of the core team to make sure there are no objections or subtle bugs introduced by defaulting to self._base_url = None for the OpenAI backend.

Comment on lines 634 to 638
# Build optional reasoning parameters
reasoning_params = {}
if thinking is not None:
    reasoning_params["reasoning_effort"] = thinking

@nrfulton (Contributor):

Please also add a comment noting that OpenAI doesn't like it when non-reasoning models get reasoning parameters.

@avinash2692 (Author):

done!
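
For context, a plausible shape of the amended snippet after this exchange (the comment wording here is an assumption, not a quote of the merged diff):

# Build optional reasoning parameters. Note that OpenAI rejects requests
# that pass reasoning parameters (e.g. reasoning_effort) to models that are
# not reasoning models, so these are only attached when explicitly requested.
reasoning_params = {}
if thinking is not None:
    reasoning_params["reasoning_effort"] = thinking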

@nrfulton commented Dec 26, 2025

FYI @jakelorocco and @HendrikStrobelt:

We currently default to ollama when the user instantiates the OpenAI backend without a self._base_url. This PR will change that default behavior.

Now, if the user types:

import mellea

mellea.start_session(backend_name="openai", api_key="...")

then the OpenAI SDK's default endpoint will be used.

However, if the user types:

import mellea

mellea.start_session(backend_name="openai")

without an API key argument, then the ollama backend will be used and a warning will be printed.

Note that the user must still explicitly provide an api_key kwarg. Having an OpenAI API key in their ENV is not enough. This should prevent accidental expenses / data leaks while still providing less surprising behavior for users who are using the OpenAI backend.
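
A minimal sketch of the defaulting behavior described above; the class and attribute names follow the snippets quoted later in this thread, but the exact signature is an assumption:

import warnings

class OpenAIBackend:
    def __init__(self, api_key: str | None = None, base_url: str | None = None):
        if api_key is None and base_url is None:
            # No explicit api_key kwarg: assume a local ollama instance and
            # warn. An OPENAI_API_KEY env var alone does not select the
            # OpenAI platform, preventing accidental expenses / data leaks.
            warnings.warn("No api_key provided; assuming a local ollama instance.")
            self._base_url: str | None = "http://localhost:11434/v1"
            self._api_key: str | None = "ollama"
        else:
            # An explicit api_key defers to the OpenAI SDK, whose default
            # endpoint (api.openai.com) is used when base_url is None.
            self._base_url = base_url
            self._api_key = api_key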

@nrfulton changed the title from "fix: OpenAI reasoning effort" to "fix: OpenAI base_url default and reasoning effort model option." Dec 26, 2025
This default is changed because the default base_url is also changed.
The OpenAI response_format only accepts a limited set of schemas and
will error out with a 400 if you do not follow their guidelines.

One of these guidelines is that additionalProperties must be present and set
to False.

This commit monkey-patches the response_format provided to OpenAI
platform backends, and leaves other OpenAI-"compatible" backends with
the existing default behavior. This is a debatable choice.

See https://community.openai.com/t/api-rejects-valid-json-schema/906163 and https://platform.openai.com/docs/guides/structured-outputs?api-mode=chat
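
A hypothetical sketch of the kind of schema rewrite the commit describes; the function name and recursion details are assumptions, not code from this PR:

def constrain_schema(schema: dict) -> dict:
    """Recursively set additionalProperties: False on every object schema."""
    if schema.get("type") == "object":
        # OpenAI's structured-output validation 400s unless this is False.
        schema.setdefault("additionalProperties", False)
    for key in ("properties", "$defs", "definitions"):
        for sub in schema.get(key, {}).values():
            if isinstance(sub, dict):
                constrain_schema(sub)
    if isinstance(schema.get("items"), dict):
        constrain_schema(schema["items"])
    return schema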
@nrfulton commented Jan 3, 2026

We should also review @jakelorocco's notes in #141 before merging.

@nrfulton nrfulton linked an issue Jan 3, 2026 that may be closed by this pull request
@nrfulton nrfulton self-requested a review January 3, 2026 16:07
Comment on lines 155 to 160
if api_key is None:
    FancyLogger.get_logger().warning(
        "You are using an OpenAI backend with no api_key. Because no API key was provided, mellea assumes you intend to use the openai-compatible interface to your local ollama instance. If you intend to use OpenAI's platform you must specify your API key when instantiating your Mellea session/backend object."
    )
    self._base_url: str | None = "http://localhost:11434/v1"  # ollama
    self._api_key = "ollama"
Contributor:

I think we are close to defaults that make sense here. I think if the user specifies a base_url we should always use that base_url (even if no apikey is set). I also wonder if we should default the apikey to ollama in those situations.

Otherwise, we have no way to target arbitrary localhost ports that don't require an apikey.

For example (and this isn't the best since it uses LiteLLM and we have a separate backend for that), LiteLLM has a proxy that you can run locally. This proxy stores the apikey information itself; so you can target an arbitrary localhost port without an apikey.

My proposed solution would be to just set the parameter default values to work for the ollama version (i.e., api_key="ollama" and base_url="http://localhost:11434/v1"). Then users can override these values. I think this would also allow users to explicitly set api_key / base_url to None and have the underlying OpenAI SDK still automatically pick up their env vars (without the risk of users accidentally incurring expenses).
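
A sketch of this proposal, with the signature assumed for illustration:

class OpenAIBackend:
    def __init__(
        self,
        api_key: str | None = "ollama",
        base_url: str | None = "http://localhost:11434/v1",
    ):
        # Defaults target a local ollama instance. Users can override them
        # for other servers, or pass None explicitly to let the OpenAI SDK
        # fall back to its own env vars (OPENAI_API_KEY, OPENAI_BASE_URL).
        self._api_key = api_key
        self._base_url = base_url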

Contributor:

Consensus: just pass the args through to the openai sdk. Don't do argument handling such as this.

@jakelorocco (Contributor) commented:

> We should also review @jakelorocco's notes in #141 before merging.

This fixes everything but the model id selection mentioned in #141. I'm ambivalent about whether we actually want to handle that "smartly" or just make users choose the specific name themselves when we pick wrongly. We could develop a heuristic for it based on base_urls, but that seems prone to errors. If others agree, we can close #141 when this PR is merged.

@nrfulton commented Jan 5, 2026

> > We should also review @jakelorocco's notes in #141 before merging.
>
> This fixes everything but the model id selection mentioned in #141. I'm ambivalent about whether we actually want to handle that "smartly" or just make users choose the specific name themselves when we pick wrongly. We could develop a heuristic for it based on base_urls, but that seems prone to errors. If others agree, we can close #141 when this PR is merged.

Discussed in leads.

In the initializer, these lines of code need to change:
https://github.com/generative-computing/mellea/blob/e7e161bc14f4ba419d5bcf69954686cbe86ff75b/mellea/backends/openai.py#L144C1-L154C1

New logic (a sketch follows the list):

  1. If model_id : str, then self._model_id = model_id is used.
  2. If model_id : ModelIdentifier, then self._model_id = model_id.openai_name is used.
  3. Delete all references to the self._hf_model_id field and replace them with self._model_id.
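
A sketch of steps 1 and 2; ModelIdentifier and openai_name are referenced above, while the helper name is made up for illustration:

def _resolve_model_id(model_id: "str | ModelIdentifier") -> str:
    # Step 1: a plain string is used verbatim as the model id.
    if isinstance(model_id, str):
        return model_id
    # Step 2: a ModelIdentifier maps to its OpenAI-facing name.
    return model_id.openai_name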

Also ensure that this is dead code and then delete:

def apply_chat_template(self, chat: list[dict[str, str]]):
    """Apply the chat template for the model, if such a model is available (e.g., when it can deduce the huggingface model id)."""
    from transformers import AutoTokenizer

    if not hasattr(self, "_tokenizer"):
        assert self._base_url, (
            "The OpenAI Platform does not support adapters. You must specify a _base_url when using adapters."
        )
        match _server_type(self._base_url):
            case _ServerType.LOCALHOST:
                self._tokenizer: "PreTrainedTokenizer" = (  # noqa: UP037
                    AutoTokenizer.from_pretrained(self._hf_model_id)
                )
            case _ServerType.OPENAI:
                raise Exception(
                    "apply_chat_template is called while targeting a server at openai.com. "
                    "This is not supported --- openai.com does not support Activated Lora. "
                    "Use a locally served vllm instance. "
                )
    return self._tokenizer.apply_chat_template(chat, tokenize=False)

Resolution assigned to @avinash2692



Development

Successfully merging this pull request may close these issues.

  • OpenAI backend is hard to use
  • OpenAI backend passes invalid reasoning_effort to non-reasoning models
  • investigate defaults for openai backend
