fix: retire models, feature fixes #1299

Merged
harshiv-26 merged 9 commits into main from retire-models
Jun 8, 2026

Conversation

@harshiv-26

@harshiv-26 harshiv-26 commented Jun 5, 2026

Collaborator

Note

Medium Risk
Catalog-only changes, but marking models retired affects routing and discoverability for anyone still referencing those IDs; incorrect entries could hide usable models or leave bad ones selectable.

Overview
This PR updates provider model catalog YAML to reflect models that are no longer available or should not be selected, plus a couple of capability/param fixes.

Lifecycle: Many entries across google-gemini, google-vertex, deepinfra, and openrouter move to status: retired (from active or deprecated). Several Gemini 2.0 Flash / Flash Lite variants on Vertex also gain isDeprecated: true alongside retirement. OpenRouter listings for Mistral, Baidu ERNIE, Alibaba Tongyi, Arcee Trinity, and related Gemini 2.0 routes follow the same pattern.
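
To make the lifecycle change concrete, here is a minimal sketch of how a consumer of the catalog might treat `status`. The `CatalogEntry` shape, the `is_selectable` helper, and the rule that only non-deprecated `active` entries are routable are assumptions for illustration. The PR itself only confirms the `status` and `isDeprecated` fields and that the gateway tests skip deprecated or retired models.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CatalogEntry:
    """Hypothetical in-memory view of one catalog YAML entry."""
    provider: str
    model: str
    status: str = "active"       # "active", "deprecated", or "retired"
    is_deprecated: bool = False  # mirrors the isDeprecated flag


def is_selectable(entry: CatalogEntry) -> bool:
    """Assumed rule: retired or deprecated entries are not offered for routing."""
    return entry.status == "active" and not entry.is_deprecated


entries = [
    CatalogEntry("google-vertex", "gemini-2.0-flash", status="retired", is_deprecated=True),
    CatalogEntry("deepinfra", "ByteDance/Seed-2.0-code"),
]

# Only entries that pass the assumed rule remain discoverable.
selectable = [f"{e.provider}/{e.model}" for e in entries if is_selectable(e)]
print(selectable)
```

Under this assumed rule, a retired ID stays in the catalog for reference but drops out of the selectable set, which is the routing and discoverability risk the note above describes.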

Metadata tweaks: ByteDance/Seed-2.0-code on DeepInfra replaces the json_output feature with structured_output. anthropic/claude-opus-4-8 on Google Vertex adds removeParams: [temperature] so callers do not send an unsupported parameter.
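
The effect of `removeParams: [temperature]` can be sketched as a filter applied to the outgoing request. The `apply_remove_params` function and the request dictionary below are hypothetical; the gateway's actual implementation is not shown in this PR.

```python
from typing import Any


def apply_remove_params(request: dict[str, Any], remove_params: list[str]) -> dict[str, Any]:
    """Return a copy of the request without the parameters named in remove_params."""
    return {key: value for key, value in request.items() if key not in remove_params}


# Hypothetical caller request that still sets temperature.
request = {
    "model": "anthropic/claude-opus-4-8",
    "temperature": 0.2,
    "max_tokens": 256,
    "messages": [{"role": "user", "content": "Hello"}],
}

cleaned = apply_remove_params(request, remove_params=["temperature"])
print(sorted(cleaned))
```

Returning a copy rather than mutating the input keeps the caller's original request intact, which helps if the same request is retried against a provider that does accept the parameter.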

Reviewed by Cursor Bugbot for commit 123f1a5. Bugbot is set up for automated code reviews on this repo.

Comment thread providers/deepinfra/ByteDance/Seed-2.0-code.yaml
@harshiv-26

Collaborator Author

/test-models

@cursor cursor Bot left a comment

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Reviewed by Cursor Bugbot for commit 547fd5c.

Comment thread providers/deepinfra/ByteDance/Seed-2.0-code.yaml Outdated
@harshiv-26

Collaborator Author

Gateway test results

  • Total: 4
  • Passed: 0
  • Failed: 0
  • Validation failed: 0
  • Errored: 0
  • Skipped: 4
  • Success rate: 0.0%

| Provider | Model | Scenarios |
| --- | --- | --- |
| google-gemini | gemini-2.0-flash | skipped: skip-check |
| google-gemini | gemini-2.0-flash-001 | skipped: skip-check |
| google-gemini | gemini-2.0-flash-lite | skipped: skip-check |
| google-gemini | gemini-2.0-flash-lite-001 | skipped: skip-check |

Skipped (4)

google-gemini/gemini-2.0-flash — skip-check (skipped)

Skip reason
deprecated or retired model

google-gemini/gemini-2.0-flash-001 — skip-check (skipped)

Skip reason
deprecated or retired model

google-gemini/gemini-2.0-flash-lite — skip-check (skipped)

Skip reason
deprecated or retired model

google-gemini/gemini-2.0-flash-lite-001 — skip-check (skipped)

Skip reason
deprecated or retired model

@harshiv-26

Collaborator Author

Gateway test results

  • Total: 5
  • Passed: 4
  • Failed: 0
  • Validation failed: 0
  • Errored: 0
  • Skipped: 1
  • Success rate: 100.0%

| Provider | Model | Scenarios |
| --- | --- | --- |
| deepinfra | ByteDance/Seed-2.0-code | success: tool-call:stream, tool-call, params, params:stream |
| deepinfra | google/gemini-2.0-flash-001 | skipped: skip-check |

Successes (4)

deepinfra/ByteDance/Seed-2.0-code — tool-call:stream (success)

Output
{"location": "London"}
VALIDATION: tool-call stream SUCCESS

deepinfra/ByteDance/Seed-2.0-code — tool-call (success)

Output
Function: get_weather
Arguments: {"location": "London"}
VALIDATION: tool-call SUCCESS

deepinfra/ByteDance/Seed-2.0-code — params (success)

Output
The capital of France is Paris. It is also the country’s most populous city, situated along the Seine River in northern France. Paris serves as France
... (truncated, 198 chars omitted)

deepinfra/ByteDance/Seed-2.0-code — params:stream (success)

Output
The capital of France is **Paris**. It is the country’s most populous city, a global hub for art, fashion, and culture, and is situated along the Sein
... (truncated, 124 chars omitted)
Skipped (1)

deepinfra/google/gemini-2.0-flash-001 — skip-check (skipped)

Skip reason
deprecated or retired model

@harshiv-26

Collaborator Author

Gateway test results

  • Total: 6
  • Passed: 0
  • Failed: 0
  • Validation failed: 0
  • Errored: 0
  • Skipped: 6
  • Success rate: 0.0%

| Provider | Model | Scenarios |
| --- | --- | --- |
| google-vertex | gemini-2.0-flash | skipped: skip-check |
| google-vertex | gemini-2.0-flash-001 | skipped: skip-check |
| google-vertex | gemini-2.0-flash-lite | skipped: skip-check |
| google-vertex | gemini-2.0-flash-lite-001 | skipped: skip-check |
| google-vertex | google/gemini-2.0-flash-001 | skipped: skip-check |
| google-vertex | google/gemini-2.0-flash-lite-001 | skipped: skip-check |

Skipped (6)

google-vertex/gemini-2.0-flash — skip-check (skipped)

Skip reason
deprecated or retired model

google-vertex/gemini-2.0-flash-001 — skip-check (skipped)

Skip reason
deprecated or retired model

google-vertex/gemini-2.0-flash-lite — skip-check (skipped)

Skip reason
deprecated or retired model

google-vertex/gemini-2.0-flash-lite-001 — skip-check (skipped)

Skip reason
deprecated or retired model

google-vertex/google/gemini-2.0-flash-001 — skip-check (skipped)

Skip reason
deprecated or retired model

google-vertex/google/gemini-2.0-flash-lite-001 — skip-check (skipped)

Skip reason
deprecated or retired model

@harshiv-26

Collaborator Author

Gateway test results

  • Total: 1
  • Passed: 0
  • Failed: 0
  • Validation failed: 0
  • Errored: 0
  • Skipped: 1
  • Success rate: 0.0%

| Provider | Model | Scenarios |
| --- | --- | --- |
| openrouter | google/gemini-2.0-flash-001 | skipped: skip-check |

Skipped (1)

openrouter/google/gemini-2.0-flash-001 — skip-check (skipped)

Skip reason
deprecated or retired model

@harshiv-26

Collaborator Author

/test-models

@harshiv-26

Collaborator Author

Gateway test results

  • Total: 6
  • Passed: 0
  • Failed: 0
  • Validation failed: 0
  • Errored: 0
  • Skipped: 6
  • Success rate: 0.0%

| Provider | Model | Scenarios |
| --- | --- | --- |
| google-vertex | gemini-2.0-flash | skipped: skip-check |
| google-vertex | gemini-2.0-flash-001 | skipped: skip-check |
| google-vertex | gemini-2.0-flash-lite | skipped: skip-check |
| google-vertex | gemini-2.0-flash-lite-001 | skipped: skip-check |
| google-vertex | google/gemini-2.0-flash-001 | skipped: skip-check |
| google-vertex | google/gemini-2.0-flash-lite-001 | skipped: skip-check |

Skipped (6)

google-vertex/gemini-2.0-flash — skip-check (skipped)

Skip reason
deprecated or retired model

google-vertex/gemini-2.0-flash-001 — skip-check (skipped)

Skip reason
deprecated or retired model

google-vertex/gemini-2.0-flash-lite — skip-check (skipped)

Skip reason
deprecated or retired model

google-vertex/gemini-2.0-flash-lite-001 — skip-check (skipped)

Skip reason
deprecated or retired model

google-vertex/google/gemini-2.0-flash-001 — skip-check (skipped)

Skip reason
deprecated or retired model

google-vertex/google/gemini-2.0-flash-lite-001 — skip-check (skipped)

Skip reason
deprecated or retired model

@harshiv-26

Collaborator Author

Gateway test results

  • Total: 9
  • Passed: 6
  • Failed: 1
  • Validation failed: 1
  • Errored: 0
  • Skipped: 1
  • Success rate: 75.0%

| Provider | Model | Scenarios |
| --- | --- | --- |
| deepinfra | ByteDance/Seed-2.0-code | success: structured-output:stream, structured-output, tool-call, tool-call:stream, params, params:stream; failure: json-output; validation_failure: json-output:stream |
| deepinfra | google/gemini-2.0-flash-001 | skipped: skip-check |

Failures (2)

deepinfra/ByteDance/Seed-2.0-code — json-output:stream (validation_failure)

Error
Traceback (most recent call last):
  File "/tmp/tmpb229sbuo/snippet.py", line 24, in <module>
    raise Exception("VALIDATION FAILED: json-output stream - no content received")
Exception: VALIDATION FAILED: json-output stream - no content received
Code snippet
from openai import OpenAI

client = OpenAI(api_key="***", base_url="https://internal.devtest.truefoundry.tech/api/llm")

response = client.chat.completions.create(
    model="test-v2-deepinfra/ByteDance-Seed-2.0-code",
    messages=[
        {"role": "user", "content": "List 3 colors with their hex codes in JSON."},
    ],
    response_format={"type": "json_object"},
    stream=True,
)
import json as _json

_accumulated = ""
for chunk in response:
    if chunk.choices and len(chunk.choices) > 0:
        delta = chunk.choices[0].delta
        if delta.content is not None:
            _accumulated += delta.content
            print(delta.content, end="", flush=True)

if not _accumulated:
    raise Exception("VALIDATION FAILED: json-output stream - no content received")

_json.loads(_accumulated)
print("\nVALIDATION: json-output stream SUCCESS")

deepinfra/ByteDance/Seed-2.0-code — json-output (failure)

Error
Traceback (most recent call last):
  File "/tmp/tmptvzaib1b/snippet.py", line 5, in <module>
    response = client.chat.completions.create(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/openai/_utils/_utils.py", line 286, in wrapper
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/openai/resources/chat/completions/completions.py", line 1147, in create
    return self._post(
           ^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/openai/_base_client.py", line 1259, in post
    return cast(ResponseT, self.request(cast_to, opts, stream=stream, stream_cls=stream_cls))
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/openai/_base_client.py", line 1047, in request
    raise self._make_status_error_from_response(err.response) from None
openai.InternalServerError: Error code: 500 - {'status': 'failure', 'message': 'Invalid response received from deepinfra: {"error":{"message":"{\\"error\\":{\\"code\\":\\"InvalidParameter\\",\\"message\\":\\"The parameter `response_format.type` specified in the request are not valid: `json_object` is not supported by this model. Request id: 02178066230326885fbb282d67f6232524942e9d2086bad783ca8\\",\\"param\\":\\"response_format.type\\",\\"type\\":\\"BadRequest\\"}}","type":"api_error","param":null,"code":null}}', 'error': {'message': 'Invalid response received from deepinfra: {"error":{"message":"{\\"error\\":{\\"code\\":\\"InvalidParameter\\",\\"message\\":\\"The parameter `response_format.type` specified in the request are not valid: `json_object` is not supported by this model. Request id: 02178066230326885fbb282d67f6232524942e9d2086bad783ca8\\",\\"param\\":\\"response_format.type\\",\\"type\\":\\"BadRequest\\"}}","type":"api_error","param":null,"code":null}}', 'type': 'APIError', 'code': '500'}, 'error_origin_level': 'api_error', 'provider': 'deepinfra'}
Code snippet
from openai import OpenAI

client = OpenAI(api_key="***", base_url="https://internal.devtest.truefoundry.tech/api/llm")

response = client.chat.completions.create(
    model="test-v2-deepinfra/ByteDance-Seed-2.0-code",
    messages=[
        {"role": "user", "content": "List 3 colors with their hex codes in JSON."},
    ],
    response_format={"type": "json_object"},
    stream=False,
)
import json as _json

_content = response.choices[0].message.content
print(_content)

if not _content:
    raise Exception("VALIDATION FAILED: json-output - response content is empty")

_json.loads(_content)
print("VALIDATION: json-output SUCCESS")
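
For contrast with the rejected `{"type": "json_object"}` request above, the sketch below builds a schema-based request of the kind the structured-output scenarios presumably send. This thread does not show the actual structured-output test snippet, so the `structured_output_kwargs` helper, the schema name `event`, and the prompt are assumptions. The field names come from the structured-output result reported in this thread, and the `json_schema` response format follows the OpenAI Chat Completions convention. No network call is made.

```python
import json


def structured_output_kwargs(model: str, prompt: str) -> dict:
    """Build chat-completion kwargs that request schema-constrained JSON (hypothetical helper)."""
    schema = {
        "type": "object",
        "properties": {
            "name": {"type": "string"},
            "date": {"type": "string"},
            "participants": {"type": "array", "items": {"type": "string"}},
        },
        "required": ["name", "date", "participants"],
        "additionalProperties": False,
    }
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "response_format": {
            "type": "json_schema",
            "json_schema": {"name": "event", "strict": True, "schema": schema},
        },
    }


kwargs = structured_output_kwargs(
    "test-v2-deepinfra/ByteDance-Seed-2.0-code",
    "Alice and Bob are going to a science fair on Friday.",  # assumed prompt
)
# These kwargs could be passed as client.chat.completions.create(**kwargs).

# Check the structured-output result reported in this thread against the required keys.
sample = '{"name":"Science Fair","date":"Friday","participants":["Alice","Bob"]}'
parsed = json.loads(sample)
required = kwargs["response_format"]["json_schema"]["schema"]["required"]
missing = [key for key in required if key not in parsed]
print(missing)
```

This matches the metadata change in the PR: the model is advertised with `structured_output` instead of `json_output`, since the provider rejects `json_object` for it.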
Successes (6)

deepinfra/ByteDance/Seed-2.0-code — structured-output:stream (success)

Output
{"name": "Science Fair", "date": "Friday", "participants": ["Alice", "Bob"]}
VALIDATION: structured-output stream SUCCESS

deepinfra/ByteDance/Seed-2.0-code — structured-output (success)

Output
{"name":"Science Fair","date":"Friday","participants":["Alice","Bob"]}
VALIDATION: structured-output SUCCESS

deepinfra/ByteDance/Seed-2.0-code — tool-call (success)

Output
Function: get_weather
Arguments: {"location": "London"}
VALIDATION: tool-call SUCCESS

deepinfra/ByteDance/Seed-2.0-code — tool-call:stream (success)

Output
{"location": "London"}
VALIDATION: tool-call stream SUCCESS

deepinfra/ByteDance/Seed-2.0-code — params (success)

Output
The capital of France is Paris. Paris is not only the political capital but also a major cultural, economic, and historical center of the country. It 
... (truncated, 101 chars omitted)

deepinfra/ByteDance/Seed-2.0-code — params:stream (success)

Output
The capital of France is **Paris**. 

Paris is also France’s most populous city and a global hub for culture, art, politics, and economics, home to ic
... (truncated, 78 chars omitted)
Skipped (1)

deepinfra/google/gemini-2.0-flash-001 — skip-check (skipped)

Skip reason
deprecated or retired model
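
The success rates reported in this thread are consistent with dividing passed scenarios by the non-skipped total, with 0.0% reported when everything is skipped. That formula is inferred from the numbers here and is not documented in the PR. A small sketch under that assumption:

```python
def success_rate(total: int, passed: int, skipped: int) -> float:
    """Percentage of attempted (non-skipped) scenarios that passed; 0.0 if none were attempted."""
    attempted = total - skipped
    if attempted == 0:
        return 0.0
    return round(100.0 * passed / attempted, 1)


# Counts taken from the gateway test summaries in this thread.
print(success_rate(total=9, passed=6, skipped=1))  # 75.0
print(success_rate(total=5, passed=4, skipped=1))  # 100.0
print(success_rate(total=4, passed=0, skipped=4))  # 0.0
```

Read this way, a 0.0% rate on an all-skipped run means no scenario was attempted rather than that something regressed.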

@harshiv-26

Collaborator Author

Gateway test results

  • Total: 1
  • Passed: 0
  • Failed: 0
  • Validation failed: 0
  • Errored: 0
  • Skipped: 1
  • Success rate: 0.0%

| Provider | Model | Scenarios |
| --- | --- | --- |
| openrouter | google/gemini-2.0-flash-001 | skipped: skip-check |

Skipped (1)

openrouter/google/gemini-2.0-flash-001 — skip-check (skipped)

Skip reason
deprecated or retired model

@harshiv-26

Collaborator Author

Gateway test results

  • Total: 4
  • Passed: 0
  • Failed: 0
  • Validation failed: 0
  • Errored: 0
  • Skipped: 4
  • Success rate: 0.0%

| Provider | Model | Scenarios |
| --- | --- | --- |
| google-gemini | gemini-2.0-flash | skipped: skip-check |
| google-gemini | gemini-2.0-flash-001 | skipped: skip-check |
| google-gemini | gemini-2.0-flash-lite | skipped: skip-check |
| google-gemini | gemini-2.0-flash-lite-001 | skipped: skip-check |

Skipped (4)

google-gemini/gemini-2.0-flash — skip-check (skipped)

Skip reason
deprecated or retired model

google-gemini/gemini-2.0-flash-001 — skip-check (skipped)

Skip reason
deprecated or retired model

google-gemini/gemini-2.0-flash-lite — skip-check (skipped)

Skip reason
deprecated or retired model

google-gemini/gemini-2.0-flash-lite-001 — skip-check (skipped)

Skip reason
deprecated or retired model

@harshiv-26 harshiv-26 enabled auto-merge (squash) June 5, 2026 12:33
@harshiv-26 harshiv-26 merged commit 6ca568d into main Jun 8, 2026
8 checks passed
@harshiv-26 harshiv-26 deleted the retire-models branch June 8, 2026 11:11
2 participants