Skip to content

feat(temporal): anti-drift injection for all provider requests#2574

Open
dwillitzer wants to merge 2 commits into
router-for-me:mainfrom
dwillitzer:feat/temporal-anti-drift
Open

feat(temporal): anti-drift injection for all provider requests#2574
dwillitzer wants to merge 2 commits into
router-for-me:mainfrom
dwillitzer:feat/temporal-anti-drift

Conversation

@dwillitzer
Copy link
Copy Markdown

Summary

  • Adds temporal anti-drift system that injects <temporal> tags into every outbound request to all providers
  • Format-aware injection: Claude format (top-level system array) vs OpenAI format (messages array) — no 400 errors
  • Single injection point in conductor covers all providers (Antigravity, Claude, GLM, Qwen, OpenRouter) without per-executor changes

Why This Matters

In long agent sessions, AI models hallucinate dates — wrong days, wrong weeks, wrong months. This cascades into incorrect tool calls, wrong file paths, and stale context decisions. By injecting fresh temporal context on every request, models stay grounded in the current date/time.

Pattern validated in pi-rs where TemporalDriftDetector found real drift in sessions >20 turns.

Changes

  • New: internal/temporal/temporal.go — temporal tag builder + format-aware payload injection (~120 lines)
  • Modified: sdk/cliproxy/auth/conductor.go — injects before Execute() and ExecuteStream()
  • Zero PII — only UTC timestamps, no user data

Test plan

  • go build -o cliproxyapi ./cmd/server/ — compilation succeeds
  • Claude API requests no longer return 400 (format-aware system injection)
  • OpenAI-format requests receive temporal system message in messages array
  • All providers route correctly with injection enabled
  • Smoke test: curl -s -X POST localhost:8317/v1/chat/completions -H "Authorization: Bearer $KEY" -H "Content-Type: application/json" -d '{"model":"claude-sonnet-4-6","messages":[{"role":"user","content":"what day is it"}],"max_tokens":100}'

🤖 Generated with Claude Code

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces model health monitoring, OpenRouter metadata enrichment, and a temporal injection feature that adds current date and time context to LLM requests. It also expands model metadata in API responses and updates routing logic for Claude-related User-Agents. Feedback focuses on thread-safety issues in the new temporal package, specifically recommending the use of atomic operations for the global request counter to prevent race conditions.

}

// requestCounter tracks how many requests have been seen for interval-based injection.
var requestCounter uint64
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The global variable requestCounter is not thread-safe. In a concurrent environment like a web server, this will lead to race conditions when multiple requests are processed simultaneously. Consider using sync/atomic to increment the counter safely.

Comment thread internal/temporal/temporal.go Outdated
if cfg.InjectInterval <= 0 {
return true
}
requestCounter++
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The increment operation on requestCounter is not atomic. Use atomic.AddUint64(&requestCounter, 1) to ensure thread safety.

Suggested change
requestCounter++
atomic.AddUint64(&requestCounter, 1)

Comment thread internal/temporal/temporal.go Outdated
return true
}
requestCounter++
return requestCounter%uint64(cfg.InjectInterval) == 0
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The read operation on requestCounter is not atomic. Use atomic.LoadUint64(&requestCounter) to ensure thread safety.

Suggested change
return requestCounter%uint64(cfg.InjectInterval) == 0
return atomic.LoadUint64(&requestCounter)%uint64(cfg.InjectInterval) == 0

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 71a6b44de3

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment thread internal/temporal/temporal.go Outdated
Comment on lines +84 to +85
system.ForEach(func(_, v gjson.Result) bool {
newSystem = append(newSystem, v.Value())
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Preserve scalar system prompts during Claude injection

This rewrite path is entered whenever a top-level system key exists, but injectIntoClaudeSystem only preserves existing entries by iterating system.ForEach and never normalizes scalar system values. The repo still supports string system prompts elsewhere (e.g. Claude executor handles system strings), so payloads like {"system":"...","messages":[...]} can have their original system instruction dropped or reshaped when temporal injection runs, changing model behavior.

Useful? React with 👍 / 👎.

Comment on lines +103 to +104
var newMessages []any
newMessages = append(newMessages, temporalMsg)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Avoid prepending temporal message before Imagen prompt parse

Prepending the temporal message at messages[0] breaks the Imagen compatibility path that derives prompt text from the first non-empty messages.#.content entry (internal/runtime/executor/gemini_vertex_executor.go loop at lines 125-130). After this change, messages-format Imagen requests can use the injected temporal block as the prompt instead of the user’s prompt, producing incorrect image generations.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 115cb5f0ab

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment on lines +70 to +75
// OpenAI format: messages array exists
if messages := gjson.GetBytes(payload, "messages"); messages.IsArray() {
return injectIntoMessages(payload, messages, text)
}

return payload
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Inject temporal tags for Gemini-style payloads

InjectIntoPayload only handles top-level system and messages, then returns the payload unchanged for other schemas. Native Gemini/Vertex requests use contents payloads, so this path skips temporal injection entirely and anti-drift is not applied on those provider routes, despite this change advertising provider-wide request injection.

Useful? React with 👍 / 👎.

globalRegistry := registry.GetGlobalRegistry()

// Get all available models with full metadata
models := globalRegistry.GetAvailableModels("openai")
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Report health from all registered models

GetModelsHealth starts from GetAvailableModels("openai"), which excludes models whose clients are fully suspended/unavailable. That causes /models/health to omit exactly the models operators need to debug during outages, contradicting the handler’s stated goal of reporting all registered models.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 50ed89fcbd

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment thread sdk/cliproxy/auth/conductor.go Outdated

// Inject temporal context into the request payload
if temporal.ShouldInject(temporal.DefaultConfig()) {
req.Payload = temporal.InjectIntoPayload(req.Payload, req.Model)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Resolve model alias before deciding temporal injection

This injects temporal text based on req.Model before alias/model-pool resolution, so the image-model safeguard can be bypassed when users call an aliased route model (for example, img) that later resolves to an Imagen upstream model. In that case InjectIntoPayload sees a non-image alias and prepends a system message, then executeMixedOnce rewrites to the Imagen model (execReq.Model = upstreamModel), and Imagen prompt extraction can consume the injected block instead of the user prompt. Fresh evidence: alias remapping happens only after this line, so the prior Imagen fix does not apply to aliased image routes.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: efb9735bfa

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

enhancedModels = append(enhancedModels, enhancedModel)
}
// Build the sources map indicating where context_length came from
sources := registry.BuildModelSources(globalRegistry)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P0 Badge Define newly referenced OpenRouter registry functions

This change introduces calls to registry.BuildModelSources, registry.GetOpenRouterLastRefresh, registry.TriggerOpenRouterRefresh, and registry.StartOpenRouterEnrichment, but no corresponding definitions are added in the internal/registry package. That leaves these references unresolved at build time, so server/management packages cannot compile until the symbols are implemented (or the calls are removed).

Useful? React with 👍 / 👎.

Comment on lines +83 to +85
// OpenAI format: messages array exists
if messages := gjson.GetBytes(payload, "messages"); messages.IsArray() {
return injectIntoMessages(payload, messages, text)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Inject temporal tags for OpenAI Responses payloads

InjectIntoPayload only injects when a top-level system or messages field is present, and otherwise returns the body unchanged. Requests sent through /v1/responses use the Responses schema (input/instructions) and are forwarded as-is, so temporal anti-drift is skipped for that endpoint family despite this feature being applied in the central conductor path.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: e8ed002894

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment on lines +180 to +182
if info := registry.GetModelInfo(modelID, ""); info != nil {
info.ContextLength = ctxLen
enrichedCount++
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Update registry entries instead of cloned model info

enrichModelsFromOpenRouter calls GetModelInfo and assigns info.ContextLength, but GetModelInfo returns a cloned ModelInfo, so this write is discarded and never updates r.models. That makes enrichment effectively a no-op even though enrichedCount is incremented and logs say models were enriched, leaving downstream model metadata unchanged.

Useful? React with 👍 / 👎.

lastRefresh := registry.GetOpenRouterLastRefresh()
c.JSON(http.StatusOK, gin.H{
"status": "refreshed",
"enriched_count": count,
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Return true enriched count in refresh response

RefreshModels publishes enriched_count from TriggerOpenRouterRefresh, but that function returns the size of the OpenRouter cache (len(contextLength)), not how many registered models were enriched in this refresh. This can report misleadingly large counts (or wrong error semantics when cache is empty due fetch failure), so operators cannot trust the endpoint output.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown
Collaborator

@luispater luispater left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Summary:
I found two blocking issues in this change set.

Key findings:

  • Blocking: the temporal injector is enabled unconditionally via temporal.DefaultConfig() in the execution path, so every proxied request now gets a hidden system prompt with no user opt-in and no configuration escape hatch. That changes request semantics for all providers and breaks API compatibility rather than preserving it.
  • Blocking: internal/registry/openrouter_enrichment.go adds a new http.Client{Timeout: 30 * time.Second} / context.WithTimeout fetch path. This repo explicitly restricts new timeout-driven network behavior to a small set of documented exceptions, and this enrichment path is not one of them.

Test plan:

  • Reviewed the temporal injection flow and the new registry enrichment loop plus management endpoints.

Copy link
Copy Markdown
Collaborator

@luispater luispater left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Summary:
I like the goal, but this PR currently mixes the temporal injection feature with unrelated registry/management work and introduces a couple of blockers.

Blocking findings:

  • sdk/cliproxy/auth/conductor.go hard-codes temporal.DefaultConfig() at the injection call sites instead of wiring the feature to runtime config, which makes this always-on request mutation impossible to disable or roll back if it causes provider regressions.
  • internal/registry/openrouter_enrichment.go adds a new long-lived background HTTP poller with its own timeout/refresh policy. That is a separate runtime/network behavior change from the temporal feature and should be reviewed independently instead of landing as part of this PR.

Test plan:

  • Reviewed the temporal injection path, the new management endpoints, and the OpenRouter enrichment worker.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 8fb9745510

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment thread sdk/cliproxy/auth/conductor.go Outdated
func (m *Manager) temporalConfig() temporal.Config {
cfg, _ := m.runtimeConfig.Load().(*internalconfig.Config)
if cfg != nil {
return cfg.Temporal
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Fall back to temporal defaults when config omits section

temporalConfig() returns cfg.Temporal whenever a runtime config is loaded, but Config.Temporal.Enabled defaults to false when the temporal block is absent in YAML. Because SetConfig always stores a non-nil config, ShouldInject sees Enabled=false and skips injection for all normal deployments unless users explicitly add temporal.enabled: true, which silently disables the feature this commit introduces.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 956ce7f040

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment on lines +1234 to +1235
if temporal.ShouldInject(m.temporalConfig()) {
execReq.Payload = temporal.InjectIntoPayload(execReq.Payload, upstreamModel)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Apply temporal injection in CountTokens path

Temporal rewriting is added for Execute/ExecuteStream here, but executeCountMixedOnce still sends the original payload to CountTokens. Endpoints that preflight with token-count APIs (for example Claude/Gemini count routes via ExecuteCountWithAuthManager) will undercount because the injected temporal system block is missing there, so a request can pass counting and then fail real execution with token/context limits.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 33f0a789b3

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment thread sdk/cliproxy/auth/conductor.go Outdated
Comment on lines +285 to +286
if cfg != nil && cfg.Temporal != nil {
return *cfg.Temporal
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Merge temporal defaults when config block is partially set

temporalConfig() returns *cfg.Temporal as-is whenever the block exists, so omitted YAML fields keep Go zero-values. A config like temporal: { inject_interval: 5 } leaves Enabled=false, causing ShouldInject to skip all requests even though defaults are documented as enabled-by-default; this silently disables the feature for partial configs instead of applying defaults plus overrides.

Useful? React with 👍 / 👎.

Comment on lines +191 to +192
for modelID, ctxLen := range openRouterContextLengths {
openRouterStore.contextLength[modelID] = ctxLen
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Cache OpenRouter source using local model IDs

The enrichment cache stores keys from the OpenRouter response (openRouterContextLengths), but source lookup later is done with local registry IDs via GetOpenRouterContextLengthSource(modelID) in BuildModelSources. For non-exact matches (the same function explicitly supports cases like local gemini-3.1-pro matched from provider-prefixed IDs), source attribution will miss and /models/health reports provider/static even when context length came from OpenRouter.

Useful? React with 👍 / 👎.

Comment on lines +137 to +138
// Get all registered models and enrich those lacking context_length
allModels := registry.GetAvailableModels("openai")
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Include unavailable models in enrichment pass

The enrichment loop uses GetAvailableModels("openai"), which excludes models whose clients are currently suspended/unavailable, but this function is intended to enrich registered models lacking context_length. During outages or quota suspension, those models are skipped entirely and remain unenriched until they become available again, undermining the management health metadata this change introduces.

Useful? React with 👍 / 👎.

@dwillitzer dwillitzer force-pushed the feat/temporal-anti-drift branch from 1124bcd to 375e840 Compare April 17, 2026 22:54
dwillitzer added a commit to dwillitzer/CLIProxyAPI that referenced this pull request Apr 17, 2026
Adds a background enrichment job that populates `context_length` metadata
for registered models by fetching OpenRouter's public models catalog
(https://openrouter.ai/api/v1/models). Runs once on startup and then
refreshes every 24 hours.

## Why

Clients that surface model capabilities (e.g. "supports 1M context")
currently rely on hand-maintained entries in internal/registry/models/
models.json. OpenRouter already publishes authoritative context_length
values for every routed model; consuming that upstream eliminates the
maintenance burden for any model OpenRouter fronts and keeps the catalog
accurate as providers bump windows.

## Behavior

- Startup fetch populates the in-memory enrichment store; if the fetch
  fails (network, upstream outage, etc.) the registry falls back to the
  static JSON with no service impact.
- 24-hour periodic refresh keeps the cache fresh without hot-pathing
  the OpenRouter API.
- Management endpoint exposes enriched metadata so the UI and clients
  can display accurate context windows.
- model_registry.RefreshModels() now writes through to the live
  registration so the enriched count is observable and returned.

## Timeout pattern

Uses `http.Client{Timeout: 30s}` + `context.WithTimeout` — structurally
identical to the existing precedent at
`internal/misc/antigravity_version.go` (commits 3774b56, 8d5e470),
which was accepted upstream for the same class of periodic metadata
fetch. Happy to tighten the timeout or rewrite the pattern if the
maintainer prefers — just point at the approved shape.

## Scope

This PR is intentionally scoped to OpenRouter enrichment only. The
temporal-anti-drift feature that was previously bundled in the same
branch is now split out to router-for-me#2574.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@dwillitzer
Copy link
Copy Markdown
Author

@luispater — force-pushed a full rewrite addressing both blocking findings. This PR is now temporal-only; the OpenRouter enrichment you flagged separately is split out to #2880.

Blocker #1 — unconditional injection / no opt-in

  • DefaultConfig() now returns {Enabled: false}. With the temporal: YAML block absent, behavior is identical to a non-temporal build. No surprise request mutation.
  • Opt-in is explicit:
    temporal:
      enabled: true
      inject_interval: 0   # 0 = every request; N = every Nth request
  • Config.Temporal is *temporal.Config (pointer) so YAML-omitted and explicit-zero are distinguishable at runtime.

Blocker #2 — OpenRouter enrichment

Scope

  • Single commit, +389 / -5 (was 9 commits, +667 / -29).
  • Only touches: internal/temporal/, internal/config/config.go (adds Temporal *temporal.Config field), sdk/cliproxy/auth/conductor.go (hook + live config resolution + reload log), internal/runtime/executor/claude_executor.go + test (preserve temporal across Claude Code-shaped system rewrite), internal/runtime/executor/antigravity_executor.go (re-inject after signature validation).

Also fixed under this scope

Two injection-stripping paths that were silent in the prior revision:

  1. Claude OAuth cloakingcheckSystemInstructionsWithSigningMode rebuilds the system array and the OAuth sanitizer drops forwarded user-system content. The rebuilder now preserves any injected <temporal_context> block and excludes it from the sanitizer.
  2. Antigravity signature validationreq.Payload = originalPayload after validation reverted the conductor's mutation. The executor now re-applies injection at all three generation sites (Execute, streaming variant, retry stream). CountTokens intentionally left alone.

Verified

End-to-end against a running instance:

  • GLM (openai-compat): prompt_tokens 33 → 120; <temporal_context> visible in request-log.
  • Claude via Antigravity: input_tokens 34 → 145; model reproduced the element verbatim.
  • Unit tests cover: injection, dedup guard, image-model skip, opt-in default, explicit enable, disabled, Claude OAuth preservation regression.

Ready for re-review when you have time.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 375e840514

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment on lines +710 to +712
if temporal.ShouldInject(temporal.ConfigOrDefault(e.cfg.Temporal)) {
req.Payload = temporal.InjectIntoPayload(req.Payload, baseModel)
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Avoid double-counting temporal interval in Antigravity paths

The Antigravity executor calls ShouldInject again after signature validation even though the conductor already evaluated it for the same outbound attempt, and req.Payload is reset to originalPayload before this second check. That means interval mode increments the global counter twice per Antigravity request; for example, with inject_interval: 2, the second check is always true so Antigravity requests are injected every time instead of every second request, and other interval values are similarly phase-shifted.

Useful? React with 👍 / 👎.

@dwillitzer
Copy link
Copy Markdown
Author

Resolved merge conflict with upstream main in internal/runtime/executor/claude_executor_test.go.

Conflict: upstream renamed a test comment ("survive conversion" → "survive forwarding"). Our temporal preservation regression test was adjacent.

Resolution: Kept the TestCheckSystemInstructionsWithSigningMode_PreservesTemporalContext test, adopted upstream's comment wording. Build passes, all executor + temporal tests pass.

Still awaiting @luispater re-review of the clean rewrite (both original blockers addressed in 375e840).

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 827545ff30

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment on lines +1564 to +1566
if strings.Contains(txt, "<temporal_context>") {
preservedTemporal = txt
return false
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Validate temporal marker before bypassing OAuth sanitization

This branch preserves any system text containing "<temporal_context>" and later excludes matching blocks from the forwarded/sanitized user-system path, so in OAuth cloaking mode a caller can wrap arbitrary instructions in a fake temporal wrapper and have them forwarded unsanitized in system. That bypasses the cloaking behavior that was intended to strip third-party prompt structure, and it was introduced by the new preservation logic.

Useful? React with 👍 / 👎.

Comment thread internal/temporal/temporal.go Outdated
Comment on lines +136 to +137
if bytes.Contains(payload, []byte("<temporal ")) {
return payload
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Match escaped temporal tags in dedup guard

The dedup check only looks for raw "<temporal " bytes, but JSON payloads frequently encode < as \u003c (including output produced by sjson in this codepath). When a request already contains an escaped temporal tag, this guard misses it and injects a second temporal block, producing duplicated/conflicting temporal context instead of the intended no-op.

Useful? React with 👍 / 👎.

@dwillitzer
Copy link
Copy Markdown
Author

Addressed both findings from the Codex review on 827545ff:

P1 — Validate temporal marker before bypassing OAuth sanitization (389f2e30)

  • Added temporal.IsValidTemporalBlock() which regex-validates that preserved content matches the BuildTemporalTag output shape: <temporal_context><temporal day="..." date="YYYY-MM-DD" .../></temporal_context>.
  • The Claude executor now calls this before preserving a temporal block in the rebuilt system array. Arbitrary content wrapped in <temporal_context> tags is no longer forwarded unsanitized — it falls through to the normal sanitization path.

P2 — Match escaped temporal tags in dedup guard (389f2e30)

  • The InjectIntoPayload dedup check now also matches < (<-escaped <) in addition to raw <temporal . Prevents double injection when an upstream serializer encodes angle brackets.

Test fix: The PreservesTemporalContext test was constructing malformed JSON (unescaped inner quotes in the temporal tag text). gjson silently truncated it. Fixed to properly escape inner quotes via strings.ReplaceAll.

Build passes, all temporal and executor tests green.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 389f2e303c

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment on lines +1340 to +1341
if temporal.ShouldInject(m.temporalConfig()) {
execReq.Payload = temporal.InjectIntoPayload(execReq.Payload, upstreamModel)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Count temporal interval once per inbound request

temporal.ShouldInject(...) is evaluated inside the per-model execution loop, so a single user request can increment the global interval counter multiple times when fallback models/providers are tried. That breaks the documented inject_interval semantics (“every Nth request”) and makes injection depend on retry behavior (e.g., interval=2 can inject on the first request if the first candidate fails). Compute the inject decision once per inbound request and reuse it for all retries/attempts (same pattern also appears in the streaming path).

Useful? React with 👍 / 👎.

// Re-apply temporal injection after signature validation (which forces
// req.Payload back to the raw original). Without this, the conductor's
// anti-drift hook is silently stripped for every Antigravity request.
if temporal.ShouldInject(temporal.ConfigOrDefault(e.cfg.Temporal)) {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Guard nil config before antigravity temporal check

These new calls dereference e.cfg.Temporal without checking whether e.cfg is nil. NewAntigravityExecutor accepts a *config.Config and existing tests construct it with nil, so any valid request that reaches this path will panic with a nil-pointer dereference. Add an e.cfg nil guard (or pass a nil *temporal.Config to ConfigOrDefault) before reading .Temporal.

Useful? React with 👍 / 👎.

// <temporal_context> wrapper around a single self-closing <temporal .../> tag
// with recognizable date attributes and nothing else.
var validTemporalRe = regexp.MustCompile(
`^<temporal_context><temporal\s+day="[A-Z][a-z]+" date="\d{4}-\d{2}-\d{2}"[^>]*/></temporal_context>$`,
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Tighten temporal validator to block attribute smuggling

IsValidTemporalBlock is intended to prevent bypassing OAuth sanitization, but the regex allows arbitrary extra attributes via [^>]* after the required day/date fields. In checkSystemInstructionsWithSigningMode, any block passing this validator is preserved in system and excluded from sanitized user-system forwarding, so callers can still inject instructions in added attribute values. Fresh evidence: the new validator itself permits attacker-controlled attributes because of this wildcard.

Useful? React with 👍 / 👎.

dwillitzer and others added 2 commits May 6, 2026 04:43
Adds an optional temporal anti-drift system that injects a short
<temporal_context> XML element into the system prompt of outbound LLM
requests. Long-running agent sessions (>20 turns) frequently hallucinate
the current date; grounding every request in fresh wall-clock context
eliminates the drift.

Addresses the two blocking findings on the prior revision:

1. OPT-IN (was: unconditionally enabled)
   - DefaultConfig() now returns {Enabled: false}. When the `temporal`
     YAML block is omitted, behavior is identical to the non-temporal
     build. No surprise request mutation.
   - Enable via:
       temporal:
         enabled: true
         inject_interval: 0

2. CONFIG WIRING (was: hardcoded temporal.DefaultConfig() at call sites)
   - Config.Temporal is a *temporal.Config pointer, preserving the
     nil-vs-explicit-zero distinction YAML can't otherwise express.
   - Conductor reads the live runtime config on every request via
     Manager.temporalConfig(); no reference to DefaultConfig() in the
     hot path.
   - ConfigOrDefault() helper exposes the same resolution logic to
     downstream executors that need to re-apply injection.

Cross-cutting correctness:

- Claude OAuth (claude_executor.go): checkSystemInstructionsWithSigningMode
  rebuilds the system array to mimic Claude Code's shape. It now
  preserves any injected <temporal_context> block across the rewrite and
  excludes it from the OAuth-sanitized forwarded user-system text, so
  injection survives both the rewrite and the sanitizer.

- Antigravity (antigravity_executor.go): validateAntigravityRequestSignatures
  followed by `req.Payload = originalPayload` discards any conductor
  mutation. The executor now re-applies temporal.InjectIntoPayload after
  validation at all three generation sites (Execute, streaming, retry
  stream). CountTokens is intentionally left alone — it never reaches
  generation.

- Other executors (OpenAI-compat, Codex, Gemini) don't overwrite
  req.Payload, so the single conductor-level hook is sufficient.

Safety properties (covered by unit tests):

- No PII — only UTC and local timestamps.
- Format-aware — Claude (top-level `system`) vs OpenAI (`messages`).
- Dedup guard — payload already containing <temporal skips injection.
- Image-model skip — imagen / image-gen are never modified.
- Thread-safe — request counter uses sync/atomic.

Observability:

- Manager.SetConfig logs the effective temporal state on startup and on
  every reload: `temporal injection state: enabled=%t interval=%d source=%s`
  where source is `default` (YAML omitted) or `config` (explicit block).
  Helps diagnose the class of reload-state regressions that are very
  hard to catch otherwise.

Files:
- internal/temporal/temporal.go         (+package, +doc)
- internal/temporal/temporal_test.go    (6 unit tests)
- internal/config/config.go             (Temporal *Config field)
- sdk/cliproxy/auth/conductor.go        (hook + config resolution + log)
- internal/runtime/executor/claude_executor.go         (OAuth preserve)
- internal/runtime/executor/claude_executor_test.go    (regression test)
- internal/runtime/executor/antigravity_executor.go    (re-inject x3)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
P1: Prevent OAuth cloaking bypass via fake <temporal_context> wrappers.
The Claude executor now validates that preserved temporal blocks match
the shape produced by BuildTemporalTag (regex-checked) before bypassing
sanitization. Arbitrary content in a <temporal_context> wrapper is
no longer forwarded unsanitized in the system array.

P2: The dedup guard now also checks for <-escaped temporal tags,
which some JSON serializers produce. This prevents double injection
when the payload already contains an escaped temporal block.

Also fixes the PreservesTemporalContext test to properly JSON-escape
inner quotes in the hardcoded payload (the previous version produced
malformed JSON that gjson silently truncated).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@dwillitzer dwillitzer force-pushed the feat/temporal-anti-drift branch from 389f2e3 to 0da0939 Compare May 6, 2026 11:52
dwillitzer added a commit to dwillitzer/CLIProxyAPI that referenced this pull request May 6, 2026
Adds a background enrichment job that populates `context_length` metadata
for registered models by fetching OpenRouter's public models catalog
(https://openrouter.ai/api/v1/models). Runs once on startup and then
refreshes every 24 hours.

## Why

Clients that surface model capabilities (e.g. "supports 1M context")
currently rely on hand-maintained entries in internal/registry/models/
models.json. OpenRouter already publishes authoritative context_length
values for every routed model; consuming that upstream eliminates the
maintenance burden for any model OpenRouter fronts and keeps the catalog
accurate as providers bump windows.

## Behavior

- Startup fetch populates the in-memory enrichment store; if the fetch
  fails (network, upstream outage, etc.) the registry falls back to the
  static JSON with no service impact.
- 24-hour periodic refresh keeps the cache fresh without hot-pathing
  the OpenRouter API.
- Management endpoint exposes enriched metadata so the UI and clients
  can display accurate context windows.
- model_registry.RefreshModels() now writes through to the live
  registration so the enriched count is observable and returned.

## Timeout pattern

Uses `http.Client{Timeout: 30s}` + `context.WithTimeout` — structurally
identical to the existing precedent at
`internal/misc/antigravity_version.go` (commits 3774b56, 8d5e470),
which was accepted upstream for the same class of periodic metadata
fetch. Happy to tighten the timeout or rewrite the pattern if the
maintainer prefers — just point at the approved shape.

## Scope

This PR is intentionally scoped to OpenRouter enrichment only. The
temporal-anti-drift feature that was previously bundled in the same
branch is now split out to router-for-me#2574.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 0da09392ea

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment on lines +150 to +151
if messages := gjson.GetBytes(payload, "messages"); messages.IsArray() {
return injectIntoMessages(payload, messages, text)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Preserve native Claude payload shape when injecting

When temporal injection is enabled for a native Claude Messages request that omits the optional top-level system field, this branch treats the messages array as OpenAI chat format and prepends a role: "system" message. The Claude executor later translates claudeclaude via the registry fallback, so that injected message is forwarded unchanged instead of being moved into top-level system (the repo’s OpenAI→Claude translator does that conversion explicitly). Native Claude messages only support conversational roles, so otherwise valid Claude requests without system start failing once temporal injection is enabled; pass the source format into this helper or inject a top-level Claude system block for Claude-shaped payloads.

Useful? React with 👍 / 👎.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants