feat(gateway): add downstream keepalive for non-stream compact responses #2976
Open
bud-primordium wants to merge 1 commit into
While waiting for upstream compact responses, periodically write blank lines to the downstream connection and flush, preventing reverse proxies (e.g. Cloudflare Tunnel, Nginx) from closing the connection due to idle timeout.

New config: gateway.openai_compact_nonstream_keepalive_interval (seconds)
Default 0 (disabled); valid range 5-60 when enabled.

Refs Wei-Shaw#2773, Wei-Shaw#2243
Contributor: All contributors have signed the CLA. ✅
Author: I have read the CLA Document and I hereby sign the CLA
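Background
When /responses/compact requests are forwarded through a reverse proxy such as Cloudflare Tunnel or Nginx, the upstream compact endpoint may return no bytes for a long time while the model is processing. Once the silence exceeds the proxy's idle/read timeout (about 100-120 seconds on Cloudflare Free, for example), the proxy closes the connection, Sub2API hits broken pipe when writing the response, and the client reports stream disconnected before completion.

The current compact passthrough forces stream=false through normalizeOpenAIPassthroughOAuthBody. Even after it has received the upstream response, the non-streaming path first reads the complete body (ReadUpstreamResponseBody) and only then writes it downstream in one shot with c.Data(). Throughout this time the downstream connection may carry zero bytes.

The streaming path already has stream_keepalive_interval (SSE comment heartbeats), but compact cannot benefit from it because it is forced to non-streaming. #2243 tried switching compact to SSE streaming, but verification showed that the upstream compact endpoint emits no SSE events during processing and that TCP keepalive cannot address application-layer idle timeouts, so that approach was closed.

This PR uses a different mechanism: instead of relying on the upstream to send data, Sub2API actively writes blank-line heartbeats downstream. The implementation is modeled on CLIProxyAPI's nonstream-keepalive-interval (StartNonStreamingKeepAlive).

Changes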
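New config option gateway.openai_compact_nonstream_keepalive_interval (seconds); default 0 (disabled), valid range 5-60 when enabled.

When enabled, for non-streaming /responses/compact passthrough requests:
- After forwardOpenAIPassthrough() has finished body normalization, local policy checks, access token acquisition, and upstream request construction, and before httpUpstream.Do(), it starts a goroutine that writes \n downstream and calls Flush() at the configured interval.
- handleNonStreamingResponsePassthrough or handlePassthroughSSEToJSON stops the heartbeat before writing downstream, then writes the JSON body.

The JSON standard allows leading whitespace, and the tests verify that json.Valid(bytes.TrimSpace(body)) holds.

Trade-off
Once the heartbeat writes its first \n, the HTTP status is committed as 200. If the upstream then returns >=400, the correct status code can no longer be passed to the client and 429/529 failover is no longer triggered; the error body can only be proxied as-is. In that case a compact_keepalive_committed=true diagnostic log entry is recorded. If no heartbeat has been written, the existing failover and error handling are unchanged.

The feature is disabled by default and should be enabled only when a deployment explicitly needs to work around a reverse proxy idle timeout.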
Configuration
Environment variable:
GATEWAY_OPENAI_COMPACT_NONSTREAM_KEEPALIVE_INTERVAL=15

Tests
Starts with \n and json.Valid(TrimSpace(body))

Production verification
Deployed and verified on a self-hosted Cloudflare Tunnel (Free) + VPS setup with a 15-second keepalive interval: Codex CLI compact requests no longer hit broken pipe / stream disconnected.

Related