fix(gateway): preserve system block cache control during rewrite #2949
Open
black-06 wants to merge 2 commits into
Keep original system text blocks as separate migrated message content blocks instead of joining them into one string, and preserve each block's cache_control metadata when rewriting non-Claude-Code requests.
Contributor: All contributors have signed the CLA. ✅
Author: I have read the CLA Document and I hereby sign the CLA
Summary
In rewriteSystemForNonClaudeCode, do not merge the original system content into a single string. Instead, migrate it block by block, including each block's cache_control.
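Background
Problem
While testing cache hits with sub2api as the base_url, I found that when system block caching is controlled manually, repeated requests never hit the cache.
The minimal reproduction request is: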
    {
      "model": "claude-opus-4-7",
      "max_tokens": 64,
      "system": [
        { "type": "text", "text": "system prompt 1 ..." },
        {
          "type": "text",
          "text": "system prompt 2 ...",
          "cache_control": { "type": "ephemeral" }
        }
      ],
      "messages": [
        { "role": "user", "content": "reply with one short sentence." }
      ]
    }

sub2api injects system into messages, and the rewrite affects the corresponding cache_control (the merged text block no longer carries it):
    {
      "model": "claude-opus-4-7",
      "max_tokens": 64,
      "messages": [
        {
          "role": "user",
          "content": [
            {
              "text": "[System Instructions]\nsystem prompt 1 ...\n\nsystem prompt 2 ...",
              "type": "text"
            }
          ]
        },
        { "role": "assistant", "content": "Understood. I will follow these instructions." },
        { "role": "user", "content": "Request 1: reply with one short sentence." }
      ],
      "system": ...
    }

Solution
Migrate each system text block as a whole into a user message content block, preserving its cache_control attribute.

      {
        "model": "claude-opus-4-7",
        "max_tokens": 64,
        "messages": [
          {
            "role": "user",
            "content": [
    -         { "text": "[System Instructions]\nsystem prompt 1 ...\n\nsystem prompt 2 ...", "type": "text" }
    +         { "text": "[System Instructions]", "type": "text" },
    +         { "text": "system prompt 1 ...", "type": "text" },
    +         { "text": "system prompt 2 ...", "type": "text", "cache_control": { "type": "ephemeral" } }
            ]
          },
          { "role": "assistant", "content": "Understood. I will follow these instructions." },
          { "role": "user", "content": "Request 1: reply with one short sentence." }
        ],
        "system": ...
      }
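The block-wise migration can be sketched roughly as follows. This is an illustrative Python sketch, not the gateway's actual code: the function name `migrate_system_to_messages` and the two constants are hypothetical, and it only builds the new `messages` list, since the PR elides what ends up in the top-level `system` field.

```python
from copy import deepcopy
from typing import Any, Dict, List, Union

# Hypothetical constants mirroring the strings shown in the PR's example payloads.
SYSTEM_HEADER = "[System Instructions]"
ACK_TEXT = "Understood. I will follow these instructions."

Block = Dict[str, Any]


def migrate_system_to_messages(
    system: Union[str, List[Block], None],
    messages: List[Block],
) -> List[Block]:
    """Prepend the system prompt to `messages` as separate content blocks.

    Each system text block is copied whole, so any `cache_control` it
    carries survives, instead of the texts being joined into one string.
    """
    if isinstance(system, str):
        # A plain string system prompt becomes a single text block.
        blocks = [{"type": "text", "text": system}] if system else []
    else:
        # Copy only text blocks; deepcopy keeps nested cache_control intact.
        blocks = [
            deepcopy(b)
            for b in (system or [])
            if isinstance(b, dict) and b.get("type") == "text"
        ]

    if not blocks:
        # Nothing to migrate: return the messages unchanged (as a new list).
        return list(messages)

    content = [{"type": "text", "text": SYSTEM_HEADER}, *blocks]
    return [
        {"role": "user", "content": content},
        {"role": "assistant", "content": ACK_TEXT},
        *messages,
    ]
```

Applied to the reproduction request above, the first user message would contain three text blocks, and only the last one ("system prompt 2 ...") would carry `cache_control`, matching the `+` lines in the diff.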