fix(gateway): re-apply tool rate limiter after system_configs overlay#1111
Open
nguyennguyenit wants to merge 1 commit intodevfrom
Open
fix(gateway): re-apply tool rate limiter after system_configs overlay#1111nguyennguyenit wants to merge 1 commit intodevfrom
nguyennguyenit wants to merge 1 commit intodevfrom
Conversation
setupToolRegistry creates the rate limiter from cfg.Tools.RateLimitPerHour during early bootstrap (gateway.go line 137). System_configs DB overlay runs ~50 lines later via cfg.ApplySystemConfigs (line 191), so any DB override of tools.rate_limit_per_hour was silently lost - the limiter object was already initialised from the JSON5 default. Symptom in production: editing tools.rate_limit_per_hour via system_configs table or the config HTTP API had no effect on running gateways. Operators had to inject a config.json file to change the value, defeating the DB-as-source-of-truth pattern that other tunables rely on. Re-apply the limiter after ApplySystemConfigs runs. Safe ordering: server has not started yet, no in-flight tool calls, and SetRateLimiter is a plain field assignment with no shutdown cost on the discarded limiter. nil case (rate_limit_per_hour <= 0) also handled so DB writes can disable the limiter without restart. Tests: new TestRegistry_SetRateLimiter_ReplacesPriorLimiter covers both the replace-with-higher-limit path and the nil-disables path. All existing rate limiter tests still pass. Note: the same ordering pattern likely affects other config consumed inside setupToolRegistry (tools.scrub_credentials, MCP server wiring). Out of scope for this hotfix - they need a deeper restructure.
🔍 Code Review — fix(gateway): re-apply tool rate limiter after system_configs overlay🎯 Tổng quanHotfix: rate limiter init từ config trước khi DB overlay chạy → DB value bị silent loss. Production: Zalo bot burst past limit, operator bump DB → no effect. Scope: 2 files, +48/-0, 1 commit 🤖 CI + MergeCI: go: PENDING, web: PASSED (52s) ✅ Điểm Tốt
📊 SummaryNo CRITICAL, no MEDIUM. Clean hotfix. 💡 Recommendation🟢 APPROVED Wait for CI go:complete before merging. 🚀 |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
setupToolRegistryinitialises the tool rate limiter fromcfg.Tools.RateLimitPerHourearly in startup (gateway.go line 137), butsystem_configsDB overlay (cfg.ApplySystemConfigs) runs ~50 lines later. DB overrides oftools.rate_limit_per_hourwere silently lost - the limiter object was already built from the JSON5 default.ApplySystemConfigsso the DB value wins. Server has not started yet, no in-flight tool calls.nilcase handled too so DB writes can disable rate limiting without restart.Why this matters
Production observation: a Zalo sales-bot session burst past the 150/h default and operators tried to bump it via
system_configsUPDATE + container restart. No effect. Workaround was to inject/app/data/config.jsonwith{"tools":{"rate_limit_per_hour":800}}, which contradicts the DB-as-source-of-truth pattern other tunables rely on.Tests
TestRegistry_SetRateLimiter_ReplacesPriorLimitercovers the replace-with-higher-limit path and thenil-disables path.go build ./...,go build -tags sqliteonly ./...,go vet ./...clean.Out of scope
The same ordering pattern likely affects other config consumed inside
setupToolRegistry(tools.scrub_credentials, MCP server wiring from DB). Those need a broader restructure - kept out of this hotfix.Test plan
dev, deploy, setUPDATE system_configs SET value='800' WHERE key='tools.rate_limit_per_hour', restart container, verify log linetool rate limiting reapplied from system_configs per_hour=800.tool rate limiting enabled per_hour=<json5_default>log still fires earlier - the reapply log is additive.0, restart, verify limiter is disabled (notool rate limit exceedederrors regardless of call volume).