
perf: trim redundant prose from built-in tool schemas #338

Merged: tianzhou merged 2 commits into main from perf/trim-tool-schema-redundancy-266 on Jun 23, 2026


@tianzhou tianzhou commented Jun 23, 2026

Member

Summary

In multi-source mode, execute_sql and search_objects are registered per source (execute_sql_prod, execute_sql_staging, …), so every byte of their description and parameter docs is replicated O(n) across sources in the MCP tools/list. This trims prose from the MCP schema that merely restates what the structured JSON Schema already encodes — lossless context savings on the MCP path.
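
To make the O(n) replication concrete, here is a minimal sketch. The `ToolEntry` shape and the `buildToolList` and `descriptionChars` helpers are hypothetical and not taken from this PR. Only the `execute_sql_<source>` naming pattern comes from the text above (the `search_objects_<source>` names are assumed by analogy), and the description strings are placeholders that use just the leading phrases.

```typescript
// Hypothetical shape of one entry in the MCP tools/list response.
interface ToolEntry {
  name: string;
  description: string;
}

// Sketch: each built-in tool is registered once per source, so the same
// description text is repeated once per source in the resulting list.
function buildToolList(
  sourceIds: string[],
  descriptions: Record<string, string>,
): ToolEntry[] {
  const tools: ToolEntry[] = [];
  for (const id of sourceIds) {
    for (const tool of Object.keys(descriptions)) {
      tools.push({ name: `${tool}_${id}`, description: descriptions[tool] as string });
    }
  }
  return tools;
}

// Total description characters in the list; grows linearly with source count.
function descriptionChars(tools: ToolEntry[]): number {
  return tools.reduce((sum, t) => sum + t.description.length, 0);
}

// Placeholder descriptions (leading phrases only, not the full text).
const descriptions: Record<string, string> = {
  execute_sql: "Execute SQL queries",
  search_objects: "Search and list database objects",
};

const oneSource = descriptionChars(buildToolList(["prod"], descriptions));
const threeSources = descriptionChars(
  buildToolList(["prod", "staging", "dev"], descriptions),
);
```

In this sketch `threeSources` is exactly three times `oneSource`, which is why every character removed from a shared description is saved once per configured source.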

Trimmed in the MCP path (search-objects.ts Zod schema + the shared tool description):

  • search_objects description: the (schemas, tables, columns, procedures, functions, indexes) list — duplicates the object_type enum
  • pattern param: ". Default: %" — already in .default("%")
  • limit param: " (default: 100, max: 1000)" — already in .default(100).max(1000)

On the MCP path the model still receives every fact via the schema's enum/default/maximum, so there is no information loss there.
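
As an illustration of why the trim is lossless on the MCP path, the sketch below models the input schema a client might receive for search_objects after this change. The `ParamSchema` type, the `describeConstraints` helper, the description wording, and the singular enum values are assumptions for illustration. Only the default `%`, the default `100`, and the maximum `1000` are taken from the PR text.

```typescript
// Hypothetical subset of JSON Schema keywords relevant to this PR.
interface ParamSchema {
  type: "string" | "integer";
  description?: string;
  enum?: string[];
  default?: string | number;
  maximum?: number;
}

// Assumed post-trim schema: descriptions no longer repeat the constraints,
// because enum/default/maximum already carry them in structured form.
const searchObjectsParams: Record<string, ParamSchema> = {
  object_type: {
    type: "string",
    // Enum values are assumed; the PR only lists the plural category names.
    enum: ["schema", "table", "column", "procedure", "function", "index"],
  },
  pattern: { type: "string", description: "Search pattern", default: "%" },
  limit: {
    type: "integer",
    description: "Maximum number of results",
    default: 100,
    maximum: 1000,
  },
};

// Rebuild the removed prose from structure alone, showing that nothing
// the old description strings stated has been lost.
function describeConstraints(p: ParamSchema): string {
  const parts: string[] = [];
  if (p.enum !== undefined) parts.push(`one of: ${p.enum.join(", ")}`);
  if (p.default !== undefined) parts.push(`default: ${p.default}`);
  if (p.maximum !== undefined) parts.push(`max: ${p.maximum}`);
  return parts.join(", ");
}

const limitFacts = describeConstraints(searchObjectsParams["limit"] as ParamSchema);
const patternFacts = describeConstraints(searchObjectsParams["pattern"] as ParamSchema);
```

Here `limitFacts` evaluates to `default: 100, max: 1000` and `patternFacts` to `default: %`, the same facts the trimmed prose used to state.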

REST API / web UI (/api/sources)

The REST tool metadata uses ToolParameter, which has no enum/default/maximum fields — so prose is the only place that info lives. The API copies therefore keep the descriptive text (Default: %, default: 100, max: 1000), and the object_type param description spells out its allowed values, since the shared tool description no longer lists them. The REST API is not in the model context, so keeping it verbose costs no tokens. (Thanks @copilot for catching this.)
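
The contrast with the MCP path can be sketched as follows. The field list of `ToolParameter` shown here (`name`, `type`, `required`, `description`) is an assumption; the PR only states that it has no enum/default/maximum fields. The description wording and the `required` flags are likewise illustrative, apart from the `Default: %` and `(default: 100, max: 1000)` fragments quoted above.

```typescript
// Assumed shape of the REST-side parameter metadata. The key point from the
// PR is what is absent: no enum, default, or maximum fields.
interface ToolParameter {
  name: string;
  type: string;
  required: boolean;
  description: string;
}

// With no structured fields available, constraints can only live in prose,
// so the API copy keeps the verbose descriptions.
const apiParams: ToolParameter[] = [
  {
    name: "object_type",
    type: "string",
    required: true, // assumed
    description:
      "Object type to search: schema, table, column, procedure, function, index", // wording assumed
  },
  {
    name: "pattern",
    type: "string",
    required: false, // assumed
    description: "Search pattern. Default: %",
  },
  {
    name: "limit",
    type: "integer",
    required: false, // assumed
    description: "Maximum number of results (default: 100, max: 1000)",
  },
];

// Look up one parameter's description by name; returns undefined if absent.
function apiDescription(name: string): string | undefined {
  for (const p of apiParams) {
    if (p.name === name) return p.description;
  }
  return undefined;
}
```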

Context

This came out of #266. The originally-proposed fix there was to collapse the per-source tools into a single execute_sql(sql, source_id) + a list_sources discovery tool (O(1) context). That collapse was rejected because it removes per-source tool identity in the MCP client — users would lose the ability to select/enable/permission a specific source's tool (e.g. always-deny writes to prod), since that only works while source_id is part of the tool name. Under MCP's flat tool list, per-source selectability and an O(1) tool count are mutually exclusive.

This PR is the alternative: keep per-source tools, which preserves selectability, and reduce the per-tool MCP token cost instead. It is the lossless floor: the tool count stays O(n), but the duplicated prose is removed from the MCP schema with no downside.
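
The selectability trade-off can be sketched with a name-keyed permission policy. The `Decision` type, the `policy` table, and the `decide` function are hypothetical and do not describe any particular MCP client; they only illustrate why a policy keyed on tool names needs the source to be part of the name.

```typescript
type Decision = "allow" | "deny";

// Hypothetical client-side policy keyed by tool name, e.g. always deny
// SQL execution against prod while allowing it against staging.
const policy: Record<string, Decision> = {
  execute_sql_prod: "deny",
  execute_sql_staging: "allow",
};

// Unlisted tools default to deny in this sketch.
function decide(toolName: string): Decision {
  const d: Decision | undefined = policy[toolName];
  return d === undefined ? "deny" : d;
}

// Per-source tools: the name alone distinguishes prod from staging.
const prodDecision = decide("execute_sql_prod");
const stagingDecision = decide("execute_sql_staging");

// Collapsed tool: every call uses the same name, with source_id passed as an
// argument, so a name-keyed policy yields one decision for all sources.
const collapsedDecision = decide("execute_sql");
```

With the collapsed design, distinguishing prod from staging would require the client to inspect call arguments rather than tool names, which a flat, name-based tool list does not provide.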

Test plan

  • pnpm test for search-objects, tool-metadata, execute-sql suites — 72 passing
  • Verified the leading phrases ("Execute SQL queries", "Search and list database objects") pinned by startsWith/indexOf assertions are preserved

🤖 Generated with Claude Code

The execute_sql/search_objects tools are registered per-source in
multi-source mode, so every byte of their description and parameter
docs is replicated O(n) across sources. Remove prose that merely
restates what the structured JSON Schema already encodes:

- search_objects description: drop the "(schemas, tables, columns,
  ...)" list, which duplicates the object_type enum
- pattern param: drop "Default: %" (already in .default("%"))
- limit param: drop "(default: 100, max: 1000)" (already in
  .default(100).max(1000))

Zero information loss — the model still receives every fact via the
JSON Schema's enum/default/maximum. Per-source tool identity is
preserved, so MCP-client per-source selection and permissioning are
unaffected.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings June 23, 2026 15:58

Copilot AI left a comment

Contributor

Pull request overview

Reduces repeated natural-language prose in built-in MCP tool descriptions/parameter docs to lower per-source tool-list context size in multi-source mode, relying on structured schema fields (enum/default/maximum) instead.

Changes:

  • Shortened search_objects tool description text in tool registration metadata.
  • Trimmed pattern and limit parameter descriptions in both the MCP Zod schema and the REST API “tool metadata” copy.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| src/utils/tool-metadata.ts | Trims search_objects tool/param prose used for MCP registration and for /api/sources tool metadata. |
| src/tools/search-objects.ts | Trims Zod .describe(...) strings for pattern and limit in the MCP tool input schema. |

Comment thread src/utils/tool-metadata.ts Outdated
Comment thread src/utils/tool-metadata.ts Outdated
The REST API tool metadata (/api/sources, web UI) uses ToolParameter,
which carries no enum/default/maximum — so prose is the only place that
info lives, unlike the MCP path where the JSON Schema encodes it.
Trimming those API-only copies lost information with no token benefit
(the REST API is not in the model context).

Per Copilot review:
- Restore "Default: %" on pattern and "(default: 100, max: 1000)" on
  limit in buildSearchObjectsTool (API copy)
- Add the allowed object_type values to its param description, since the
  shared tool description no longer lists them

The MCP Zod schema trims in search-objects.ts are unchanged — that's
where the per-source token win is, and the JSON Schema still carries
enum/default/maximum there.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Copilot AI left a comment

Contributor

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated no new comments.

@tianzhou tianzhou merged commit d59ff7c into main Jun 23, 2026
3 checks passed
@tianzhou tianzhou deleted the perf/trim-tool-schema-redundancy-266 branch June 23, 2026 17:44
2 participants