feat(web_search): add provider arg to force a specific engine #1108

Open
mozaa-solana wants to merge 1 commit into nextlevelbuilder:dev from mozaa-solana:feat/web-search-provider-param

@mozaa-solana

Summary

Currently web_search resolves a tenant-configured provider chain (e.g. tavily → exa → brave) and stops at the first success. That's the right default for cost/latency, but it makes cross-engine corroboration impossible at the agent level: the caller has no way to ask "search the same query on Exa specifically" once Tavily already returned a hit.

This PR adds an optional provider argument. When set, the chain is narrowed to the named provider (case-insensitive); other providers are not queried. First-success-wins fallback is preserved when the arg is omitted.

Concrete use case

A research agent verifying source freshness:

  1. Call A: web_search(query="Solana DEX volume", freshness="pd", provider="tavily")
  2. Call B: web_search(query="Solana DEX volume", freshness="pd", provider="exa")
  3. URLs in both result sets → high-confidence original primary sources.
  4. URLs in only one engine → republish suspect; verify with web_fetch page metadata.

Without the provider argument, both calls resolve to the same chain and return the same cached Tavily payload, so the corroboration is meaningless.
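
The cross-check in steps 3 and 4 could look roughly like the sketch below on the agent side. `crossCheck` is a hypothetical helper, the example.com URLs are placeholders, and URLs are compared as exact strings (no normalization of trailing slashes or tracking parameters), which a real agent would likely want to add:

```go
package main

import "fmt"

// crossCheck splits URLs into those returned by both engines and those
// returned by only one. Output order follows first appearance (a, then b)
// and duplicates are dropped.
func crossCheck(a, b []string) (both, single []string) {
	inA := make(map[string]bool, len(a))
	for _, u := range a {
		inA[u] = true
	}
	inB := make(map[string]bool, len(b))
	for _, u := range b {
		inB[u] = true
	}

	seen := make(map[string]bool)
	for _, u := range append(append([]string{}, a...), b...) {
		if seen[u] {
			continue
		}
		seen[u] = true
		if inA[u] && inB[u] {
			both = append(both, u)
		} else {
			single = append(single, u)
		}
	}
	return both, single
}

func main() {
	tavily := []string{"https://example.com/report", "https://example.com/syndicated"}
	exa := []string{"https://example.com/report", "https://example.com/blog"}

	both, single := crossCheck(tavily, exa)
	fmt.Println("in both engines:", both)
	fmt.Println("in one engine only:", single)
}
```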

Changes

  • internal/tools/web_search.go
    • New optional provider arg in Parameters().
    • When passed, the resolved chain is narrowed to the matching provider (case-insensitive). An unknown name returns an error listing the providers that are configured for the tenant.
    • Cache key now includes requestedProvider so per-engine results don't collide.
  • internal/tools/web_search_provider_param_test.go (new)
    • 5 cases: narrowing, case-insensitivity, unknown provider error, default first-success preservation, cache isolation.
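
To illustrate why the cache key has to carry the requested provider, here is a rough sketch. `cacheKey`, its field order, and the separator are assumptions for illustration and do not reflect the actual key layout in web_search.go. Lowercasing the provider is also an assumption, made so that "Exa" and "exa" share one entry in line with the case-insensitive match:

```go
package main

import (
	"fmt"
	"strings"
)

// cacheKey is a hypothetical key builder. An empty requestedProvider
// stands for the default chain, so default results and per-engine
// results never share an entry.
func cacheKey(query, freshness, requestedProvider string) string {
	return strings.Join([]string{
		"web_search",
		strings.ToLower(requestedProvider),
		freshness,
		query,
	}, "\x1f") // unit separator keeps field boundaries unambiguous
}

func main() {
	q, f := "Solana DEX volume", "pd"

	fmt.Println(cacheKey(q, f, "tavily") == cacheKey(q, f, "exa")) // distinct engines
	fmt.Println(cacheKey(q, f, "Exa") == cacheKey(q, f, "exa"))    // same engine, different case
	fmt.Println(cacheKey(q, f, "") == cacheKey(q, f, "tavily"))    // default chain vs forced engine
}
```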

Backward compatibility

Fully backward-compatible — provider is optional. Existing callers (and existing cached entries during rollout) continue to work because the cache key change only affects new writes; old keys age out via TTL.

Test plan

  • go build ./... clean
  • go test ./internal/tools/... passes (new + existing)
  • Unit-tested all four code paths: narrowed-chain, case-insensitive match, unknown provider error, default fallback
  • Cache isolation verified (same query, two providers → both engines run)

Related

Used by the c02-choros content-research agent (Max) to implement multi-layer source-freshness verification — agent calls each engine separately and cross-checks URLs to detect republished/syndicated content slipping past Tavily/Exa's date filters. Currently shipped in our fork (mozaa-solana/goclaw); upstreaming so the capability is broadly available.

Currently the web_search tool resolves a tenant-configured provider
chain (e.g. tavily → exa → brave) and stops at the first success.
That's the right default for cost/latency, but it makes cross-engine
corroboration impossible at the agent level: the caller has no way to
ask "search the same query on Exa specifically" once Tavily has
already returned a hit.

Add an optional `provider` argument. When set, the chain is narrowed
to the named provider (case-insensitive); other providers are not
queried. The first-success-wins fallback is preserved when the arg
is omitted.

Use case (concrete): a research agent verifying source freshness
calls web_search twice for the same query — once with provider="tavily"
and once with provider="exa" — and compares the URLs returned. URLs
that appear in both engines are high-confidence original primary
sources; URLs that appear in only one suggest a republish or a
low-circulation outlet worth verifying further.

Other changes:

  - Cache key now includes the requested provider so per-engine
    results don't collide. Without this, the second call would just
    replay the first engine's cached result, defeating the purpose.
  - Unknown provider returns a clear error listing what IS configured
    for the tenant — better DX than a silent no-op.

Tests: 5 new cases covering narrowing, case-insensitivity, unknown
provider, default first-success behaviour preservation, and cache
isolation across providers.