Description
Problem
Agents using the WebSearchTool can inadvertently access or return results from undesirable websites (e.g., irrelevant, malicious, paywalled, or policy-violating domains). This can lead to inefficient resource usage, security risks, and non-compliance with specific requirements.
Proposed solution
Add a sites_to_avoid parameter to the WebSearchTool (and potentially its underlying search implementation) that accepts a list of domain names or URL patterns. The tool should then filter out any search results or prevent direct access to URLs matching these patterns.
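As a minimal sketch of the matching behavior such a parameter could use (the `sites_to_avoid` parameter and `is_blocked` helper are hypothetical, not part of the current SDK), the tool could compare each result's hostname against exact domains, their subdomains, and glob-style patterns:

```python
from fnmatch import fnmatch
from urllib.parse import urlparse

def is_blocked(url: str, sites_to_avoid: list[str]) -> bool:
    """Return True if the URL's host matches any avoided domain or pattern."""
    host = urlparse(url).hostname or ""
    for pattern in sites_to_avoid:
        # Match the exact domain, any subdomain of it, or a glob pattern.
        if host == pattern or host.endswith("." + pattern) or fnmatch(host, pattern):
            return True
    return False
```

With this semantics, `sites_to_avoid=["example.com"]` would exclude both `example.com` and `ads.example.com`, while `["*.tracker.io"]` would exclude only subdomains of `tracker.io`.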
Is this not possible with the current options?
No, the current WebSearchTool does not provide any built-in mechanism to proactively restrict or filter web search results or visited sites based on a blacklist of domains or URL patterns.
Alternatives considered
An alternative would be to implement post-processing logic in the agent's code to filter the results returned by the WebSearchTool. However, this approach is reactive rather than proactive: it still incurs the cost and potential security implications of fetching data from undesirable sites before filtering. A dedicated sites_to_avoid parameter within the tool itself would be more efficient and secure, and would align better with the tool's responsibility of interacting with the web.
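For reference, the reactive workaround looks roughly like the following (the result shape — a list of dicts with a `"url"` key — is an assumption for illustration, not the SDK's actual return type). Note that by this point the blocked sites have already been queried and fetched:

```python
from urllib.parse import urlparse

BLOCKED_DOMAINS = {"paywalled.example", "malicious.example"}  # illustrative list

def filter_results(results: list[dict]) -> list[dict]:
    """Drop results whose hostname is a blocked domain or a subdomain of one."""
    kept = []
    for result in results:
        host = urlparse(result.get("url", "")).hostname or ""
        if any(host == d or host.endswith("." + d) for d in BLOCKED_DOMAINS):
            continue  # too late to avoid the fetch, but at least hide the result
        kept.append(result)
    return kept
```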
Additional context (optional)
This feature would significantly enhance the safety, efficiency, and control over agent web interactions.
Checklist
- I have searched the existing issues and have not found a similar feature request.
- I have verified that this feature is not already implemented in the latest version.
- I am willing to work on this feature and submit a pull request.