
fix: disable tool definitions during final answer/summary generation#159

Closed
octo-patch wants to merge 18 commits into MiroMindAI:main from octo-patch:fix/issue-158-disable-tools-in-final-summary
Conversation

@octo-patch
Contributor

Fixes #158

Problem

When generate_agent_summarize_prompt is invoked at the end of the main agent loop or sub-agent loop to request a final text answer, tool_definitions is still passed to the underlying LLM call. This means the model remains capable of invoking tools at the API level, even though the summarize prompt explicitly instructs it not to.

In practice this causes intermittent failures where the model — especially after hitting max_turns — enters a <think> block acknowledging it should produce a report, but then still emits a <use_mcp_tool> block. Because the model receives an empty tool result (no executor handles it in this phase), it produces no usable text, causing the retry loop to exhaust attempts and return a format error.

Solution

Pass [] for tool_definitions in both locations where the final answer is generated:

  • answer_generator.py → generate_final_answer_with_retries: the main-agent retry loop
  • orchestrator.py → handle_llm_call: the sub-agent final summary

By removing tool definitions at the API level, the model is physically prevented from calling tools and is forced to produce a text response, matching the intent of the summarize prompt.
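The shape of the fix can be sketched as follows. This is a minimal illustration with stub types; names like `FakeLLMClient`, `create_message`, and `generate_final_answer` are assumptions for the sketch, not the repository's actual API.

```python
from dataclasses import dataclass, field

@dataclass
class FakeResponse:
    text: str
    tool_calls: list = field(default_factory=list)

class FakeLLMClient:
    """Stub client: emits a tool call whenever tools are offered,
    mimicking the failure mode described in issue #158."""
    def create_message(self, messages, tool_definitions):
        if tool_definitions:  # model may still invoke a tool at the API level
            return FakeResponse(text="", tool_calls=["use_mcp_tool"])
        return FakeResponse(text="final report")  # no tools: forced text answer

def generate_final_answer(client, messages, tool_definitions):
    # The fix: pass [] regardless of what tools the agent loop carries.
    return client.create_message(messages, tool_definitions=[])

resp = generate_final_answer(FakeLLMClient(), [], ["search", "python"])
assert resp.text == "final report"
assert resp.tool_calls == []
```

With a real client the effect is the same: when the request advertises no tools, the API cannot return a tool-use block, so the summarize prompt and the request payload finally agree.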

Testing

The change is a two-line substitution (tool_definitions → []). Existing unit tests continue to pass. The fix eliminates the mismatch between prompt-level and API-level tool availability.

shawnlimn and others added 18 commits February 4, 2026 16:17
- Fix server_name routing: dynamically parse system prompt to build
  tool→server mapping, auto-correct wrong server_name in LLM responses
- Fix tool_name hallucination: python/python_code → run_python_code
  (only when system prompt defines run_python_code)
- Fix parameter names: code → code_block, add default sandbox_id
- Fix scrape_and_extract_info params: description → info_to_extract
- Add stateless sandbox fallback for invalid sandbox_id

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
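The correction pass described in this commit can be sketched roughly as below. The prompt format, alias tables, and helper names (`build_tool_server_map`, `correct_tool_call`) are illustrative assumptions, not the repository's actual code.

```python
import re

def build_tool_server_map(system_prompt):
    """Parse 'server: tool' pairs out of the system prompt (toy format)
    to build a tool -> server mapping."""
    return {
        tool: server
        for server, tool in re.findall(r"(\w+)\s*:\s*(\w+)", system_prompt)
    }

# Hallucinated tool names and wrong parameter names seen in LLM output.
TOOL_ALIASES = {"python": "run_python_code", "python_code": "run_python_code"}
PARAM_ALIASES = {"code": "code_block", "description": "info_to_extract"}

def correct_tool_call(call, tool_server_map):
    # Fix the tool name, but only when the real tool is actually defined.
    tool = call["tool_name"]
    alias = TOOL_ALIASES.get(tool)
    if alias and alias in tool_server_map:
        tool = alias
    # Auto-correct a wrong server_name from the parsed mapping.
    server = tool_server_map.get(tool, call["server_name"])
    params = {PARAM_ALIASES.get(k, k): v for k, v in call["params"].items()}
    return {"server_name": server, "tool_name": tool, "params": params}

m = build_tool_server_map(
    "python_server: run_python_code\nweb: scrape_and_extract_info")
fixed = correct_tool_call(
    {"server_name": "web", "tool_name": "python",
     "params": {"code": "print(1)"}}, m)
assert fixed == {"server_name": "python_server",
                 "tool_name": "run_python_code",
                 "params": {"code_block": "print(1)"}}
```

Guarding the rename on the tool actually appearing in the system prompt (as the commit notes) avoids "fixing" calls in contexts where run_python_code is not available.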
Add a description about the best practice recommendation.
…iroMindAI#137)

Previously, execute_tool_call() spawned a new subprocess and performed a full
stdio handshake on every single tool invocation (~400 times per BC task), adding
2-5 minutes of overhead per task.

This commit introduces persistent session management to ToolManager:
- Add _get_or_create_session() that lazily opens a stdio/SSE session and keeps
  it alive in an AsyncExitStack for the lifetime of the ToolManager
- Refactor execute_tool_call() to reuse the cached session instead of opening
  a new connection per call
- Refactor get_all_tool_definitions() similarly so sessions opened at startup
  are immediately available for subsequent tool calls
- Add close() method to cleanly shut down all sessions (including browser)
- Call close() in execute_task_pipeline() finally block to guarantee cleanup

This reduces N subprocess spawns (N = tool calls) down to one per server,
matching the approach already used by the playwright server.
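The session-caching pattern described above can be sketched with a stub session in place of the real MCP stdio session; names like `_get_or_create_session` mirror the commit message, but the implementation here is an illustrative assumption.

```python
import asyncio
from contextlib import AsyncExitStack, asynccontextmanager

SPAWNS = 0  # counts simulated subprocess spawns + stdio handshakes

@asynccontextmanager
async def open_session(server):
    global SPAWNS
    SPAWNS += 1  # stands in for spawning the server process
    yield f"session:{server}"

class ToolManager:
    def __init__(self):
        self._stack = AsyncExitStack()
        self._sessions = {}

    async def _get_or_create_session(self, server):
        if server not in self._sessions:  # open once, then reuse
            self._sessions[server] = await self._stack.enter_async_context(
                open_session(server))
        return self._sessions[server]

    async def execute_tool_call(self, server, tool):
        session = await self._get_or_create_session(server)
        return f"{session}/{tool}"

    async def close(self):  # called from the pipeline's finally block
        await self._stack.aclose()

async def main():
    tm = ToolManager()
    try:
        for _ in range(400):  # many tool calls, one spawn per server
            await tm.execute_tool_call("python", "run_python_code")
    finally:
        await tm.close()

asyncio.run(main())
assert SPAWNS == 1
```

Keeping the sessions inside a single AsyncExitStack means close() tears everything down in reverse order of creation, which is why a single finally-block call in execute_task_pipeline() is enough to guarantee cleanup.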
When generating the final answer or sub-agent summary, tool_definitions
was still passed to the LLM, allowing the model to make tool calls
instead of producing a text response. This caused intermittent failures
where the model would output tool call XML after the summarize prompt
even though the prompt explicitly forbade it.

Passing an empty list for tool_definitions at the API level enforces
that no tools are available during final summary generation, ensuring
the model produces a text answer as intended.

Fixes MiroMindAI#158


Development

Successfully merging this pull request may close these issues.

MiroThinker's miroflow-agent test runs output a concise \box{} answer by default; how can the research-report format shown on the official site be reproduced?

5 participants