feat(hooks): add thinking_budget hook for extended thinking management #5

nmarasoiu · 2025-12-10T10:34:41Z

Note: Not sure if this is useful.

In Claude Code, keywords like think, think hard, and ultrathink already go a long way toward setting a thinking budget.
For non-Claude thinking models, the HTTP parameters might be different, so I'm not sure this would work beyond Anthropic.

Add a new hook that manages Claude's extended thinking budget_tokens parameter:

- Inject default `budget_tokens` when thinking is enabled but budget is missing
- Override `budget_tokens` when below configurable minimum threshold
- Ensure budget < `max_tokens` (API constraint)
- Optionally inject thinking configuration for thinking-capable models

Key design decisions:

- Trust caller: if request has `thinking`, adjust budget regardless of model
- Model filter only applies to `inject_if_missing` mode
- Anthropic-specific: non-Anthropic providers will ignore `thinking` field
- Simple config via hook params (no env var complexity)

Configuration:

- `budget_default`: Default budget (10000)
- `budget_min`: Minimum threshold (1024)
- `inject_if_missing`: Auto-inject thinking (false)
- `log_modifications`: Log changes (true)
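
For illustration, the rules above could be sketched roughly as follows. This is a minimal sketch, not the code in this PR: the function name `apply_thinking_budget` and the `model_supports_thinking` flag are hypothetical, the request is assumed to be a plain dict shaped like an Anthropic Messages request, and a budget below `budget_min` is assumed to be raised to `budget_min` (the description does not say what value it is overridden with). Logging (`log_modifications`) is left out.

```python
from copy import deepcopy
from typing import Any


def apply_thinking_budget(
    request: dict[str, Any],
    budget_default: int = 10000,
    budget_min: int = 1024,
    inject_if_missing: bool = False,
    model_supports_thinking: bool = False,  # hypothetical stand-in for the model filter
) -> dict[str, Any]:
    """Return a copy of `request` with an adjusted thinking budget."""
    data = deepcopy(request)  # never mutate the caller's request
    thinking = data.get("thinking")

    if thinking is None:
        # The model filter only matters when injecting a missing thinking block.
        if not (inject_if_missing and model_supports_thinking):
            return data
        thinking = {"type": "enabled"}
        data["thinking"] = thinking
    elif thinking.get("type") != "enabled":
        # Thinking is present but not enabled: leave it alone.
        return data

    # Trust the caller: from here on the budget is adjusted regardless of model.
    budget = thinking.get("budget_tokens")
    if budget is None:
        budget = budget_default
    elif budget < budget_min:
        budget = budget_min  # assumption: raise to the minimum, not the default

    # The budget must stay strictly below max_tokens.
    # Note: for a very small max_tokens this can fall below budget_min;
    # a real hook would have to decide how to handle that case.
    max_tokens = data.get("max_tokens")
    if max_tokens is not None and budget >= max_tokens:
        budget = max_tokens - 1

    thinking["budget_tokens"] = budget
    return data
```

With the defaults, a request that enables thinking without a budget and sets `max_tokens` to 4096 would end up with `budget_tokens` of 4095, since the default of 10000 is clamped to stay below `max_tokens`.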

starbased-co · 2025-12-20T02:43:32Z

Thanks for the PR, and another thank you for your patience. After looking through this, I kind of see where you're going with it, but the first thing I'd want from you is a more focused value proposition.

Some things to note:

- LiteLLM provides a unified reasoning support check via `supports_reasoning(model)`.
- You can configure params for thinking, max tokens, etc. per model in the LiteLLM config.
- LiteLLM already supports some transformations between providers, e.g. Gemini and Anthropic thinking params are basically interchangeable, and `reasoning_effort` is converted to a token budget for some providers and dropped for others.

nmarasoiu force-pushed the feature/thinking-budget-hook branch from 6b20e94 to 4ffd06a on December 10, 2025 10:46

nmarasoiu force-pushed the feature/thinking-budget-hook branch from 4ffd06a to 09bb994 on December 10, 2025 10:59
