Commit 2cacc0f

busla and claude committed
fix(agentcore): complete fix for circular import deadlock
The initial fix (f6bc8d2) was incomplete. It restored direct imports in litellm/__init__.py but missed two critical functions that proxy_server.py imports at module level:

1. load_credentials_from_list (proxy_server.py:56)
2. _add_custom_logger_callback_to_specific_event (proxy_server.py:464)

When these functions weren't pre-imported, Python had to dynamically import them from utils.py, which triggered "import litellm" at utils.py:56, recreating the circular dependency chain that prevented Uvicorn from starting.

This complete fix adds both functions to the import list, breaking the circular dependency cycle.

Related to PR BerriAI#17171 which introduced lazy loading on Dec 3, 2025.

Fixes: ECS health check failures where migrations complete but Uvicorn never starts, resulting in UNHEALTHY containers.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
1 parent f6bc8d2 commit 2cacc0f

2 files changed: +127 -0 lines changed

issue.md

Lines changed: 125 additions & 0 deletions
@@ -0,0 +1,125 @@
## Bug: PR #17171 introduces circular import deadlock that prevents proxy server from starting

### Description

PR #17171 ("Lazy-load utils to reduce memory + import time") introduced a circular import deadlock that prevents the FastAPI proxy server from starting. This causes health check failures in containerized deployments (ECS, Kubernetes, etc.) because Uvicorn never initializes the HTTP server.

### Symptoms

- ✅ Container starts successfully
- ✅ Database migrations complete
- ✅ APScheduler background jobs start
- ❌ **Uvicorn HTTP server never starts** (no "Started server process" or "Uvicorn running" logs)
- ❌ Health check endpoints (`/health`, `/health/liveness`) unreachable
- ❌ Container marked as UNHEALTHY and terminated after repeated failures

### Root Cause

The lazy loading system in `litellm/__init__.py` creates a circular import deadlock:

```text
# Import chain that causes the deadlock:

1. proxy_server.py:56 → from litellm.utils import load_credentials_from_list

2. utils.py:56 → import litellm

3. litellm/__init__.py → Sets up __getattr__ for lazy loading

4. [later] Code accesses litellm.ModelResponse

5. __getattr__("ModelResponse") → _lazy_import_utils("ModelResponse")

6. Tries: from .utils import ModelResponse

7. DEADLOCK: utils.py is still being imported from step 2!
```

**Why this hangs:** When `proxy_server.py` imports from `litellm.utils` before `litellm` finishes initializing, Python starts loading `utils.py`. When `utils.py` imports `litellm`, the module sets up lazy loading via `__getattr__`. Later, when code accesses `litellm.ModelResponse`, the `__getattr__` handler tries to import from `utils` again, but `utils.py` is still being loaded from step 2, creating an infinite wait.
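
The cycle in the chain above can be reproduced in miniature. The following is a minimal sketch, not litellm's actual code: the package name `lazy_demo`, its one-name `__getattr__` hook, and the helper `try_import_cycle` are all hypothetical stand-ins. It builds a throwaway package on disk whose `__init__.py` resolves `ModelResponse` lazily from `.utils`, while `utils.py` touches that attribute during its own import. In this single-threaded toy CPython reports the cycle as an `ImportError`; the hang described above was observed in the real proxy, whose startup this sketch does not model.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical package __init__: a module-level __getattr__ hook that
# resolves ModelResponse on first access by importing it from .utils.
INIT_SRC = '''
def __getattr__(name):
    if name == "ModelResponse":
        from .utils import ModelResponse
        return ModelResponse
    raise AttributeError(name)
'''

# Hypothetical utils module: imports the package back and touches the lazy
# attribute while utils itself is still only partially initialized.
UTILS_SRC = '''
import lazy_demo
_early = lazy_demo.ModelResponse

class ModelResponse:
    pass
'''


def try_import_cycle():
    """Import lazy_demo.utils and return the ImportError it raises, if any."""
    with tempfile.TemporaryDirectory() as tmp:
        pkg = Path(tmp) / "lazy_demo"
        pkg.mkdir()
        (pkg / "__init__.py").write_text(INIT_SRC)
        (pkg / "utils.py").write_text(UTILS_SRC)
        sys.path.insert(0, tmp)
        importlib.invalidate_caches()
        try:
            importlib.import_module("lazy_demo.utils")
            return None
        except ImportError as exc:
            return exc
        finally:
            # Leave the interpreter as we found it.
            sys.path.remove(tmp)
            for name in [m for m in sys.modules if m.startswith("lazy_demo")]:
                del sys.modules[name]


error = try_import_cycle()
print(type(error).__name__, error)
```

The lazy hook is only safe once `utils` has finished loading; any attribute access that reaches it earlier re-enters the half-built module.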

### Affected Files

The circular dependency involves:
- `litellm/__init__.py` (added `__getattr__` lazy loading)
- `litellm/_lazy_imports.py` (new file, handles deferred imports)
- `litellm/utils.py` (imports `litellm` at line 56)
- `litellm/proxy/proxy_server.py` (imports from `litellm.utils` at line 56)

### Reproduction

1. Deploy litellm proxy with commit `56328e6535` or later
2. Start the container
3. Observe that migrations complete but Uvicorn never starts
4. Health checks fail → container terminated

OR locally:

```bash
# This will hang indefinitely with the lazy loading:
python -m litellm.proxy.proxy_cli --port 4000
```

### Workaround/Fix

The fix is to revert the lazy loading system and restore direct imports in `litellm/__init__.py`, **ensuring ALL functions imported by `proxy_server.py` are included**:

```python
# Replace lazy loading __getattr__ with direct imports:
from .utils import (
    client,
    exception_type,
    get_optional_params,
    # ... all other utils functions
    ModelResponse,
    ModelResponseStream,
    load_credentials_from_list,  # CRITICAL: Used by proxy_server.py:56
    _add_custom_logger_callback_to_specific_event,  # CRITICAL: Used by proxy_server.py:464
    # etc.
)

from .cost_calculator import completion_cost, cost_per_token, response_cost_calculator
from litellm.litellm_core_utils.litellm_logging import Logging, modify_integration

# Remove the __getattr__ function entirely
```

**Note**: The initial fix (commit `f6bc8d2f62`) was incomplete because it didn't include `load_credentials_from_list` and `_add_custom_logger_callback_to_specific_event` in the import list, causing the circular import to persist when `proxy_server.py` imported these functions.
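
A gap like the one in the initial fix could be caught by checking that every name a consumer needs is bound eagerly in the package namespace. The following is a minimal sketch under stated assumptions: the helper `missing_eager_names` and the stand-in module `fake_pkg` are hypothetical and are not part of litellm. It inspects `vars(module)` directly, so a module-level `__getattr__` hook cannot hide a name that was never pre-imported.

```python
import types


def missing_eager_names(module, required):
    """Return the names in `required` that are not bound in the module's namespace.

    vars(module) is inspected directly, so a lazy __getattr__ hook on the
    module is never triggered and cannot mask a missing eager import.
    """
    namespace = vars(module)
    return [name for name in required if name not in namespace]


# Hypothetical stand-in for a package whose __init__ pre-imports only some names.
fake_pkg = types.ModuleType("fake_pkg")
fake_pkg.ModelResponse = object
fake_pkg.load_credentials_from_list = lambda credentials: None

required = [
    "ModelResponse",
    "load_credentials_from_list",
    "_add_custom_logger_callback_to_specific_event",
]
missing = missing_eager_names(fake_pkg, required)
print(missing)
```

Run against the real package, the `required` list would hold the names that `proxy_server.py` imports at module level; that list is not reproduced here.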

### Related Commits

- **Breaking change:** `56328e6535` - PR #17171 (Dec 3, 2025)
- **Incomplete fix:** `f6bc8d2f62` - "fix(agentcore): remove lazy loading to resolve circular import deadlock" (missing 2 imports)
- **Complete fix:** TBD - Added `load_credentials_from_list` and `_add_custom_logger_callback_to_specific_event` to import list

### Why the Initial Fix Failed

The first fix (commit `f6bc8d2f62`) restored direct imports but was incomplete. `proxy_server.py` still ran this import at line 56:
```python
from litellm.utils import load_credentials_from_list
```

Since `load_credentials_from_list` wasn't in the pre-imported list in `litellm/__init__.py`, Python had to dynamically import it from `utils.py`, which triggered `import litellm` at `utils.py:56`, recreating the circular dependency.

### Impact

- **Severity:** Critical - Proxy server cannot start
- **Affected deployments:** All containerized deployments with health checks (ECS, Kubernetes, Docker)
- **Version:** All versions after Dec 3, 2025 (commit `56328e6535`)

### Suggested Solution

1. **Short-term:** Revert PR #17171, or apply the fix from commit `f6bc8d2f62` together with the two additional imports (`load_credentials_from_list` and `_add_custom_logger_callback_to_specific_event`)
2. **Long-term:** If lazy loading is desired for performance, restructure the imports to break the circular dependency:
   - Move the `import litellm` statement in `utils.py` to local imports where needed
   - OR ensure `proxy_server.py` doesn't import from `litellm.utils` before `litellm` is fully initialized
   - OR use a different lazy loading mechanism that doesn't rely on `__getattr__` during module initialization
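
The first long-term option, moving the package import into the functions that need it, can be sketched as follows. This is an illustration under assumptions, not litellm's code: the package `lazy_demo_fixed`, the function `build_response`, and the helper `load_and_call` are hypothetical. Because the back-reference to the package is resolved at call time rather than at import time, `utils` finishes loading before the lazy `__getattr__` hook is ever reached.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical package __init__ with the same kind of lazy hook as before.
INIT_SRC = '''
def __getattr__(name):
    if name == "ModelResponse":
        from .utils import ModelResponse
        return ModelResponse
    raise AttributeError(name)
'''

# Hypothetical utils module: the package import now lives inside the
# function body, so nothing touches the package while utils is loading.
UTILS_SRC = '''
class ModelResponse:
    pass

def build_response():
    import lazy_demo_fixed  # deferred: resolved when the function is called
    return lazy_demo_fixed.ModelResponse()
'''


def load_and_call():
    """Import the demo package, call build_response, and return the result's type name."""
    with tempfile.TemporaryDirectory() as tmp:
        pkg = Path(tmp) / "lazy_demo_fixed"
        pkg.mkdir()
        (pkg / "__init__.py").write_text(INIT_SRC)
        (pkg / "utils.py").write_text(UTILS_SRC)
        sys.path.insert(0, tmp)
        importlib.invalidate_caches()
        try:
            utils = importlib.import_module("lazy_demo_fixed.utils")
            return type(utils.build_response()).__name__
        finally:
            # Leave the interpreter as we found it.
            sys.path.remove(tmp)
            for name in [m for m in sys.modules if m.startswith("lazy_demo_fixed")]:
                del sys.modules[name]


result = load_and_call()
print(result)
```

The trade-off is that every function needing the package repeats the local import, which costs a dictionary lookup per call once the package is cached in `sys.modules`.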

### Environment

- Python version: 3.12.8
- Deployment: AWS ECS (also affects any containerized deployment)
- Base image: Chainguard Wolfi

---

**Note:** This issue was discovered when health checks failed in production ECS deployment. The fix has been tested and confirmed to resolve the issue.

litellm/__init__.py

Lines changed: 2 additions & 0 deletions
@@ -1102,6 +1102,8 @@ def add_known_models():
     get_provider_fields,
     ModelResponseListIterator,
     get_valid_models,
+    load_credentials_from_list,
+    _add_custom_logger_callback_to_specific_event,
 )

 from .llms.bytez.chat.transformation import BytezChatConfig
