Which component is this bug for?
OpenAI Instrumentation
📜 Description
Severity: CRITICAL
The OpenAI instrumentation causes production crashes when using streaming chat completions with tool definitions.
The error occurs during metric recording with TypeError: unhashable type: 'list', causing the entire request to fail with a 500 error.
Environment
- traceloop-sdk version: 0.47.4, 0.47.5 (bug present in both)
- Python version: 3.13
- OpenAI SDK version: 1.66.5
- OpenTelemetry SDK version: 1.38.0
- Framework: LangGraph with direct OpenAI client calls
Root Cause Analysis
- OpenAI tool definitions contain lists/arrays (e.g., `"required": ["param1", "param2"]`)
- Instrumentation captures these in `_shared_attributes()` for metric recording
- OpenTelemetry metrics require hashable attributes, since aggregation keys are built via `frozenset()` (demonstrated below)
- Lists are not hashable → `TypeError` crashes the streaming iterator
- Error propagates to user code → entire request fails with 500
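The mechanism is easy to confirm in isolation. A minimal sketch (the attribute key is made up for illustration, but the `frozenset()` call is the same one the SDK performs internally):

```python
# OpenTelemetry builds aggregation keys with frozenset(attributes.items());
# hashing a (key, value) tuple requires the value itself to be hashable.
attributes = {
    "gen_ai.request.model": "gpt-4",                   # string: hashable
    "llm.request.functions.0.required": ["location"],  # hypothetical key; list value is NOT hashable
}
frozenset(attributes.items())
# TypeError: unhashable type: 'list'
```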
Why This is Critical
Production Impact
- ✅ User's code is 100% correct
- ❌ Observability library crashes production
- ❌ No way to catch the error (happens in sync iterator)
- ❌ Results in 500 errors for end users
Failed Workarounds
- ❌ `should_enrich_metrics=False` - still crashes
- ❌ `span_postprocess_callback` - too late, the error happens during streaming
- ✅ `block_instruments={Instruments.OPENAI}` - the only working solution (loses all OpenAI telemetry); see the sketch below
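For reference, a minimal sketch of that last workaround, assuming `Instruments` is importable from `traceloop.sdk.instruments` as in current SDK versions:

```python
from traceloop.sdk import Traceloop
from traceloop.sdk.instruments import Instruments

Traceloop.init(
    app_name="test-app",
    # Disables the OpenAI instrumentor entirely: no OpenAI spans or metrics
    # are emitted, but streaming calls with tools no longer crash.
    block_instruments={Instruments.OPENAI},
)
```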
Proposed Fix
In `opentelemetry/instrumentation/openai/shared/chat_wrappers.py`, sanitize attributes before metric recording:
```python
import json  # needed for the sanitization below


def _shared_attributes(self):
    """Get attributes for metrics - sanitize unhashable types."""
    attrs = {
        # ... existing attributes
    }
    # Sanitize for metric recording
    sanitized = {}
    for key, value in attrs.items():
        if isinstance(value, (list, dict)):
            # Convert to JSON string for hashability
            try:
                sanitized[key] = json.dumps(value)
            except (TypeError, ValueError):
                sanitized[key] = str(value)
        else:
            sanitized[key] = value
    return sanitized
```
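As a quick sanity check (illustrative only, not part of the patch), the sanitized mapping survives the exact `frozenset()` call that currently crashes:

```python
import json

attrs = {"llm.request.functions.0.required": ["location"]}  # hypothetical key
sanitized = {
    k: json.dumps(v) if isinstance(v, (list, dict)) else v
    for k, v in attrs.items()
}
frozenset(sanitized.items())  # OK: the list is now the string '["location"]'
```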
Alternatively, wrap metric recording in try/except:

```python
import logging

logger = logging.getLogger(__name__)


def _process_item(self, chunk):
    try:
        self._streaming_time_to_first_token.record(
            self._time_of_first_token - self._start_time,
            attributes=self._shared_attributes(),
        )
    except (TypeError, ValueError) as e:
        # Log but don't crash user code
        logger.warning(f"Failed to record metric: {e}")
```

This bug makes Traceloop unusable in production for any agent using OpenAI tools.
👟 Reproduction steps
Minimal Reproduction
```python
from traceloop.sdk import Traceloop
from openai import OpenAI

# Initialize Traceloop (any configuration)
Traceloop.init(
    app_name="test-app",
    should_enrich_metrics=False,  # Even with this disabled, still crashes!
)

# Setup OpenAI client
client = OpenAI(api_key="your-api-key")

# Make streaming call with tools - THIS CRASHES
response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "What's the weather?"}],
    tools=[
        {
            "type": "function",
            "function": {
                "name": "get_weather",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "location": {"type": "string"}
                    },
                    "required": ["location"]  # ← LIST causes crash
                }
            }
        }
    ],
    tool_choice="auto",
    stream=True,  # Only crashes with streaming
)

# Crash happens during iteration
for chunk in response:  # TypeError on first chunk with tool_calls
    if chunk.choices:
        print(chunk.choices[0].delta)
```

👍 Expected behavior
Instrumentation should:
- Never crash user code - fail gracefully or skip problematic metrics
- Sanitize attributes before recording - convert unhashable types to strings
- Handle errors defensively - log a warning and continue
👎 Actual Behavior with Screenshots
Complete Stack Trace
File "my_app.py", line 25, in main
for chunk in response:
^^^^^^^^
File "/venv/lib/python3.13/site-packages/opentelemetry/instrumentation/openai/shared/chat_wrappers.py", line 693, in __next__
self._process_item(chunk)
~~~~~~~~~~~~~~~~~~^^^^^^^
File "/venv/lib/python3.13/site-packages/opentelemetry/instrumentation/openai/shared/chat_wrappers.py", line 718, in _process_item
self._streaming_time_to_first_token.record(
self._time_of_first_token - self._start_time,
attributes=self._shared_attributes(), # ← Problem: contains unhashable lists
)
File "/venv/lib/python3.13/site-packages/opentelemetry/sdk/metrics/_internal/instrument.py", line 428, in record
self._real_instrument.record(amount, attributes, context)
File "/venv/lib/python3.13/site-packages/opentelemetry/sdk/metrics/_internal/instrument.py", line 264, in record
self._measurement_consumer.consume_measurement(...)
File "/venv/lib/python3.13/site-packages/opentelemetry/sdk/metrics/_internal/_view_instrument_match.py", line 105, in consume_measurement
aggr_key = frozenset(attributes.items()) # ← Crash: can't hash lists
TypeError: unhashable type: 'list'🤖 Python Version
3.13
📃 Provide any additional context for the Bug.
- Non-streaming calls work fine (different code path; see the contrast snippet below)
- Calls without tools work fine
- Error occurs even with `should_enrich_metrics=False`
- Only working solution is `block_instruments={Instruments.OPENAI}`
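For contrast, a sketch of the same request without streaming, reusing the `client` and tool definition from the reproduction above; per the observations in this list, it completes normally:

```python
# Same model, messages, and tools as the streaming reproduction above,
# but with stream=False: this does not hit the streaming metrics path
# and returns without error despite the list in "required".
response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "What's the weather?"}],
    tools=[{
        "type": "function",
        "function": {
            "name": "get_weather",
            "parameters": {
                "type": "object",
                "properties": {"location": {"type": "string"}},
                "required": ["location"],  # same list, no crash here
            },
        },
    }],
    tool_choice="auto",
    stream=False,
)
print(response.choices[0].message)
```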
👀 Have you spent some time to check if this bug has been raised before?
- I checked and didn't find a similar issue
Are you willing to submit PR?
Yes I am willing to submit a PR!