
🐛 Bug Report: [Critical Bug] OpenAI streaming instrumentation crashes production with "unhashable type: 'list'" when using tool calls #3428

@zwanzigg

Which component is this bug for?

OpenAI Instrumentation

📜 Description

Severity: CRITICAL

The OpenAI instrumentation causes production crashes when using streaming chat completions with tool definitions.
The error occurs during metric recording with TypeError: unhashable type: 'list', causing the entire request to fail with a 500 error.

Environment

  • traceloop-sdk version: 0.47.4, 0.47.5 (bug present in both)
  • Python version: 3.13
  • OpenAI SDK version: 1.66.5
  • OpenTelemetry SDK version: 1.38.0
  • Framework: LangGraph with direct OpenAI client calls

Root Cause Analysis

  1. OpenAI tool definitions contain lists/arrays (e.g., "required": ["param1", "param2"])
  2. Instrumentation captures these in _shared_attributes() for metric recording
  3. OpenTelemetry metrics require hashable attributes (for aggregation keys via frozenset())
  4. Lists are not hashable → the TypeError crashes the streaming iterator (see the snippet after this list)
  5. Error propagates to user code → entire request fails with 500
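The mechanism is easy to demonstrate in isolation; the attribute key below is illustrative, not necessarily the exact key the instrumentation emits:

# OpenTelemetry's metric aggregation builds a frozenset over the
# attribute items; a tuple containing a list is unhashable.
attributes = {
    "gen_ai.request.model": "gpt-4",
    "llm.request.functions.0.parameters.required": ["location"],  # illustrative key
}

frozenset(attributes.items())
# TypeError: unhashable type: 'list'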

Why This is Critical

Production Impact

  • ✅ User's code is 100% correct
  • ❌ Observability library crashes production
  • ❌ No way to catch the error (happens in sync iterator)
  • ❌ Results in 500 errors for end users

Failed Workarounds

  • should_enrich_metrics=False - still crashes
  • span_postprocess_callback - too late; the error happens during streaming
  • block_instruments={Instruments.OPENAI} - the only working solution (loses all OpenAI telemetry); see the snippet after this list
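For reference, the blocking workaround looks like this; the Instruments import path is taken from the Traceloop SDK and should be double-checked against its docs:

from traceloop.sdk import Traceloop
from traceloop.sdk.instruments import Instruments

# Disables the OpenAI instrumentation entirely: no spans or metrics for
# OpenAI calls, but the streaming iterator no longer crashes.
Traceloop.init(
    app_name="test-app",
    block_instruments={Instruments.OPENAI},
)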

Proposed Fix

In opentelemetry/instrumentation/openai/shared/chat_wrappers.py, sanitize attributes before metric recording:

import json

def _shared_attributes(self):
    """Get attributes for metrics - sanitize unhashable types."""
    attrs = {
        # ... existing attributes
    }

    # Sanitize for metric recording
    sanitized = {}
    for key, value in attrs.items():
        if isinstance(value, (list, dict)):
            # Convert to JSON string for hashability
            try:
                sanitized[key] = json.dumps(value)
            except (TypeError, ValueError):
                sanitized[key] = str(value)
        else:
            sanitized[key] = value

    return sanitized
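Applied to the failing attributes from the demonstration above, the same sanitization logic makes them hashable (standalone sketch):

import json

attributes = {
    "gen_ai.request.model": "gpt-4",
    "llm.request.functions.0.parameters.required": ["location"],
}

sanitized = {
    key: json.dumps(value) if isinstance(value, (list, dict)) else value
    for key, value in attributes.items()
}

frozenset(sanitized.items())  # no longer raises; the list is now a JSON string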

Alternatively, wrap the metric recording in try/except:

import logging

logger = logging.getLogger(__name__)  # the real module may already define one

def _process_item(self, chunk):
    try:
        self._streaming_time_to_first_token.record(
            self._time_of_first_token - self._start_time,
            attributes=self._shared_attributes(),
        )
    except (TypeError, ValueError) as e:
        # Log but don't crash user code
        logger.warning(f"Failed to record metric: {e}")
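Catching only TypeError and ValueError keeps the except clause narrow: genuine programming errors still surface, while attribute-hashing failures are downgraded to a log warning instead of breaking the user's streaming loop.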

This bug makes Traceloop unusable in production for any agent using OpenAI tools.

👟 Reproduction steps

Minimal Reproduction

from traceloop.sdk import Traceloop
from openai import OpenAI

# Initialize Traceloop (any configuration)
Traceloop.init(
    app_name="test-app",
    should_enrich_metrics=False,  # Even with this disabled, still crashes!
)

# Setup OpenAI client
client = OpenAI(api_key="your-api-key")

# Make streaming call with tools - THIS CRASHES
response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "What's the weather?"}],
    tools=[
        {
            "type": "function",
            "function": {
                "name": "get_weather",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "location": {"type": "string"}
                    },
                    "required": ["location"]  # ← LIST causes crash
                }
            }
        }
    ],
    tool_choice="auto",
    stream=True,  # Only crashes with streaming
)

# Crash happens during iteration
for chunk in response:  # TypeError on first chunk with tool_calls
    if chunk.choices:
        print(chunk.choices[0].delta)

👍 Expected behavior

Instrumentation should:

  1. Never crash user code - fail gracefully or skip problematic metrics
  2. Sanitize attributes before recording - convert unhashable types to strings
  3. Handle errors defensively - log a warning and continue (a decorator sketch follows this list)
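One way to enforce point 3 across the instrumentation is a suppressing decorator. The sketch below is generic; the decorator name is hypothetical, not an existing helper in this codebase:

import logging
from functools import wraps

logger = logging.getLogger(__name__)

def never_crash_user_code(func):  # hypothetical name, for illustration
    """Run an instrumentation hook, but log and swallow any exception so
    it cannot propagate into the user's request path."""
    @wraps(func)
    def wrapper(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        except Exception:
            logger.warning("Instrumentation error in %s", func.__name__, exc_info=True)
            return None
    return wrapper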

👎 Actual Behavior with Screenshots

Complete Stack Trace

File "my_app.py", line 25, in main
    for chunk in response:
                 ^^^^^^^^
File "/venv/lib/python3.13/site-packages/opentelemetry/instrumentation/openai/shared/chat_wrappers.py", line 693, in __next__
    self._process_item(chunk)
    ~~~~~~~~~~~~~~~~~~^^^^^^^
File "/venv/lib/python3.13/site-packages/opentelemetry/instrumentation/openai/shared/chat_wrappers.py", line 718, in _process_item
    self._streaming_time_to_first_token.record(
        self._time_of_first_token - self._start_time,
        attributes=self._shared_attributes(),  # ← Problem: contains unhashable lists
    )
File "/venv/lib/python3.13/site-packages/opentelemetry/sdk/metrics/_internal/instrument.py", line 428, in record
    self._real_instrument.record(amount, attributes, context)
File "/venv/lib/python3.13/site-packages/opentelemetry/sdk/metrics/_internal/instrument.py", line 264, in record
    self._measurement_consumer.consume_measurement(...)
File "/venv/lib/python3.13/site-packages/opentelemetry/sdk/metrics/_internal/_view_instrument_match.py", line 105, in consume_measurement
    aggr_key = frozenset(attributes.items())  # ← Crash: can't hash lists
TypeError: unhashable type: 'list'

🤖 Python Version

3.13

📃 Provide any additional context for the Bug.

  • Non-streaming calls work fine (different code path)
  • Calls without tools work fine
  • Error occurs even with should_enrich_metrics=False
  • Only solution is block_instruments={Instruments.OPENAI}

👀 Have you spent some time to check if this bug has been raised before?

  • I checked and didn't find a similar issue

Are you willing to submit PR?

Yes, I am willing to submit a PR!
