
Commit 9d05839

Fix pegasus response and add doc
1 parent 2471602 commit 9d05839

File tree

4 files changed: +301 −3 lines changed


docs/my-website/docs/providers/bedrock.md

Lines changed: 127 additions & 0 deletions
@@ -1683,6 +1683,131 @@ curl --location 'http://0.0.0.0:4000/chat/completions' \
</TabItem>
</Tabs>

## TwelveLabs Pegasus - Video Understanding

TwelveLabs Pegasus 1.2 is a video understanding model that can analyze and describe video content. LiteLLM supports this model through Bedrock's `/invoke` endpoint.

| Property | Details |
|----------|---------|
| Provider Route | `bedrock/us.twelvelabs.pegasus-1-2-v1:0`, `bedrock/eu.twelvelabs.pegasus-1-2-v1:0` |
| Provider Documentation | [TwelveLabs Pegasus Docs ↗](https://docs.twelvelabs.io/docs/models/pegasus) |
| Supported Parameters | `max_tokens`, `temperature`, `response_format` |
| Media Input | S3 URI or base64-encoded video |

### Supported Features

- **Video Analysis**: Analyze video content from S3 or base64 input
- **Structured Output**: Support for JSON schema response format
- **S3 Integration**: Support for S3 video URLs with bucket owner specification

### Usage with S3 Video

<Tabs>
<TabItem value="sdk" label="SDK">

```python title="TwelveLabs Pegasus SDK Usage" showLineNumbers
from litellm import completion
import os

# Set AWS credentials
os.environ["AWS_ACCESS_KEY_ID"] = "your-aws-access-key"
os.environ["AWS_SECRET_ACCESS_KEY"] = "your-aws-secret-key"
os.environ["AWS_REGION_NAME"] = "us-east-1"

response = completion(
    model="bedrock/us.twelvelabs.pegasus-1-2-v1:0",
    messages=[{"role": "user", "content": "Describe what happens in this video."}],
    mediaSource={
        "s3Location": {
            "uri": "s3://your-bucket/video.mp4",
            "bucketOwner": "123456789012",  # 12-digit AWS account ID
        }
    },
    temperature=0.2,
)

print(response.choices[0].message.content)
```
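
`mediaSource` is a TwelveLabs-specific parameter: LiteLLM forwards it as-is into the Bedrock `/invoke` request body alongside the prompt.
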
</TabItem>

<TabItem value="proxy" label="Proxy">

**1. Add to config**

```yaml title="config.yaml" showLineNumbers
model_list:
  - model_name: pegasus-video
    litellm_params:
      model: bedrock/us.twelvelabs.pegasus-1-2-v1:0
      aws_access_key_id: os.environ/AWS_ACCESS_KEY_ID
      aws_secret_access_key: os.environ/AWS_SECRET_ACCESS_KEY
      aws_region_name: os.environ/AWS_REGION_NAME
```

**2. Start proxy**

```bash title="Start LiteLLM Proxy" showLineNumbers
litellm --config /path/to/config.yaml

# RUNNING at http://0.0.0.0:4000
```

**3. Test it!**

```bash title="Test Pegasus via Proxy" showLineNumbers
curl --location 'http://0.0.0.0:4000/chat/completions' \
  --header 'Authorization: Bearer sk-1234' \
  --header 'Content-Type: application/json' \
  --data '{
    "model": "pegasus-video",
    "messages": [
      {
        "role": "user",
        "content": "Describe what happens in this video."
      }
    ],
    "mediaSource": {
      "s3Location": {
        "uri": "s3://your-bucket/video.mp4",
        "bucketOwner": "123456789012"
      }
    },
    "temperature": 0.2
  }'
```

</TabItem>
</Tabs>

### Usage with Base64 Video

You can also pass video content directly as base64:

```python title="Base64 Video Input" showLineNumbers
from litellm import completion
import base64

# Read video file and encode to base64
with open("video.mp4", "rb") as video_file:
    video_base64 = base64.b64encode(video_file.read()).decode("utf-8")

response = completion(
    model="bedrock/us.twelvelabs.pegasus-1-2-v1:0",
    messages=[{"role": "user", "content": "What is happening in this video?"}],
    mediaSource={
        "base64String": video_base64
    },
    temperature=0.2,
)

print(response.choices[0].message.content)
```
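
Base64 encoding inflates the payload by roughly a third, so S3 input is generally the better choice for larger videos.
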
### Important Notes

- **Response Format**: The model supports structured output via `response_format` with a JSON schema, as shown in the sketch below.

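A minimal sketch of structured output, assuming the OpenAI-style `response_format` shape LiteLLM accepts elsewhere; the schema name and fields below are illustrative:

```python title="Structured Output with JSON Schema" showLineNumbers
from litellm import completion

response = completion(
    model="bedrock/us.twelvelabs.pegasus-1-2-v1:0",
    messages=[{"role": "user", "content": "Summarize this video as JSON."}],
    mediaSource={
        "s3Location": {
            "uri": "s3://your-bucket/video.mp4",
            "bucketOwner": "123456789012",
        }
    },
    response_format={
        "type": "json_schema",
        "json_schema": {
            "name": "video_summary",  # illustrative schema name
            "schema": {
                "type": "object",
                "properties": {
                    "summary": {"type": "string"},
                    "key_events": {"type": "array", "items": {"type": "string"}},
                },
            },
        },
    },
)

print(response.choices[0].message.content)
```

Under the hood, this commit's `_normalize_response_format` unwraps the nested `schema` into the `{"jsonSchema": {...}}` shape TwelveLabs expects.
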
## Provisioned throughput models
To use provisioned throughput Bedrock models pass
- `model=bedrock/<base-model>`, example `model=bedrock/anthropic.claude-v2`. Set `model` to any of the [Supported AWS models](#supported-aws-bedrock-models)
@@ -1743,6 +1868,8 @@ Here's an example of using a bedrock model with LiteLLM. For a complete list, re
| Meta Llama 2 Chat 70b | `completion(model='bedrock/meta.llama2-70b-chat-v1', messages=messages)` | `os.environ['AWS_ACCESS_KEY_ID']`, `os.environ['AWS_SECRET_ACCESS_KEY']`, `os.environ['AWS_REGION_NAME']` |
| Mistral 7B Instruct | `completion(model='bedrock/mistral.mistral-7b-instruct-v0:2', messages=messages)` | `os.environ['AWS_ACCESS_KEY_ID']`, `os.environ['AWS_SECRET_ACCESS_KEY']`, `os.environ['AWS_REGION_NAME']` |
| Mixtral 8x7B Instruct | `completion(model='bedrock/mistral.mixtral-8x7b-instruct-v0:1', messages=messages)` | `os.environ['AWS_ACCESS_KEY_ID']`, `os.environ['AWS_SECRET_ACCESS_KEY']`, `os.environ['AWS_REGION_NAME']` |
| TwelveLabs Pegasus 1.2 (US) | `completion(model='bedrock/us.twelvelabs.pegasus-1-2-v1:0', messages=messages, mediaSource={...})` | `os.environ['AWS_ACCESS_KEY_ID']`, `os.environ['AWS_SECRET_ACCESS_KEY']`, `os.environ['AWS_REGION_NAME']` |
| TwelveLabs Pegasus 1.2 (EU) | `completion(model='bedrock/eu.twelvelabs.pegasus-1-2-v1:0', messages=messages, mediaSource={...})` | `os.environ['AWS_ACCESS_KEY_ID']`, `os.environ['AWS_SECRET_ACCESS_KEY']`, `os.environ['AWS_REGION_NAME']` |

## Bedrock Embedding

litellm/llms/bedrock/chat/invoke_transformations/amazon_twelvelabs_pegasus_transformation.py

Lines changed: 149 additions & 2 deletions
@@ -5,16 +5,32 @@
 https://docs.twelvelabs.io/docs/models/pegasus
 """

-from typing import Any, Dict, List, Optional
+import json
+import time
+from typing import TYPE_CHECKING, Any, Dict, List, Optional

+import httpx
+
+import litellm
+from litellm._logging import verbose_logger
+from litellm.litellm_core_utils.core_helpers import map_finish_reason
 from litellm.llms.base_llm.base_utils import type_to_response_format_param
 from litellm.llms.base_llm.chat.transformation import BaseConfig
 from litellm.llms.bedrock.chat.invoke_transformations.base_invoke_transformation import (
     AmazonInvokeConfig,
 )
+from litellm.llms.bedrock.common_utils import BedrockError
 from litellm.types.llms.openai import AllMessageValues
+from litellm.types.utils import ModelResponse, Usage
 from litellm.utils import get_base64_str

+if TYPE_CHECKING:
+    from litellm.litellm_core_utils.litellm_logging import Logging as _LiteLLMLoggingObj
+
+    LiteLLMLoggingObj = _LiteLLMLoggingObj
+else:
+    LiteLLMLoggingObj = Any


 class AmazonTwelveLabsPegasusConfig(AmazonInvokeConfig, BaseConfig):
     """
@@ -53,7 +69,35 @@ def map_openai_params(
         return optional_params

     def _normalize_response_format(self, value: Any) -> Any:
+        """Normalize response_format to the TwelveLabs format.
+
+        TwelveLabs expects:
+            {
+                "jsonSchema": {...}
+            }
+
+        But the OpenAI format is:
+            {
+                "type": "json_schema",
+                "json_schema": {
+                    "name": "...",
+                    "schema": {...}
+                }
+            }
+        """
         if isinstance(value, dict):
+            # If it has a json_schema field, extract and transform it
+            if "json_schema" in value:
+                json_schema = value["json_schema"]
+                # Extract the schema if nested
+                if isinstance(json_schema, dict) and "schema" in json_schema:
+                    return {"jsonSchema": json_schema["schema"]}
+                # Otherwise use json_schema directly
+                return {"jsonSchema": json_schema}
+            # If it already has jsonSchema, return as is
+            if "jsonSchema" in value:
+                return value
+            # Otherwise return the dict as is
             return value
         return type_to_response_format_param(response_format=value) or value
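
A quick sketch of the mapping this helper performs (the schema content is illustrative):

```python
# OpenAI-style input (hypothetical schema)
openai_format = {
    "type": "json_schema",
    "json_schema": {
        "name": "video_schema",
        "schema": {"type": "object", "properties": {"summary": {"type": "string"}}},
    },
}

# _normalize_response_format unwraps the nested schema and drops the name:
# {"jsonSchema": {"type": "object", "properties": {"summary": {"type": "string"}}}}
```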

@@ -72,9 +116,18 @@ def transform_request(
         if media_source is not None:
             request_data["mediaSource"] = media_source

-        for key in ("temperature", "maxOutputTokens", "responseFormat"):
+        # Handle temperature and maxOutputTokens
+        for key in ("temperature", "maxOutputTokens"):
             if key in optional_params:
                 request_data[key] = optional_params.get(key)
+
+        # Handle responseFormat - transform to TwelveLabs format
+        if "responseFormat" in optional_params:
+            response_format = optional_params["responseFormat"]
+            transformed_format = self._normalize_response_format(response_format)
+            if transformed_format:
+                request_data["responseFormat"] = transformed_format
+
         return request_data

     def _build_media_source(self, optional_params: dict) -> Optional[dict]:
@@ -131,3 +184,97 @@ def _convert_messages_to_prompt(self, messages: List[AllMessageValues]) -> str:
             prompt_parts.append(f"{role}: {content}")
         return "\n".join(part for part in prompt_parts if part).strip()

+    def transform_response(
+        self,
+        model: str,
+        raw_response: httpx.Response,
+        model_response: ModelResponse,
+        logging_obj: LiteLLMLoggingObj,
+        request_data: dict,
+        messages: List[AllMessageValues],
+        optional_params: dict,
+        litellm_params: dict,
+        encoding: Any,
+        api_key: Optional[str] = None,
+        json_mode: Optional[bool] = None,
+    ) -> ModelResponse:
+        """
+        Transform TwelveLabs Pegasus response to LiteLLM format.
+
+        TwelveLabs response format:
+            {
+                "message": "...",
+                "finishReason": "stop" | "length"
+            }
+
+        LiteLLM format:
+            ModelResponse with choices[0].message.content and finish_reason
+        """
+        try:
+            completion_response = raw_response.json()
+        except Exception as e:
+            raise BedrockError(
+                message=f"Error parsing response: {raw_response.text}, error: {str(e)}",
+                status_code=raw_response.status_code,
+            )
+
+        verbose_logger.debug(
+            "twelvelabs pegasus response: %s",
+            json.dumps(completion_response, indent=4, default=str),
+        )
+
+        # Extract message content
+        message_content = completion_response.get("message", "")
+
+        # Extract finish reason and map to LiteLLM format
+        finish_reason_raw = completion_response.get("finishReason", "stop")
+        finish_reason = map_finish_reason(finish_reason_raw)
+
+        # Set the response content
+        try:
+            if (
+                message_content
+                and hasattr(model_response.choices[0], "message")
+                and getattr(model_response.choices[0].message, "tool_calls", None) is None
+            ):
+                model_response.choices[0].message.content = message_content  # type: ignore
+                model_response.choices[0].finish_reason = finish_reason
+            else:
+                raise Exception("Unable to set message content")
+        except Exception as e:
+            raise BedrockError(
+                message=f"Error setting response content: {str(e)}. Response: {completion_response}",
+                status_code=raw_response.status_code,
+            )
+
+        # Calculate usage from headers
+        bedrock_input_tokens = raw_response.headers.get(
+            "x-amzn-bedrock-input-token-count", None
+        )
+        bedrock_output_tokens = raw_response.headers.get(
+            "x-amzn-bedrock-output-token-count", None
+        )
+
+        prompt_tokens = int(
+            bedrock_input_tokens or litellm.token_counter(messages=messages)
+        )
+
+        completion_tokens = int(
+            bedrock_output_tokens
+            or litellm.token_counter(
+                text=model_response.choices[0].message.content,  # type: ignore
+                count_response_tokens=True,
+            )
+        )
+
+        model_response.created = int(time.time())
+        model_response.model = model
+        usage = Usage(
+            prompt_tokens=prompt_tokens,
+            completion_tokens=completion_tokens,
+            total_tokens=prompt_tokens + completion_tokens,
+        )
+        setattr(model_response, "usage", usage)
+
+        return model_response
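
A quick sketch of what the new `transform_response` produces from a typical invoke response (the header names come from the code above; the values are illustrative):

```python
# Illustrative raw Bedrock response for Pegasus
raw_body = {"message": "A chef plates a pasta dish.", "finishReason": "stop"}
raw_headers = {
    "x-amzn-bedrock-input-token-count": "4521",
    "x-amzn-bedrock-output-token-count": "12",
}

# After transform_response:
#   choices[0].message.content == "A chef plates a pasta dish."
#   choices[0].finish_reason   == "stop"
#   usage: prompt_tokens=4521, completion_tokens=12, total_tokens=4533
```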

litellm/llms/bedrock/chat/invoke_transformations/base_invoke_transformation.py

Lines changed: 22 additions & 0 deletions
@@ -250,6 +250,14 @@ def transform_request(
             ):  # completion(top_k=3) > anthropic_config(top_k=3) <- allows for dynamic variables to be passed in
                 inference_params[k] = v
             request_data = {"prompt": prompt, **inference_params}
+        elif provider == "twelvelabs":
+            return litellm.AmazonTwelveLabsPegasusConfig().transform_request(
+                model=model,
+                messages=messages,
+                optional_params=optional_params,
+                litellm_params=litellm_params,
+                headers=headers,
+            )
         else:
             raise BedrockError(
                 status_code=404,
@@ -321,6 +329,20 @@ def transform_response( # noqa: PLR0915
                 litellm_params=litellm_params,
                 encoding=encoding,
             )
+        elif provider == "twelvelabs":
+            return litellm.AmazonTwelveLabsPegasusConfig().transform_response(
+                model=model,
+                raw_response=raw_response,
+                model_response=model_response,
+                logging_obj=logging_obj,
+                request_data=request_data,
+                messages=messages,
+                optional_params=optional_params,
+                litellm_params=litellm_params,
+                encoding=encoding,
+                api_key=api_key,
+                json_mode=json_mode,
+            )
         elif provider == "ai21":
             outputText = (
                 completion_response.get("completions")[0].get("data").get("text")

tests/test_litellm/llms/bedrock/chat/invoke_transformations/test_twelvelabs_pegasus_transformation.py

Lines changed: 3 additions & 1 deletion
@@ -38,7 +38,9 @@ def test_map_openai_params_translates_fields():
     assert optional_params["maxOutputTokens"] == 20
     assert optional_params["temperature"] == 0.6
     assert "responseFormat" in optional_params
-    assert optional_params["responseFormat"]["json_schema"]["name"] == "video_schema"
+    # TwelveLabs format: responseFormat contains jsonSchema directly (not json_schema)
+    assert "jsonSchema" in optional_params["responseFormat"]
+    assert optional_params["responseFormat"]["jsonSchema"]["type"] == "object"


 def test_transform_request_includes_base64_media():
