Description
When using generate_content_stream with image generation models (e.g. gemini-2.0-flash-preview-image-generation), the SDK raises ValueError: Chunk too big if the response contains base64-encoded image data that exceeds aiohttp's internal _high_water buffer limit (~128KB).
This makes generate_content_stream unusable for image generation use cases.
Context / Motivation
We switched from generate_content to generate_content_stream to mitigate a separate issue: long-running image generation requests (3-8 minutes) leave the TCP connection idle until the server resets it mid-response, causing:

- `TransferEncodingError: 400, message='Not enough data to satisfy transfer length header.'`
- `ConnectionResetError(104, 'Connection reset by peer')`
This TransferEncodingError is raised by aiohttp's chunked transfer encoding parser (aiohttp/http_parser.py:feed_eof()) when the connection is closed before all chunks are received. The 400 here is aiohttp's internal BadHttpMessage.code attribute, not an HTTP 400 status from the API.
We attempted to use streaming to keep the connection active during long image generation tasks, but hit this new issue instead.
Steps to Reproduce
```python
import asyncio

from google import genai
from google.genai import types

client = genai.Client(vertexai=True, project="your-project", location="global")

async def main() -> None:
    with open("garment.jpg", "rb") as f:  # any reference image
        image_bytes = f.read()

    # Use streaming with an image generation model
    response_stream = await client.aio.models.generate_content_stream(
        model="gemini-2.0-flash-preview-image-generation",
        contents=types.Content(
            role="user",
            parts=[
                types.Part.from_bytes(data=image_bytes, mime_type="image/jpeg"),
                types.Part.from_text(text="Generate a flat lay packshot of this garment on a white background"),
            ],
        ),
        config=types.GenerateContentConfig(
            response_modalities=["IMAGE"],
            image_config=types.ImageConfig(),
        ),
    )

    chunks = []
    async for chunk in response_stream:
        chunks.append(chunk)  # ValueError: Chunk too big

asyncio.run(main())
```
Error
```
ValueError: Chunk too big
```

Traceback points to aiohttp/streams.py:388:

```python
# aiohttp/streams.py
async def readuntil(self, separator=b"\n"):
    ...
    if chunk_size > self._high_water:  # limit * 2 = 65536 * 2 = 131072 bytes
        raise ValueError("Chunk too big")
```
Root Cause
The streaming implementation internally uses aiohttp's readuntil() to parse SSE lines. Image generation responses contain base64-encoded image data (typically several MB) in a single SSE event/line. This exceeds aiohttp's _high_water buffer limit (default ~128KB), which is designed for text streaming where individual lines/tokens are small.
generate_content (non-streaming) does not hit this issue because it consumes the whole body with read(), which has no per-chunk size limit, rather than line-by-line with readuntil().
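The same failure mode can be demonstrated with the stdlib asyncio.StreamReader, whose readuntil() enforces an analogous per-line limit (it raises LimitOverrunError rather than aiohttp's ValueError, but the mechanism — a line larger than the buffer limit — is the same):

```python
import asyncio

async def read_one_line(payload: bytes, limit: int = 2**16) -> bytes:
    """Read a single newline-terminated line, as an SSE parser would."""
    reader = asyncio.StreamReader(limit=limit)
    reader.feed_data(payload)
    reader.feed_eof()
    return await reader.readuntil(b"\n")

async def read_all(payload: bytes, limit: int = 2**16) -> bytes:
    """Read the whole body with read(), which has no per-line limit."""
    reader = asyncio.StreamReader(limit=limit)
    reader.feed_data(payload)
    reader.feed_eof()
    return await reader.read(-1)

payload = b"A" * 200_000  # one "SSE line" larger than the 64 KiB limit

try:
    asyncio.run(read_one_line(payload))
    line_limited = False
except asyncio.LimitOverrunError:
    line_limited = True

print("readuntil hit the limit:", line_limited)  # -> True
print("read() consumed:", len(asyncio.run(read_all(payload))), "bytes")  # -> 200000 bytes
```

This mirrors why generate_content (read-based) succeeds where generate_content_stream (readuntil-based) fails on multi-megabyte base64 image lines.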
Environment
- google-genai: 1.53.0
- aiohttp: 3.13.0
- Python: 3.12.12
- Platform: Linux (Ubuntu)
- Vertex AI mode with service account credentials
Expected Behavior
generate_content_stream should handle image generation responses without raising ValueError: Chunk too big. Possible approaches:
- Increase the internal aiohttp buffer limit for streaming connections that may carry large payloads (image data)
- Use read() instead of readuntil() for parsing large SSE events
- Document that generate_content_stream is not compatible with image generation models and should not be used for response_modalities=["IMAGE"]
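On the first approach: aiohttp's ClientSession already accepts a read_bufsize argument that sets the response StreamReader limit (default 2**16; _high_water is limit * 2). A sketch of the relevant knob, on plain aiohttp:

```python
import asyncio
import aiohttp

# read_bufsize sets the response StreamReader limit; since _high_water is
# limit * 2, 8 MiB here would allow SSE lines up to ~16 MiB before
# "Chunk too big" is raised.
READ_BUFSIZE = 8 * 1024 * 1024

async def main() -> bool:
    async with aiohttp.ClientSession(read_bufsize=READ_BUFSIZE) as session:
        return not session.closed  # session usable with the larger buffer

print(asyncio.run(main()))  # -> True
```

If the installed google-genai version passes constructor arguments through to its internal aiohttp session (e.g. via types.HttpOptions(async_client_args={"read_bufsize": READ_BUFSIZE}) — an assumption; verify against your SDK version), the same setting could serve as a user-side workaround until the SDK raises the limit itself.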
Related
This also affects users who want to use streaming to avoid idle connection timeouts during long-running image generation requests (3-8 min). Currently there is no workaround: generate_content risks TransferEncodingError on long requests, and generate_content_stream fails with Chunk too big on image responses.