Handle HTTPX UTF-8 decoding errors #882

evantahler · 2024-11-14T19:23:28Z

Hello! Thank you for the vcr library - It makes testing our multi-service application /possible/.

In one of our tests, we want to ensure that application A can upload a file to application B and get some data back. We do something like this in our code:

files = {"file": ("my_file.pdf", open("my_file.pdf", "rb"))}

async with httpx.AsyncClient() as client:
  response = await client.post(
      "https://my-upload-service.com/api/post",
      json=request_payload,
      files=files,
  )

Recording this interaction with VCR throws an error because the PDF is a binary file, so its bytes can't be decoded as UTF-8:

httpx_request = <Request('POST', 'http://parser:changeme123@localhost:8200/parser/api/v1/parse')>, kwargs = {}

    def _make_vcr_request(httpx_request, **kwargs):
>       body = httpx_request.read().decode("utf-8")
E       UnicodeDecodeError: 'utf-8' codec can't decode byte 0xc7 in position 154: invalid continuation byte

As all the existing vcr filters require the request to be parsed so that we can inspect the body/headers/etc, they won't help us here. The assumption that most requests are UTF-8 decodable makes perfect sense, and this is a bit of a weird edge case. So, I'd like to keep the existing behavior as much as possible, but in the case of a UnicodeDecodeError, let's try decoding again and drop any bytes that are causing trouble. In our case, it didn't make a meaningful difference to the cassette recording.
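To make the trade-off concrete, here is a minimal sketch of what the fallback does at the bytes level. The `raw_body` value is a made-up stand-in for a binary upload, not the actual PDF from our test; it only needs to contain one byte sequence that is invalid UTF-8.

```python
# Hypothetical stand-in for a binary request body. 0xC7 starts a two-byte
# UTF-8 sequence, but 0x00 is not a valid continuation byte.
raw_body = b"%PDF-1.7\n\xc7\x00binary payload"

# Strict decoding (the current behavior) fails on the first invalid byte.
try:
    raw_body.decode("utf-8")
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False

# errors="ignore" silently drops the undecodable bytes instead of raising.
lossy_body = raw_body.decode("utf-8", errors="ignore")

print(strict_ok)                       # strict decoding did not succeed
print(len(raw_body), len(lossy_body))  # the lossy string is shorter
```

The recorded body is therefore no longer byte-for-byte identical to what was sent, which is why the fallback should warn rather than stay silent.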

evantahler · 2024-11-16T22:14:46Z

For the moment, we've gone with a monkeypatching approach:

import warnings

import vcr  # type: ignore[import-untyped]
from vcr.request import Request as VcrRequest  # type: ignore[import-untyped]
from vcr.stubs.httpx_stubs import (  # type: ignore
    _make_vcr_request,  # noqa: F401 importing from the submodule ensures vcr.stubs.httpx_stubs is loaded before it is patched below
)


def _fixed__make_vcr_request(  # type: ignore
    httpx_request,
    **kwargs,  # noqa: ARG001
) -> VcrRequest:
    try:
        body = httpx_request.read().decode("utf-8")
    except UnicodeDecodeError as e:
        body = httpx_request.read().decode("utf-8", errors="ignore")
        warnings.warn(
            f"Could not decode full request payload as UTF8, recording may have lost bytes. {e}",
            stacklevel=2,
        )
    uri = str(httpx_request.url)
    headers = dict(httpx_request.headers)
    return VcrRequest(httpx_request.method, uri, body, headers)


vcr.stubs.httpx_stubs._make_vcr_request = _fixed__make_vcr_request
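A quick way to sanity-check the fallback without vcr or httpx installed is to run the same try/except shape against a stand-in object. This is only a sketch: `FakeRequest` and `decode_body` are hypothetical names introduced here for illustration, and `FakeRequest` mimics nothing beyond the `read()` call the patch relies on.

```python
import warnings


class FakeRequest:
    """Hypothetical stand-in exposing only the read() method the patch uses."""

    def __init__(self, body: bytes) -> None:
        self._body = body

    def read(self) -> bytes:
        return self._body


def decode_body(request) -> str:
    """Same decode-then-fallback logic as the patch, minus the vcr types."""
    try:
        return request.read().decode("utf-8")
    except UnicodeDecodeError as e:
        warnings.warn(
            f"Could not decode full request payload as UTF8, recording may have lost bytes. {e}",
            stacklevel=2,
        )
        return request.read().decode("utf-8", errors="ignore")


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    # Valid UTF-8 goes through the normal path and emits no warning.
    text_body = decode_body(FakeRequest(b'{"ok": true}'))
    # Invalid UTF-8 takes the fallback path and emits exactly one warning.
    binary_body = decode_body(FakeRequest(b"\xc7\x00PDF bytes"))

print(len(caught), repr(text_body), repr(binary_body))
```

Text payloads are unaffected, and only the binary payload triggers the warning, which matches the goal of keeping the existing behavior for the common case.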

kevin1024 · 2025-12-05T18:24:00Z

Closing this PR as v8.0.0 included a complete rewrite of httpx support (now patching httpcore instead of httpx). The code path this PR was modifying has changed significantly. If you're still experiencing UTF-8 decoding issues with httpx on v8.0.0, please open a new issue and we can revisit. Thanks for the contribution!

evantahler added 2 commits November 14, 2024 19:18

Handle HTTPX UTF-8 decoding errors

abaeb7b

Update httpx_stubs.py

b03180f

kevin1024 closed this Dec 5, 2025
