WSGIWrapper violates WSGI protocol

# TL;DR

There have been, and still are, several bugs in how `hypercorn` handles WSGI edge cases, because WSGI is incredibly complicated. I've opened a fix PR with lots of new tests.

# Overview

Between `v0.17.3` and `v0.18.0`, the `WSGIWrapper` class was reworked to make it more compatible with the WSGI specification, [PEP-3333](https://peps.python.org/pep-3333/), and in particular with the timing of calls made to [start_response](https://peps.python.org/pep-3333/#the-start-response-callable). The issue was raised in #320 and a fix was implemented in #321. Unfortunately, that implementation is closer to correct but still violates the WSGI protocol in a way that broke our existing (and valid) applications. While investigating our own broken CI, I found multiple WSGI violations that can be fixed with some slight tweaks. I'll outline all 4 relevant bugs with very simple examples that exercise the problems.

## The WSGI Specification

An excerpt from the WSGI specification, emphasis mine. I'll reference specific lines of this later.

> However, the start_response callable **must not actually transmit the response headers**. Instead, it must store them for the server or gateway to transmit only **after the first iteration** of the application return value that yields a **non-empty bytestring**, or upon the application’s first invocation of the write() callable. In other words, response headers must not be sent until there is actual body data available, **or until the application’s returned iterable is exhausted**. (The only possible exception to this rule is if the response headers explicitly include a Content-Length of zero.)

> This delaying of response header transmission is to ensure that buffered and asynchronous **applications can replace their originally intended output with error output**, up until the last possible moment. For example, the application may need to change the response status from "200 OK" to "500 Internal Error", if an error occurs while the body is being generated within an application buffer.

> The exc_info argument, if supplied, must be a Python sys.exc_info() tuple. This argument should be supplied by the application only if start_response is being called by an error handler. If exc_info is supplied, and no HTTP headers have been output yet, start_response should replace the currently-stored HTTP response headers with the newly-supplied ones, thus allowing the application to "change its mind" about the output when an error has occurred.

> However, if exc_info is provided, and the HTTP headers have already been sent, start_response must raise an error, and should re-raise using the exc_info tuple. That is: `raise exc_info[1].with_traceback(exc_info[2])`

> ...

> The application **may call start_response more than once**, if and only if the exc_info argument is provided. More precisely, it is a fatal error to call start_response without the exc_info argument if start_response has already been called within the current invocation of the application. This includes the case where the first call to start_response raised an error. (See the example CGI gateway above for an illustration of the correct logic.)

## Bug 1: Generator Applications

This is the bug that was fixed in #321: WSGI applications don't need to call `start_response` at the moment you "call" them. They can be things like classes and generators, which are initialized rather than executed like a function. These won't run any of the contained code until you start to iterate on them. Because of this, you can't transmit headers until you've started iterating on the return value from calling the application.
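A minimal sketch of that laziness, independent of `hypercorn`. The names `recording_start_response` and `generator_app` are made up for illustration; the point is only that invoking a generator-based app runs none of its body.

```python
from typing import Callable, Iterator

# Every status passed to start_response is recorded here.
calls: list[str] = []


def recording_start_response(status: str, headers: list, exc_info=None) -> None:
    """Hypothetical start_response that only records the status it was given."""
    calls.append(status)


def generator_app(environ: dict, start_response: Callable) -> Iterator[bytes]:
    start_response("200 OK", [])
    yield b"Hello"


# Invoking the app only creates the generator object; the body has not run.
result = generator_app({}, recording_start_response)
calls_after_invocation = list(calls)

# The first iteration runs the body up to the first yield,
# which is when start_response is finally called.
first_chunk = next(result)
calls_after_first_chunk = list(calls)
```

Here `calls_after_invocation` is empty and `calls_after_first_chunk` holds `"200 OK"`, so a server that inspects the stored status right after invoking the app sees nothing yet.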

Unfortunately, the fix for this in #321 caused Bug 2. We need to avoid regressing on Bug 1 while we fix that.

The following example is a very simple generator that is a valid WSGI app, and crashes `hypercorn<=0.17.3` but not `hypercorn==0.18.0`.

```python
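# Imports shared by the examples in this issue
import sys
from collections.abc import Callable, Generator


def wsgi_app_generator(environ: dict, start_response: Callable) -> Generator[bytes, None, None]: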
    """
    A synchronous generator usable as a valid WSGI Application.

    Notably, the WSGI specification ensures only that start_response() is called
    before the first item is returned from the iterator. It does not have to
    be immediately called when app(environ, start_response) is called.

    Using a generator for a WSGI app will delay calling start_response() until after
    something begins iterating on it, so only invoking the app and not iterating on
    the returned iterable will not be sufficient to get the status code and headers.

    It is also valid to send multiple chunks of data, but the status code and headers
    must be sent before the first non-empty chunk of body data is sent.

    Therefore it is not valid to send the status code and headers before iterating on
    the returned generator. It is only valid to send status code and headers during
    iteration of the generator, immediately after the first non-empty byte
    string is returned, but before continuing to iterate further.
    """
    start_response("200 OK", [("X-Test-Header", "Test-Value")])
    yield b"Hello, "
    yield b"world!"

    # hypercorn==0.18.0 works with no issues
    #
    # hypercorn==0.17.3 produces the following traceback, which was fixed in hypercorn==0.18.0
    #
    # Traceback (most recent call last):
    # File ".../python3.13/site-packages/hypercorn/asyncio/task_group.py", line 27, in _handle
    #     await app(scope, receive, send, sync_spawn, call_soon)
    # File ".../python3.13/site-packages/hypercorn/app_wrappers.py", line 51, in __call__
    #     await self.handle_http(scope, receive, send, sync_spawn, call_soon)
    # File ".../python3.13/site-packages/hypercorn/app_wrappers.py", line 83, in handle_http
    #     await sync_spawn(self.run_app, environ, partial(call_soon, send))
    # File ".../python3.13/concurrent/futures/thread.py", line 59, in run
    #     result = self.fn(*self.args, **self.kwargs)
    # File ".../python3.13/site-packages/hypercorn/app_wrappers.py", line 109, in run_app
    #     raise RuntimeError("WSGI app did not call start_response")
    # RuntimeError: WSGI app did not call start_response
```

## Bug 2: HEAD Requests

There exist less common HTTP methods like HEAD which do not return a response body. When implementing a WSGI app that supports these methods, it's common to return an empty iterable. An empty list suffices, as would a generator which never reaches any yield statements before exiting. The important information is all sent in the headers and status code, via `start_response`.

The fix implemented in #321 moved the transmission of headers to inside the `for` loop, which unfortunately gets passed over if there are 0 elements in the application return value.

> In other words, response headers must not be sent until there is actual body data available, **or until the application’s returned iterable is exhausted**.

The correct behavior involves checking if the response headers have been sent after the `for` loop exhausts the iterable, and if they haven't yet been sent, sending them right at the end of the function. You can't move the logic from the for loop to the end, because applications that do return a non-empty body need the headers to be sent before we start sending that body. Therefore, you have to actually check in 2 distinct places and duplicate the logic for sending headers.
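The two checks can be sketched roughly as follows. This is not `hypercorn`'s code: `run_wsgi_app` and its event tuples are hypothetical stand-ins for the real ASGI `send` calls, and the loop simply skips empty chunks.

```python
from typing import Callable, Optional


def run_wsgi_app(app: Callable, environ: dict) -> list[tuple]:
    """Hypothetical server loop; returns the events it would have sent."""
    events: list[tuple] = []
    stored: Optional[tuple[str, list]] = None
    headers_sent = False

    def start_response(status, headers, exc_info=None):
        # Only store the values; never transmit from here.
        nonlocal stored
        stored = (status, headers)

    def send_headers() -> None:
        nonlocal headers_sent
        if stored is None:
            raise RuntimeError("WSGI app did not call start_response")
        events.append(("start", stored[0], stored[1]))
        headers_sent = True

    result = app(environ, start_response)
    try:
        for chunk in result:
            if not chunk:
                continue  # empty chunks carry no body data
            if not headers_sent:
                send_headers()  # check 1: before the first body chunk
            events.append(("body", chunk))
        if not headers_sent:
            send_headers()  # check 2: the iterable ended without any body
    finally:
        # PEP 3333 requires calling close() on the result if it exists.
        close = getattr(result, "close", None)
        if close is not None:
            close()
    events.append(("end",))
    return events
```

With an app that returns `[]`, the loop body never runs and only the second check emits the `("start", ...)` event. With an app that yields body data, the first check emits it and the second is skipped, so the headers are never sent twice.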

The following examples, a function and a generator, are both valid WSGI apps that crash `hypercorn==0.18.0`. The function works on `hypercorn<=0.17.3`, while the generator with no body crashes both versions, but for different reasons.

```python
def wsgi_app_no_body(environ: dict, start_response: Callable) -> list[bytes]:
    """
    A WSGI Application that does not yield up any body chunks when iterated on.

    This is most common when supporting HTTP methods such as HEAD, which is identical
    to GET except that the server MUST NOT return a message body in the response.

    The iterable returned by this app will have no contents, immediately exiting
    any for loops attempting to iterate on it. Even though no body was returned
    from the application, this is still a valid HTTP request and MUST send the
    status code and headers as the response. Failing to do so violates the
    WSGI, ASGI, and HTTP specifications.

    Therefore, the status code and headers must be sent after the iteration completes,
    as it is not valid to send them only during iteration. If headers are only sent
    within the body of the for loop, this application will cause the server to fail
    to send this information at all. However, care must be taken to check
    whether the status code and headers were already sent during the iteration process,
    as they may have been sent during the iteration process for applications with
    non-empty bodies. If this isn't accounted for they will be sent twice in error.
    """
    start_response("200 OK", [("X-Test-Header", "Test-Value")])
    return []

    # hypercorn==0.17.3 works with no issues
    #
    # hypercorn==0.18.0 produces the following traceback
    #
    # Traceback (most recent call last):
    # File ".../python3.13/site-packages/hypercorn/asyncio/task_group.py", line 28, in _handle
    #     await app(scope, receive, send, sync_spawn, call_soon)
    # File ".../python3.13/site-packages/hypercorn/app_wrappers.py", line 51, in __call__
    #     await self.handle_http(scope, receive, send, sync_spawn, call_soon)
    # File ".../python3.13/site-packages/hypercorn/app_wrappers.py", line 84, in handle_http
    #     await send({"type": "http.response.body", "body": b"", "more_body": False})
    # File ".../python3.13/site-packages/hypercorn/protocol/http_stream.py", line 241, in app_send
    #     raise UnexpectedMessageError(self.state, message["type"])
    # hypercorn.utils.UnexpectedMessageError: Unexpected message type, http.response.body given the state ASGIHTTPState.REQUEST


def wsgi_app_generator_no_body(environ: dict, start_response: Callable) -> Generator[bytes, None, None]:
    """
    A synchronous generator usable as a valid WSGI Application, which
    does not yield up any body chunks when iterated on.

    This is a very complicated edge case. It is most commonly found when building a
    generator based WSGI app with support for HTTP methods such as HEAD, which is
    identical to GET except that the server MUST NOT return a message body in the response.

    1. The application is subject to the same delay in calling start_response until
    after the server has begun iterating on the returned generator object.

    2. The status code and headers are also not available during iteration, as the
    empty generator will immediately end any for loops that attempt to iterate on it.

    3. Even though no body was returned from the application, this is still a valid
    HTTP request and MUST send the status code and headers as the response. Failing
    to do so violates the WSGI, ASGI, and HTTP specifications.

    Therefore, the status code and headers must be sent after the iteration completes,
    as it is not valid to send them only during iteration. If headers are only sent
    within the body of the for loop, this application will cause the server to fail
    to send this information at all. However, care must be taken to check
    whether the status code and headers were already sent during the iteration process,
    as they may have been sent during the iteration process for applications with
    non-empty bodies. If this isn't accounted for they will be sent twice in error.
    """
    start_response("200 OK", [("X-Test-Header", "Test-Value")])
    if False:
        yield b""  # Unreachable yield makes this an empty generator  # noqa

    # hypercorn==0.18.0 produces the following traceback
    #
    # Traceback (most recent call last):
    # File ".../python3.13/site-packages/hypercorn/asyncio/task_group.py", line 28, in _handle
    #     await app(scope, receive, send, sync_spawn, call_soon)
    # File ".../python3.13/site-packages/hypercorn/app_wrappers.py", line 51, in __call__
    #     await self.handle_http(scope, receive, send, sync_spawn, call_soon)
    # File ".../python3.13/site-packages/hypercorn/app_wrappers.py", line 84, in handle_http
    #     await send({"type": "http.response.body", "body": b"", "more_body": False})
    # File ".../python3.13/site-packages/hypercorn/protocol/http_stream.py", line 241, in app_send
    #     raise UnexpectedMessageError(self.state, message["type"])
    # hypercorn.utils.UnexpectedMessageError: Unexpected message type, http.response.body given the state ASGIHTTPState.REQUEST

    # hypercorn==0.17.3 produces the following traceback, which differs from the above
    #
    # Traceback (most recent call last):
    # File ".../python3.13/site-packages/hypercorn/asyncio/task_group.py", line 27, in _handle
    #     await app(scope, receive, send, sync_spawn, call_soon)
    # File ".../python3.13/site-packages/hypercorn/app_wrappers.py", line 51, in __call__
    #     await self.handle_http(scope, receive, send, sync_spawn, call_soon)
    # File ".../python3.13/site-packages/hypercorn/app_wrappers.py", line 83, in handle_http
    #     await sync_spawn(self.run_app, environ, partial(call_soon, send))
    # File ".../python3.13/concurrent/futures/thread.py", line 59, in run
    #     result = self.fn(*self.args, **self.kwargs)
    # File ".../python3.13/site-packages/hypercorn/app_wrappers.py", line 109, in run_app
    #     raise RuntimeError("WSGI app did not call start_response")
    # RuntimeError: WSGI app did not call start_response
```

## Bug 3: Multiple Delayed start_response Calls

In WSGI it's valid to call `start_response` a 2nd or 3rd time to replace the status code and headers with an error if your application encounters an error while handling the request. This is only allowed if the headers have not yet been sent.

> This delaying of response header transmission is to ensure that buffered and asynchronous **applications can replace their originally intended output with error output**, up until the last possible moment. For example, the application may need to change the response status from "200 OK" to "500 Internal Error", if an error occurs while the body is being generated within an application buffer.

> ...

> The application **may call start_response more than once**, if and only if the exc_info argument is provided.

In `hypercorn==0.17.3` the headers were sent immediately after calling the application, meaning that this wasn't possible to do while iterating on the application return value. The fix in #321 moved this logic further into the process, which was closer to the right moment, but missed one key detail from the WSGI spec.

> It (the start_response callable) must store them (the headers and status code) for the server or gateway to transmit only **after the first iteration** of the application return value that yields a **non-empty bytestring**

This is a subtle detail that can make the server incompatible with a valid WSGI application. A WSGI application reserves the right to "change its mind" up until it returns a non-empty bytestring, but `hypercorn==0.18.0` sends the headers as soon as it receives any chunk, even when the first chunk (or the first several chunks) is `b""`. To be fully compliant, the conditional inside the `for` loop that iterates on the application return value **must** check that the chunk yielded by the iterable is non-empty.
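To make the difference concrete, here is a small sketch that compares the two commit rules. `committed_status` and its `require_non_empty` flag are hypothetical and only model the moment at which a server would lock in the status; they are not part of `hypercorn`.

```python
import sys
from typing import Callable, Iterator, Optional


def delayed_app(environ: dict, start_response: Callable) -> Iterator[bytes]:
    start_response("200 OK", [])
    yield b""  # empty chunk: headers must not be committed yet
    try:
        raise ValueError
    except ValueError:
        start_response("500 Internal Server Error", [], exc_info=sys.exc_info())
    yield b"Hello"


def committed_status(app: Callable, require_non_empty: bool) -> Optional[str]:
    """Return the status a server would send under the given commit rule."""
    stored: Optional[str] = None

    def start_response(status, headers, exc_info=None):
        nonlocal stored
        stored = status

    result = app({}, start_response)
    try:
        for chunk in result:
            # Commit on any chunk, or only on a non-empty one.
            if chunk or not require_non_empty:
                return stored
        return stored
    finally:
        result.close()


status_any_chunk = committed_status(delayed_app, require_non_empty=False)
status_non_empty = committed_status(delayed_app, require_non_empty=True)
```

Committing on any chunk locks in `"200 OK"` before the app has a chance to replace it, while waiting for a non-empty chunk yields `"500 Internal Server Error"`.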

The following example is a very simple generator that is a valid WSGI app, but sends the incorrect header information on `hypercorn==0.18.0`. It also crashes `hypercorn<=0.17.3` because it is a generator (Bug 1), which is uninteresting here.

```python
def wsgi_app_generator_delayed_start_response(environ: dict, start_response: Callable) -> Generator[bytes, None, None]:
    """
    A synchronous generator usable as a valid WSGI Application, which calls start_response
    a second time after yielding up empty chunks of body.
    
    This application exercises the ability for WSGI apps to change their status code
    right up until the last possible second before the first non-empty chunk of body is
    sent. The status code and headers must be buffered until the first non-empty chunk of body
    is yielded by this generator, and should be overwritable until that time.
    """
    # Initial 200 OK status that will be overwritten before any non-empty chunks of body are sent
    start_response("200 OK", [("X-Test-Header", "Old-Value")])
    yield b""

    try:
        raise ValueError
    except ValueError:
        # start_response may be called more than once before the first non-empty byte string
        # is yielded by this generator. However, it is a fatal error to call start_response()
        # a second time without passing the exc_info argument.
        start_response("500 Internal Server Error", [("X-Test-Header", "New-Value")], exc_info=sys.exc_info())

    yield b"Hello, "
    yield b"world!"

    # hypercorn==0.18.0 does not crash, but sends the incorrect header information
    # 200 OK
    # x-test-header: Old-Value
    #
    # hypercorn==0.17.3 produces the following traceback due to not handling generators correctly
    #
    # Traceback (most recent call last):
    # File ".../python3.13/site-packages/hypercorn/asyncio/task_group.py", line 27, in _handle
    #     await app(scope, receive, send, sync_spawn, call_soon)
    # File ".../python3.13/site-packages/hypercorn/app_wrappers.py", line 51, in __call__
    #     await self.handle_http(scope, receive, send, sync_spawn, call_soon)
    # File ".../python3.13/site-packages/hypercorn/app_wrappers.py", line 83, in handle_http
    #     await sync_spawn(self.run_app, environ, partial(call_soon, send))
    # File ".../python3.13/concurrent/futures/thread.py", line 59, in run
    #     result = self.fn(*self.args, **self.kwargs)
    # File ".../python3.13/site-packages/hypercorn/app_wrappers.py", line 109, in run_app
    #     raise RuntimeError("WSGI app did not call start_response")
    # RuntimeError: WSGI app did not call start_response
```


## Bug 4: Improper start_response Calls Don't Raise An Exception

There are (at least) 2 situations where `start_response` is supposed to raise an exception according to the WSGI spec.

> However, if exc_info is provided, and the HTTP headers have already been sent, start_response must raise an error, and should re-raise using the exc_info tuple. That is: `raise exc_info[1].with_traceback(exc_info[2])`

> It is a fatal error to call start_response without the exc_info argument if start_response has already been called within the current invocation of the application.

Currently there are no checks for these errors in `start_response`. If called multiple times, it silently changes the stored headers and status code even if they've already been sent, so the new values are never transmitted to the client.
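A rough sketch of the two missing checks, modeled on the example gateway in PEP 3333. The `ResponseState` class is a hypothetical holder and not `hypercorn`'s actual structure. The `RuntimeError` type is also an assumption, since the spec only says the call is a fatal error, and the `write()` callable that a real `start_response` returns is omitted.

```python
from typing import Optional


class ResponseState:
    """Hypothetical holder for what the server has stored and already sent."""

    def __init__(self) -> None:
        self.status: Optional[str] = None
        self.headers: Optional[list] = None
        self.headers_sent = False  # the server loop would set this when it transmits

    def start_response(self, status: str, headers: list, exc_info=None) -> None:
        if exc_info is not None:
            try:
                if self.headers_sent:
                    # Too late to change the response: re-raise the app's error.
                    raise exc_info[1].with_traceback(exc_info[2])
            finally:
                exc_info = None  # avoid a circular reference, as PEP 3333 advises
        elif self.status is not None:
            # A second call without exc_info is a fatal error.
            raise RuntimeError("start_response called again without exc_info")
        self.status = status
        self.headers = headers
```

Under these assumptions, a second call without `exc_info` raises immediately, a call with `exc_info` replaces the stored values while the headers are still unsent, and the same call after the headers have gone out re-raises the application's original exception.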

On `hypercorn<=0.18.0`, the following example is incorrectly allowed to call `start_response` multiple times.

```python
def wsgi_app_multiple_start_response_no_exc_info(
    environ: dict, start_response: Callable
) -> list[bytes]:
    """
    An invalid WSGI Application, which calls start_response a second time 
    without passing an exception tuple in via the exc_info argument.

    This is considered a fatal error in the WSGI specification and should raise an exception.
    """

    # Calling start_response a second time without exc_info should raise an error.
    # The headers list is a required argument, so pass an empty one.
    start_response("200 OK", [])
    start_response("202 Accepted", [])
    return []
```

In the following additional example, `start_response` fails to re-raise the caught exception that is passed in as `exc_info`. The application would then keep sending body data from its error handler while the client still sees the original 200 status code.

```python
def wsgi_app_generator_multiple_start_response_after_body(
    environ: dict, start_response: Callable
) -> Generator[bytes, None, None]:
    """
    An invalid WSGI Application, which calls start_response a second time 
    after the first non-empty byte string is returned. This should reraise the exception
    as the headers and status code have already been sent.

    This is considered a fatal error in the WSGI specification and should raise an exception.
    """

    # Initial 200 OK status, which is sent together with the first non-empty chunk below.
    # The headers list is a required argument, so pass an empty one.
    start_response("200 OK", [])
    yield b"Hello, world!"

    try:
        raise ValueError
    except ValueError:
        # start_response may not be called again after the first non-empty byte string is returned
        # 
        # It is a fatal error to call start_response() a second time without passing an exception
        # tuple in via the exc_info argument, so ensure we do that to avoid raising the wrong
        # exception.
        start_response(
            "500 Internal Server Error", [("X-Test-Header", "New-Value")], exc_info=sys.exc_info()
        )
```
