TL;DR
There have been, and still are, several bugs in how hypercorn handles WSGI edge cases, because WSGI is incredibly complicated. I've opened a fix PR with lots of new tests.
Overview
Between v0.17.3 and v0.18.0 there was work done on the WSGIWrapper class to attempt to make it more compatible with the WSGI specification, specifically PEP-3333. Even more specifically, the timing of calls made to start_response. This issue was raised in #320 and a fix was implemented in #321. Unfortunately, the implementation is closer to correct but still violates the WSGI protocol in a way that broke our existing (and valid) applications. While investigating our own broken CI, I found multiple WSGI violations that can be fixed with some slight tweaks. I'll attempt to outline all 4 relevant bugs with very simple examples that exercise the problems.
The WSGI Specification
An excerpt from the WSGI specification, emphasis mine. I'll reference specific lines of this later.
However, the start_response callable must not actually transmit the response headers. Instead, it must store them for the server or gateway to transmit only after the first iteration of the application return value that yields a non-empty bytestring, or upon the application’s first invocation of the write() callable. In other words, response headers must not be sent until there is actual body data available, or until the application’s returned iterable is exhausted. (The only possible exception to this rule is if the response headers explicitly include a Content-Length of zero.)
This delaying of response header transmission is to ensure that buffered and asynchronous applications can replace their originally intended output with error output, up until the last possible moment. For example, the application may need to change the response status from "200 OK" to "500 Internal Error", if an error occurs while the body is being generated within an application buffer.
The exc_info argument, if supplied, must be a Python sys.exc_info() tuple. This argument should be supplied by the application only if start_response is being called by an error handler. If exc_info is supplied, and no HTTP headers have been output yet, start_response should replace the currently-stored HTTP response headers with the newly-supplied ones, thus allowing the application to "change its mind" about the output when an error has occurred.
However, if exc_info is provided, and the HTTP headers have already been sent, start_response must raise an error, and should re-raise using the exc_info tuple. That is:
raise exc_info[1].with_traceback(exc_info[2])
...
The application may call start_response more than once, if and only if the exc_info argument is provided. More precisely, it is a fatal error to call start_response without the exc_info argument if start_response has already been called within the current invocation of the application. This includes the case where the first call to start_response raised an error. (See the example CGI gateway above for an illustration of the correct logic.)
Bug 1: Generator Applications
This bug was the one fixed in #321, which is that WSGI applications don't need to actually call start_response when you "call" them. They can be things like classes and generators, which would be initialized rather than called like a function. These won't execute any of the contained code until you start to iterate on them. Because of this, you can't transmit headers until you've started iterating on the return value from calling the application function.
Unfortunately, the fix for this in #321 caused Bug 2. We need to avoid regressing on Bug 1 while we fix that.
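To make the lazy-execution point concrete, here is a minimal sketch. It is not hypercorn code: `fake_start_response` and `generator_app` are hypothetical names used only for illustration. It shows that calling a generator-based app runs none of its body, so start_response is only invoked once iteration begins.

```python
calls = []


def fake_start_response(status, headers, exc_info=None):
    # Hypothetical stand-in for a server's start_response; it only records calls.
    calls.append((status, headers))


def generator_app(environ, start_response):
    start_response("200 OK", [("X-Test-Header", "Test-Value")])
    yield b"Hello, "
    yield b"world!"


# "Calling" the app only creates a generator object; the body has not run yet.
result = generator_app({}, fake_start_response)
calls_after_call = list(calls)

# Iterating runs the body up to the first yield, which includes start_response.
first_chunk = next(result)
calls_after_first_chunk = list(calls)
```

A server that inspects the stored status and headers right after calling the app, before iterating, will therefore find nothing for apps like this one.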
The following example is a very simple generator that is a valid WSGI app, and crashes hypercorn<=0.17.3 but not hypercorn==0.18.0.
# Imports shared by the examples in this issue
import sys
from collections.abc import Callable, Generator


def wsgi_app_generator(environ: dict, start_response: Callable) -> Generator[bytes, None, None]:
    """
    A synchronous generator usable as a valid WSGI Application.

    Notably, the WSGI specification ensures only that start_response() is called
    before the first item is returned from the iterator. It does not have to
    be immediately called when app(environ, start_response) is called.

    Using a generator for a WSGI app will delay calling start_response() until after
    something begins iterating on it, so only invoking the app and not iterating on
    the returned iterable will not be sufficient to get the status code and headers.

    It is also valid to send multiple chunks of data, but the status code and headers
    must be sent before the first non-empty chunk of body data is sent.

    Therefore it is not valid to send the status code and headers before iterating on
    the returned generator. It is only valid to send status code and headers during
    iteration of the generator, immediately after the first non-empty byte
    string is returned, but before continuing to iterate further.
    """
    start_response("200 OK", [("X-Test-Header", "Test-Value")])
    yield b"Hello, "
    yield b"world!"
# hypercorn==0.18.0 works with no issues
#
# hypercorn==0.17.3 produces the following traceback, which was fixed in hypercorn==0.18.0
#
# Traceback (most recent call last):
# File ".../python3.13/site-packages/hypercorn/asyncio/task_group.py", line 27, in _handle
# await app(scope, receive, send, sync_spawn, call_soon)
# File ".../python3.13/site-packages/hypercorn/app_wrappers.py", line 51, in __call__
# await self.handle_http(scope, receive, send, sync_spawn, call_soon)
# File ".../python3.13/site-packages/hypercorn/app_wrappers.py", line 83, in handle_http
# await sync_spawn(self.run_app, environ, partial(call_soon, send))
# File ".../python3.13/concurrent/futures/thread.py", line 59, in run
# result = self.fn(*self.args, **self.kwargs)
# File ".../python3.13/site-packages/hypercorn/app_wrappers.py", line 109, in run_app
# raise RuntimeError("WSGI app did not call start_response")
# RuntimeError: WSGI app did not call start_response
Bug 2: HEAD Requests
There exist less common HTTP methods like HEAD which do not return a response body. When implementing a WSGI app that supports these methods, it's common to return an empty iterable. An empty list suffices, as would a generator which never reaches any yield statements before exiting. The important information is all sent in the headers and status code, via start_response.
The fix implemented in #321 moved the transmission of headers to inside the for loop, which unfortunately gets passed over if there are 0 elements in the application return value.
In other words, response headers must not be sent until there is actual body data available, or until the application’s returned iterable is exhausted.
The correct behavior involves checking if the response headers have been sent after the for loop exhausts the iterable, and if they haven't yet been sent, sending them right at the end of the function. You can't move the logic from the for loop to the end, because applications that do return a non-empty body need the headers to be sent before we start sending that body. Therefore, you have to actually check in 2 distinct places and duplicate the logic for sending headers.
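The two checks described above can be sketched as follows. This is a simplified illustration under stated assumptions, not hypercorn's actual implementation: `run_wsgi`, `send_headers`, and the tuple-based `send` events are hypothetical, and the `write()` callable and start_response error handling are left out.

```python
def run_wsgi(app, environ, send):
    """Hypothetical server loop; `send` receives ("headers", ...) or ("body", ...) events."""
    pending = None  # (status, headers) stored by start_response
    sent = False

    def start_response(status, headers, exc_info=None):
        nonlocal pending
        pending = (status, headers)  # store only, never transmit here

    def send_headers():
        nonlocal sent
        if pending is None:
            raise RuntimeError("WSGI app did not call start_response")
        send(("headers", *pending))
        sent = True

    result = app(environ, start_response)
    try:
        for chunk in result:
            if not chunk:
                continue  # empty chunks carry no body data
            if not sent:
                send_headers()  # check 1: just before the first non-empty chunk
            send(("body", chunk))
        if not sent:
            send_headers()  # check 2: the iterable was exhausted without any body
    finally:
        close = getattr(result, "close", None)
        if close is not None:
            close()


def no_body_app(environ, start_response):
    start_response("200 OK", [("X-Test-Header", "Test-Value")])
    return []


events = []
run_wsgi(no_body_app, {}, events.append)
# events now holds a single headers event and no body events
```

Keeping the header logic in one helper that both checks call avoids duplicating it, while the `sent` flag guards against sending the headers twice.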
The following examples, a function and a generator, are both valid WSGI apps. Both crash hypercorn==0.18.0. The function works on hypercorn<=0.17.3, while the generator with no body crashes both versions, but for different reasons.
def wsgi_app_no_body(environ: dict, start_response: Callable) -> list[bytes]:
    """
    A WSGI Application that does not yield up any body chunks when iterated on.

    This is most common when supporting HTTP methods such as HEAD, which is identical
    to GET except that the server MUST NOT return a message body in the response.

    The iterable returned by this app will have no contents, immediately exiting
    any for loops attempting to iterate on it. Even though no body was returned
    from the application, this is still a valid HTTP request and MUST send the
    status code and headers as the response. Failing to do so violates the
    WSGI, ASGI, and HTTP specifications.

    Therefore, the status code and headers must be sent after the iteration completes,
    as it is not valid to send them only during iteration. If headers are only sent
    within the body of the for loop, this application will cause the server to fail
    to send this information at all. However, care must be taken to check
    whether the status code and headers were already sent during the iteration process,
    as they may have been sent during the iteration process for applications with
    non-empty bodies. If this isn't accounted for they will be sent twice in error.
    """
    start_response("200 OK", [("X-Test-Header", "Test-Value")])
    return []
# hypercorn==0.17.3 works with no issues
#
# hypercorn==0.18.0 produces the following traceback
#
# Traceback (most recent call last):
# File ".../python3.13/site-packages/hypercorn/asyncio/task_group.py", line 28, in _handle
# await app(scope, receive, send, sync_spawn, call_soon)
# File ".../python3.13/site-packages/hypercorn/app_wrappers.py", line 51, in __call__
# await self.handle_http(scope, receive, send, sync_spawn, call_soon)
# File ".../python3.13/site-packages/hypercorn/app_wrappers.py", line 84, in handle_http
# await send({"type": "http.response.body", "body": b"", "more_body": False})
# File ".../python3.13/site-packages/hypercorn/protocol/http_stream.py", line 241, in app_send
# raise UnexpectedMessageError(self.state, message["type"])
# hypercorn.utils.UnexpectedMessageError: Unexpected message type, http.response.body given the state ASGIHTTPState.REQUEST
def wsgi_app_generator_no_body(environ: dict, start_response: Callable) -> Generator[bytes, None, None]:
    """
    A synchronous generator usable as a valid WSGI Application, which
    does not yield up any body chunks when iterated on.

    This is a very complicated edge case. It is most commonly found when building a
    generator based WSGI app with support for HTTP methods such as HEAD, which is
    identical to GET except that the server MUST NOT return a message body in the response.

    1. The application is subject to the same delay in calling start_response until
       after the server has begun iterating on the returned generator object.
    2. The status code and headers are also not available during iteration, as the
       empty generator will immediately end any for loops that attempt to iterate on it.
    3. Even though no body was returned from the application, this is still a valid
       HTTP request and MUST send the status code and headers as the response. Failing
       to do so violates the WSGI, ASGI, and HTTP specifications.

    Therefore, the status code and headers must be sent after the iteration completes,
    as it is not valid to send them only during iteration. If headers are only sent
    within the body of the for loop, this application will cause the server to fail
    to send this information at all. However, care must be taken to check
    whether the status code and headers were already sent during the iteration process,
    as they may have been sent during the iteration process for applications with
    non-empty bodies. If this isn't accounted for they will be sent twice in error.
    """
    start_response("200 OK", [("X-Test-Header", "Test-Value")])
    if False:
        yield b""  # Unreachable yield makes this an empty generator  # noqa
# hypercorn==0.18.0 produces the following traceback
#
# Traceback (most recent call last):
# File ".../python3.13/site-packages/hypercorn/asyncio/task_group.py", line 28, in _handle
# await app(scope, receive, send, sync_spawn, call_soon)
# File ".../python3.13/site-packages/hypercorn/app_wrappers.py", line 51, in __call__
# await self.handle_http(scope, receive, send, sync_spawn, call_soon)
# File ".../python3.13/site-packages/hypercorn/app_wrappers.py", line 84, in handle_http
# await send({"type": "http.response.body", "body": b"", "more_body": False})
# File ".../python3.13/site-packages/hypercorn/protocol/http_stream.py", line 241, in app_send
# raise UnexpectedMessageError(self.state, message["type"])
# hypercorn.utils.UnexpectedMessageError: Unexpected message type, http.response.body given the state ASGIHTTPState.REQUEST
# hypercorn==0.17.3 produces the following traceback, which differs from the above
#
# Traceback (most recent call last):
# File ".../python3.13/site-packages/hypercorn/asyncio/task_group.py", line 27, in _handle
# await app(scope, receive, send, sync_spawn, call_soon)
# File ".../python3.13/site-packages/hypercorn/app_wrappers.py", line 51, in __call__
# await self.handle_http(scope, receive, send, sync_spawn, call_soon)
# File ".../python3.13/site-packages/hypercorn/app_wrappers.py", line 83, in handle_http
# await sync_spawn(self.run_app, environ, partial(call_soon, send))
# File ".../python3.13/concurrent/futures/thread.py", line 59, in run
# result = self.fn(*self.args, **self.kwargs)
# File ".../python3.13/site-packages/hypercorn/app_wrappers.py", line 109, in run_app
# raise RuntimeError("WSGI app did not call start_response")
# RuntimeError: WSGI app did not call start_response
Bug 3: Multiple Delayed start_response Calls
In WSGI it's valid to call start_response a 2nd or 3rd time to replace the status code and headers with an error if your application encounters an error while handling the request. This is only allowed if the headers have not yet been sent.
This delaying of response header transmission is to ensure that buffered and asynchronous applications can replace their originally intended output with error output, up until the last possible moment. For example, the application may need to change the response status from "200 OK" to "500 Internal Error", if an error occurs while the body is being generated within an application buffer.
...
The application may call start_response more than once, if and only if the exc_info argument is provided.
In hypercorn==0.17.3 the headers were sent immediately after calling the application, meaning that this wasn't possible to do while iterating on the application return value. The fix in #321 moved this logic further into the process, which was closer to the right moment, but missed one key detail from the WSGI spec.
It (the start_response callable) must store them (the headers and status code) for the server or gateway to transmit only after the first iteration of the application return value that yields a non-empty bytestring
This is a subtle detail that can make the server incompatible with a valid WSGI application. WSGI reserves the right to "change its mind" up until a non-empty bytestring is returned, but hypercorn==0.18.0 sends the headers as soon as the first chunk is received, even if that chunk is an empty b"". To be fully compliant, the conditional inside the for loop that iterates on the application return value must check that the chunk yielded by the iterable is non-empty.
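The effect of that one condition can be shown with a small simulation. This is a hedged sketch, not hypercorn code: `transmitted_status`, its `require_non_empty` flag, and `delayed_app` are hypothetical names, and "transmitting" is modelled simply as freezing the stored status.

```python
import sys


def delayed_app(environ, start_response):
    start_response("200 OK", [("X-Test-Header", "Old-Value")])
    yield b""  # empty chunk: headers must not be transmitted yet
    try:
        raise ValueError
    except ValueError:
        start_response("500 Internal Server Error", [("X-Test-Header", "New-Value")], sys.exc_info())
    yield b"Hello, world!"


def transmitted_status(app, require_non_empty):
    """Return the status a simplified, hypothetical server loop would put on the wire."""
    stored = None
    sent = None

    def start_response(status, headers, exc_info=None):
        nonlocal stored
        stored = status

    for chunk in app({}, start_response):
        if sent is None and (chunk or not require_non_empty):
            sent = stored  # headers are "transmitted" here and can no longer change
    return sent if sent is not None else stored


status_without_check = transmitted_status(delayed_app, require_non_empty=False)
status_with_check = transmitted_status(delayed_app, require_non_empty=True)
```

Without the non-empty check the simulated server commits to the original status on the empty chunk; with it, the replacement status from the error handler is the one that gets transmitted.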
The following example is a very simple generator that is a valid WSGI app, but sends the incorrect header information on hypercorn==0.18.0. It also crashes hypercorn<=0.17.3 due to it being a generator and Bug 1, which is uninteresting.
def wsgi_app_generator_delayed_start_response(environ: dict, start_response: Callable) -> Generator[bytes, None, None]:
    """
    A synchronous generator usable as a valid WSGI Application, which calls start_response
    a second time after yielding up empty chunks of body.

    This application exercises the ability for WSGI apps to change their status code
    right up until the last possible second before the first non-empty chunk of body is
    sent. The status code and headers must be buffered until the first non-empty chunk of body
    is yielded by this generator, and should be overwritable until that time.
    """
    # Initial 200 OK status that will be overwritten before any non-empty chunks of body are sent
    start_response("200 OK", [("X-Test-Header", "Old-Value")])
    yield b""
    try:
        raise ValueError
    except ValueError:
        # start_response may be called more than once before the first non-empty byte string
        # is yielded by this generator. However, it is a fatal error to call start_response()
        # a second time without passing the exc_info argument.
        start_response("500 Internal Server Error", [("X-Test-Header", "New-Value")], exc_info=sys.exc_info())
    yield b"Hello, "
    yield b"world!"
# hypercorn==0.18.0 does not crash, but sends the incorrect header information
# 200 OK
# x-test-header: Old-Value
#
# hypercorn==0.17.3 produces the following traceback due to not handling generators correctly
#
# Traceback (most recent call last):
# File ".../python3.13/site-packages/hypercorn/asyncio/task_group.py", line 27, in _handle
# await app(scope, receive, send, sync_spawn, call_soon)
# File ".../python3.13/site-packages/hypercorn/app_wrappers.py", line 51, in __call__
# await self.handle_http(scope, receive, send, sync_spawn, call_soon)
# File ".../python3.13/site-packages/hypercorn/app_wrappers.py", line 83, in handle_http
# await sync_spawn(self.run_app, environ, partial(call_soon, send))
# File ".../python3.13/concurrent/futures/thread.py", line 59, in run
# result = self.fn(*self.args, **self.kwargs)
# File ".../python3.13/site-packages/hypercorn/app_wrappers.py", line 109, in run_app
# raise RuntimeError("WSGI app did not call start_response")
# RuntimeError: WSGI app did not call start_response
Bug 4: Improper start_response Calls Don't Raise An Exception
There are (at least) 2 situations where start_response is supposed to raise an exception according to the WSGI spec.
However, if exc_info is provided, and the HTTP headers have already been sent, start_response must raise an error, and should re-raise using the exc_info tuple. That is:
raise exc_info[1].with_traceback(exc_info[2])
It is a fatal error to call start_response without the exc_info argument if start_response has already been called within the current invocation of the application.
Currently there are no checks for these errors in start_response. If called multiple times, it will silently change the stored headers and status code even if they've already been sent, thus failing to transmit that information to the client.
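For reference, here is a minimal sketch of a start_response that performs both checks, modelled on the example CGI gateway in PEP 3333. The `HeaderState` class and its attribute names are hypothetical and do not reflect hypercorn's internals; the `write()` callable that start_response normally returns is omitted for brevity.

```python
import sys


class HeaderState:
    """Hypothetical per-request state; the server sets `sent` once headers are transmitted."""

    def __init__(self):
        self.pending = None  # (status, headers) waiting to be transmitted
        self.sent = False

    def start_response(self, status, headers, exc_info=None):
        if exc_info is not None:
            try:
                if self.sent:
                    # Too late to replace the headers: re-raise the app's original error
                    raise exc_info[1].with_traceback(exc_info[2])
            finally:
                exc_info = None  # avoid a reference cycle through the traceback
        elif self.pending is not None:
            # A second call without exc_info is a fatal error per the spec
            raise RuntimeError("start_response called again without exc_info")
        self.pending = (status, headers)


state = HeaderState()
state.start_response("200 OK", [])
try:
    raise ValueError("boom")
except ValueError:
    # Allowed: exc_info is provided and nothing has been sent yet
    state.start_response("500 Internal Server Error", [], sys.exc_info())
```

The choice of RuntimeError for the second case is an assumption; the spec only requires that the call be treated as a fatal error.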
The following example incorrectly allows multiple calls to start_response on hypercorn<=0.18.0.
def wsgi_app_multiple_start_response_no_exc_info(
    environ: dict, start_response: Callable
) -> list[bytes]:
    """
    An invalid WSGI Application, which calls start_response a second time
    without passing an exception tuple in via the exc_info argument.

    This is considered a fatal error in the WSGI specification and should raise an exception.
    """
    # Calling start_response multiple times without exc_info should raise an error
    start_response("200 OK", [])
    start_response("202 Accepted", [])
    return []
In the following additional example, the caught exception is not re-raised when it is passed into start_response as exc_info after the headers have already been sent. The application's error handler would then continue attempting to send the body, but without the 200 status code ever being updated.
def wsgi_app_generator_multiple_start_response_after_body(
    environ: dict, start_response: Callable
) -> Generator[bytes, None, None]:
    """
    An invalid WSGI Application, which calls start_response a second time
    after the first non-empty byte string is returned. This should reraise the exception
    as the headers and status code have already been sent.

    This is considered a fatal error in the WSGI specification and should raise an exception.
    """
    # Initial call; the headers are sent along with the first non-empty chunk below
    start_response("200 OK", [])
    yield b"Hello, world!"
    try:
        raise ValueError
    except ValueError:
        # start_response may not be called again after the first non-empty byte string is returned
        #
        # It is a fatal error to call start_response() a second time without passing an exception
        # tuple in via the exc_info argument, so ensure we do that to avoid raising the wrong
        # exception.
        start_response(
            "500 Internal Server Error", [("X-Test-Header", "New-Value")], exc_info=sys.exc_info()
        )