gh-145264: Do not ignore excess Base64 data after the first padded quad #145267
serhiy-storchaka wants to merge 3 commits into python:main
…ded quad Base64 decoder (see binascii.a2b_base64(), base64.b64decode(), etc) no longer ignores excess data after the first padded quad in non-strict (default) mode. Instead, in conformance with RFC 4648, it ignores the pad character, "=", if it is present before the end of the encoded data.
| */ | ||
| goto done; | ||
| } | ||
| if (!strict_mode || ignorechar(BASE64_PAD, ignorechars, ignorecache)) { |
add a comment in this block linking to the RFC section.
| @@ -0,0 +1,4 @@ | |||
| Base64 decoder (see :func:`binascii.a2b_base64`, :func:`base64.b64decode`, etc) no | |||
| longer ignores excess data after the first padded quad in non-strict | |||
| (default) mode. Instead, in conformance with :rfc:`4648`, it ignores | |||
I guess this is in accordance with the MAY in https://datatracker.ietf.org/doc/html/rfc4648#section-3.3 about ignoring PADs as non-alphabet data? it'd be good to cite the specific section.
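For readers following along, here is a minimal pure-Python sketch of the RFC 4648 section 3.3 idea being discussed: treat "=" like any other non-alphabet byte, skip it, and decode what remains. The helper name `decode_ignoring_pads` is hypothetical and this is not the C code in this PR; in particular it does not try to reproduce how the real decoder handles an incomplete trailing group.

```python
import base64
import re

# Hypothetical helper for illustration only; not part of CPython or this PR.
# Matches every byte outside the standard Base64 alphabet, including "=".
_NON_ALPHABET = re.compile(rb'[^A-Za-z0-9+/]')


def decode_ignoring_pads(data: bytes) -> bytes:
    """Decode Base64 while treating "=" as ignorable non-alphabet data."""
    chars = _NON_ALPHABET.sub(b'', data)
    # Re-add the trailing padding that base64.b64decode() expects.
    chars += b'=' * (-len(chars) % 4)
    return base64.b64decode(chars)


print(decode_ignoring_pads(b'ab==cd'))  # b'i\xb7\x1d'
print(decode_ignoring_pads(b'abc=d'))   # b'i\xb7\x1d'
```

Both inputs reduce to `abcd` once the pad characters are skipped, which is why they decode to the same three bytes.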
Lib/test/test_binascii.py
Outdated
| # Test excess data exceptions | ||
| def assertExcessData(data, non_strict_expected, | ||
| ignore_padchar_expected=None): | ||
| def assertExcessData(data, non_strict_expected): |
rename this from non_strict_expected to just expected.
In strict mode you get an error. You get that value only in non-strict mode, either when strict_mode=False, or when ignorechars contains "=".
But I agree that expected is shorter. The old name was even longer: non_strict_mode_expected_result.
| assertExcessData(b'ab==c', b'i') | ||
| assertExcessData(b'ab==cd', b'i', b'i\xb7\x1d') | ||
| assertExcessData(b'abc=d', b'i\xb7', b'i\xb7\x1d') |
this test used to highlight the difference between strict and non-strict mode. we should keep a test highlighting that.
In strict mode we get an error. We get a result only when strict_mode=False or when the new argument ignorechars contains "=", and these two cases used to give different results. That difference has now been fixed.