Is there an existing issue for this?
Current Behavior
The code in hmac.py (in particular, the get_hmac() function) is being used as part of HashedElGamalCiphertext for two very different purposes.
For the first purpose, computing an HMAC, it seems to be correct. Example, where the spec says to use HMAC (e.g., $c_2 = HMAC(k_0, c_0 | c_1)$), the code says:
mac = get_hmac(mac_key, to_mac)
This is fine.
The ElectionGuard spec notes that the actual encryption of the message ($c_1 = m_1 \oplus k_1 | m_2 \oplus k_2 | \cdots | m_{b_m} \oplus k_{b_m}$) is supposed to use a NIST 800-108-compliant key derivation function (KDF). The Python code is not compliant with the NIST spec. In particular, the code that creates those $k_i$ values does this:
data_key = get_hmac(
session_key.to_hex_bytes(),
encryption_seed.to_hex_bytes(),
bit_length,
(i + 1),
)
So, what's going on inside get_hmac? Its input to the HMAC function is ultimately the byte concatenation of three values: start_byte + msg + end_byte where start_byte is the index $i$, and end_byte is the message length. The NIST spec says:
K(i) := PRF (KI, [i] || Label || 0x00 || Context || [L])
Suggested fixes below.
Expected Behavior
It's confusing to use get_hmac to serve two radically different purposes. First and foremost, it would be good to have a KDF class that implements the NIST spec correctly and then use it to compute the $k_i$'s. Anyway, even if you don't do that, there are a few things that you do need to do:
- You're not doing anything for the NIST-defined
Label or Context strings (okay for a start, but these strings serve a purpose and should be defined and used)
- You're not using the zero byte after the
Label (Suggestion: add the byte.)
- You're using precisely 32 bits to encode the two numbers ($i$ and $L$), when it's possible (if unlikely) that this code might be used to encrypt more than 4 billion blocks. (Suggestion: use a variable-length encoding for lengths.)
- You should not be converting keys to hexadecimal bytes. Those are originally
ElementModQ values, which already fit into 32 bytes. This encoding process yields more bits than HMAC-SHA256 normally supports for its key values. (Suggestion: extract the internal BigInteger as exactly 32 big-endian bytes.)
Steps To Reproduce
No response
Environment
No response
Anything else?
No response
Is there an existing issue for this?
Current Behavior
The code in
hmac.py(in particular, theget_hmac()function) is being used as part ofHashedElGamalCiphertextfor two very different purposes.For the first purpose, computing an HMAC, it seems to be correct. Example, where the spec says to use HMAC (e.g., $c_2 = HMAC(k_0, c_0 | c_1)$), the code says:
This is fine.
The ElectionGuard spec notes that the actual encryption of the message ($c_1 = m_1 \oplus k_1 | m_2 \oplus k_2 | \cdots | m_{b_m} \oplus k_{b_m}$ ) is supposed to use a NIST 800-108-compliant key derivation function (KDF). The Python code is not compliant with the NIST spec. In particular, the code that creates those $k_i$ values does this:
So, what's going on inside$i$ , and
get_hmac? Its input to the HMAC function is ultimately the byte concatenation of three values:start_byte + msg + end_bytewherestart_byteis the indexend_byteis the message length. The NIST spec says:Suggested fixes below.
Expected Behavior
It's confusing to use$k_i$ 's. Anyway, even if you don't do that, there are a few things that you do need to do:
get_hmacto serve two radically different purposes. First and foremost, it would be good to have aKDFclass that implements the NIST spec correctly and then use it to compute theLabelorContextstrings (okay for a start, but these strings serve a purpose and should be defined and used)Label(Suggestion: add the byte.)ElementModQvalues, which already fit into 32 bytes. This encoding process yields more bits than HMAC-SHA256 normally supports for its key values. (Suggestion: extract the internal BigInteger as exactly 32 big-endian bytes.)Steps To Reproduce
No response
Environment
No response
Anything else?
No response