init: add tpm2 unlock enhancements on top of upstream 35ab72d#370

Closed
enihcam wants to merge 4 commits into anatol:master from enihcam:fix/tpm2-unlock-dedup

Conversation

@enihcam

@enihcam enihcam commented May 27, 2026

Summary

This PR adds additional TPM2 unlock enhancements on top of upstream commit 35ab72d.

Changes

Core fixes (built on upstream 35ab72d)

  • openTPM: fallback from /dev/tpmrm0 to /dev/tpm0 for systems with legacy TPM interface
  • tpmAwaitReady: extend timeout 3s -> 5s for slower TPM firmware initialization
  • flattenSystemdTPM2: handle nested systemd-tpm2/systemd_tpm2 token structures (systemd v255+ stores tokens with nested JSON)

Unit tests

  • TestFlattenSystemdTPM2: nested token flattening (8 cases)
  • TestPCRStringParsing: PCR list parsing ("10+13", "0+7", etc)
  • TestPolicyHashParsing: hex/base64 policy hash parsing

Upstream dedup

Upstream 35ab72d already covers:

  • extractSRKHandle: IESYS_RESOURCE_SERIALIZE parsing for persistent SRK
  • tpm2PINAuthValue: salted (v255+) vs unsalted PIN derivation

This PR adds only the additional fixes not in upstream.

Fixes

Fixes #233

Comment thread init/luks.go Outdated
Owner

@anatol anatol left a comment

Mixing multiple features in one commit makes it more difficult to inspect the git history in the future.

The commit needs to be split into 4 commits - one per fix/feature.

@anatol
Owner

anatol commented May 27, 2026

cc @pilotstew who also worked on tpm

@anatol
Owner

anatol commented May 27, 2026

@enihcam also please provide more information in the git commit description: what exactly requires all the changes? Why does the timeout need to be increased? Why do you need to flatten the TPM structure? What exactly is the reason for these changes?

@pilotstew
Contributor

A couple of things stand out. I can offer some comments here in a little bit. I'm deep into reworking how we handle fido messaging for quiet. The 2 PRs I submitted are steps toward that. As soon as I finish these I'll take a look. I know .13 is getting close and I'll try to get these in soon.

@pilotstew
Contributor

Before I look at the rest: master's recoverSystemdTPM2Password already base64-encodes the unsealed secret (init/luks.go:722), and systemd-cryptenroll stores it in that same base64 form (base64mem, then crypt_keyslot_add_by_volume_key, in cryptenroll-tpm2.c). So for a real cryptenroll-enrolled token, branch (B) "raw as is" in tryVariants should already match the slot, and branch (A) shouldn't fire at all: its len(secret) == 32 gate fails since the secret reaching it is 44 chars.
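The 32-versus-44 arithmetic can be checked with a small sketch. The helper names `encodeSecret` and `looksRaw32` are hypothetical and exist only for this illustration; they are not functions from booster or systemd.

```go
package main

import "encoding/base64"

// encodeSecret returns the standard (padded) base64 text of a raw secret.
// Hypothetical helper, used only to illustrate the length argument.
func encodeSecret(raw []byte) string {
	return base64.StdEncoding.EncodeToString(raw)
}

// looksRaw32 mimics a `len(secret) == 32` gate of the kind discussed above.
// Hypothetical helper, not booster code.
func looksRaw32(secret []byte) bool {
	return len(secret) == 32
}
```

Padded base64 emits 4 characters per 3 input bytes, rounded up, so 32 raw bytes become 44 characters and a gate on length 32 does not match the encoded form.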

Could you share how the token in your repro was created, systemd-cryptenroll invocation + version, or otherwise? Want to make sure I understand the failure case before commenting on the rest.

zaiteki added 2 commits May 28, 2026 09:40
- openTPM: fallback from /dev/tpmrm0 to /dev/tpm0
- tpmAwaitReady: extend timeout 3s -> 5s
- flattenSystemdTPM2: handle nested systemd-tpm2 token structures
- recoverTokenPassword: add tryVariants helper for multiple secret formats

Unit tests:
- TestFlattenSystemdTPM2
- TestTPM2PINAuthValueEmptyCases
- TestPolicyHashParsing
- TestPCRStringParsing
…ants

Separates secret transformation (5 variants) from unsealing logic.
Addresses review comment: password extraction should be separated from unsealing.
@enihcam enihcam force-pushed the fix/tpm2-unlock-dedup branch from d213c81 to d4955ce Compare May 28, 2026 01:40
@enihcam
Author

enihcam commented May 28, 2026

@anatol Here's more detail on the changes:

1. Timeout 3s → 5s

Some TPM devices/firmware take longer than 3 seconds to initialize on cold boot. The error "no tpm devices found after 3 seconds" was reported by users with slower TPM initialization. 5s gives buffer without significant boot time impact.
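As a rough illustration of a bounded wait, here is a generic polling sketch. It is not booster's tpmAwaitReady, which may well be event-driven rather than polled; the names `awaitReady`, `readyTimeout`, and `pollInterval` and the 50ms interval are assumptions made for this example.

```go
package main

import "time"

// Illustrative values only: the 5s figure comes from this PR, the poll
// interval is an assumption.
const (
	readyTimeout = 5 * time.Second
	pollInterval = 50 * time.Millisecond
)

// awaitReady calls probe repeatedly until it reports true or until another
// sleep would run past the deadline. It returns true if the probe succeeded.
func awaitReady(probe func() bool, timeout, interval time.Duration) bool {
	deadline := time.Now().Add(timeout)
	for {
		if probe() {
			return true
		}
		if time.Now().Add(interval).After(deadline) {
			return false
		}
		time.Sleep(interval)
	}
}
```

A caller would pass a probe that checks for the device, for example one that stats the device node, along with the timeout it is willing to add to the boot path in the worst case.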

2. Nested token flattening

systemd v255+ sometimes stores tokens with nested JSON structures:

{"data": {"tpm2-blob": "...", "tpm2-pcrs": "10+13"}}

instead of flat:

{"tpm2-blob": "...", "tpm2-pcrs": "10+13"}

Without flattening, booster couldn't read fields like tpm2-blob, tpm2-pcrs, etc. — resulting in "failed to recover systemd-tpm2 password".
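A minimal sketch of the flattening idea, assuming the wrapper shape from the example above. The names `flattenToken`, `flattenMap`, and `contains` are hypothetical and this is not the PR's flattenSystemdTPM2; nothing here confirms that systemd writes the nested form.

```go
package main

import "encoding/json"

// flattenToken parses a token and lifts the keys of any nested object stored
// under one of wrapperKeys to the top level, recursing into nested wrappers.
// Keys already present at the top level are not overwritten.
func flattenToken(raw []byte, wrapperKeys ...string) (map[string]any, error) {
	var token map[string]any
	if err := json.Unmarshal(raw, &token); err != nil {
		return nil, err
	}
	return flattenMap(token, wrapperKeys), nil
}

func flattenMap(m map[string]any, wrapperKeys []string) map[string]any {
	out := make(map[string]any, len(m))
	var nested []map[string]any
	for k, v := range m {
		if inner, ok := v.(map[string]any); ok && contains(wrapperKeys, k) {
			nested = append(nested, flattenMap(inner, wrapperKeys))
			continue
		}
		out[k] = v
	}
	// Merge lifted keys last so that top-level values take precedence.
	for _, n := range nested {
		for k, v := range n {
			if _, exists := out[k]; !exists {
				out[k] = v
			}
		}
	}
	return out
}

func contains(list []string, s string) bool {
	for _, item := range list {
		if item == s {
			return true
		}
	}
	return false
}
```

Calling `flattenToken(raw, "data")` on the nested example yields a map with tpm2-blob and tpm2-pcrs at the top level, and a token that is already flat passes through unchanged.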

3. /dev/tpm0 fallback

The original code only tried /dev/tpmrm0 (resource manager interface). Some systems only have /dev/tpm0 (legacy interface). Adding fallback handles both.
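The fallback described here amounts to trying device paths in order. A minimal sketch, assuming a hypothetical helper `openFirst` (not the PR's openTPM):

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// openFirst opens the first path that can be opened read-write and returns
// it. If none can be opened, the error wraps the last open failure.
func openFirst(paths ...string) (*os.File, error) {
	err := errors.New("no device paths given")
	for _, p := range paths {
		f, openErr := os.OpenFile(p, os.O_RDWR, 0)
		if openErr == nil {
			return f, nil
		}
		err = openErr
	}
	return nil, fmt.Errorf("no usable TPM device: %w", err)
}
```

A caller would use it as `openFirst("/dev/tpmrm0", "/dev/tpm0")`, so the resource-manager node is preferred and the legacy node is only tried when the first open fails.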

4. Multiple secret format variants

TPM unsealing produces raw 32 bytes, but different tools store these differently:

  • Some store raw bytes
  • Some Base64-encode
  • Some add trailing newlines

The tryVariants helper tries all common formats to maximize compatibility across systemd versions and tooling.

recoverSystemdTPM2Password already returns base64-encoded password,
so tryVariants was dead code for standard systemd-cryptenroll tokens.
Simplify to use tryPassphraseAgainstSlots like upstream.
@enihcam
Author

enihcam commented May 28, 2026

@pilotstew You're correct. Looking at recoverSystemdTPM2Password (line 826), it already returns base64.StdEncoding.EncodeToString(password), so for standard systemd-cryptenroll tokens the secret reaching tryVariants is already base64 (44 chars), and branch (A) len(secret) == 32 never fires.

The tryVariants helper was carried over from bb10root's implementation, but I don't have details on the specific token that required it.

- TestFlattenSystemdTPM2: nested token flattening
- TestPCRStringParsing: PCR list parsing (10+13, 0+7, etc)
- TestPolicyHashParsing: hex/base64 policy hash parsing

Also fix flattenSystemdTPM2 to recursively flatten nested wrappers.
@pilotstew
Contributor

Good call removing tryVariants — that path matches upstream cleanly now.

3→5s timeout — no objection 👍

As far as I can tell the rest either describe a JSON shape systemd doesn't actually write, or re-do something upstream already handles:

  • /dev/tpm0 fallback: the case where /dev/tpm0 exists without /dev/tpmrm0 is either TPM 1.2 hardware (rejected 3 lines later by GetManufacturer) or the udev race fixed by #190 (closes #116), where tpm0 is created microseconds before tpmrm0 and tpmAwaitReady is supposed to wait for the latter. Falling back to /dev/tpm0 bypasses that wait and opens the device raw, skipping the in-kernel resource manager, which go-tpm/legacy/tpm2 relies on for transient-handle and session lifecycle. If tpmAwaitReady didn't work in your case, that's the real bug and worth its own issue with the udev event log.

  • flattenSystemdTPM2: tpm2_make_luks2_json writes flat top-level keys (~line 8788 in tpm2-util.c); no "data" / "systemd-tpm2" wrapper in any release. Master's recoverSystemdTPM2Password unmarshals the flat form directly.

  • parsePolicyHash base64 fallback — tpm2-policy-hash is encoded with sd_json_variant_new_hex (~line 8775). Hex always; master's hex.DecodeString already matches.

  • parsePCRString: tpm2-pcrs is a JSON integer array per tpm2_make_pcr_json_array (~line 8642); the "10+13" form is the --tpm2-pcrs= CLI input, not the stored shape. Master decodes the field as []int directly.

  • Cross-slot probing in recoverTokenPassword — what's this a fix for? slotsToTry ends up as effectively d.Slots() ordered with t.Slots first, so a TPM2 token's secret gets PBKDF2'd against the FIDO2 slot, the passphrase slot, etc. Each wrong slot is a full PBKDF2 iteration (~1s), which adds boot latency on multi-token devices, and it inverts the scoping d4d28f9 made explicit — tokens unlock their own slots; the keyboard handles the rest.
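To make the stored shapes in the bullets above concrete, here is a minimal decoding sketch for a flat token. The struct `tpm2Token` and the function `parseToken` are hypothetical, cover only a subset of fields, and are not booster's actual code.

```go
package main

import (
	"encoding/hex"
	"encoding/json"
)

// tpm2Token is an illustrative subset of a flat systemd-tpm2 LUKS2 token.
type tpm2Token struct {
	Type       string   `json:"type"`
	Keyslots   []string `json:"keyslots"`
	Blob       string   `json:"tpm2-blob"`        // base64 text, left undecoded here
	PCRs       []int    `json:"tpm2-pcrs"`        // stored as a JSON integer array
	PolicyHash string   `json:"tpm2-policy-hash"` // stored as hex text
}

// parseToken unmarshals the flat token and hex-decodes its policy hash.
func parseToken(raw []byte) (tok tpm2Token, policy []byte, err error) {
	if err = json.Unmarshal(raw, &tok); err != nil {
		return tok, nil, err
	}
	policy, err = hex.DecodeString(tok.PolicyHash)
	return tok, policy, err
}
```

With this shape, a token whose tpm2-pcrs value is the CLI-style string "10+13" fails at json.Unmarshal, because a JSON string cannot be decoded into []int.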

Are these fixes for a broken token or boot? Do you have a cryptsetup luksDump --dump-json-metadata or the udev log for the TPM case that justifies these changes? I could be more specific if I understood the issue.

@enihcam
Author

enihcam commented May 28, 2026

Thanks for helping me understand the details. It looks like the PR is redundant, so let's close it for now.

@pilotstew
Contributor

pilotstew commented May 28, 2026

Thanks for working on this @enihcam, and sorry for the confusion: @bb10root's #304 was left hanging after I superseded it with #342. If you'd started from #304 before #342 landed, or were still on the v0.12 release, that overlap would have been easy to miss. TPM is a big part of the push to get v0.13 out. Unrelated to TPM, but #372 is accepted and I have a small messaging patch ready to go. Once those are merged I suspect @anatol will push v0.13 shortly after.

The 3→5s timeout is still worthwhile on its own; I may have coded that too tight.

If you want to test the current state in the meantime, the AUR booster-git package builds from master and includes the TPM2 fix.

Thanks again for taking a look at it. Always welcome more reviewers.

@anatol anatol closed this May 29, 2026
@anatol
Owner

anatol commented May 29, 2026

Thank you @enihcam for looking at booster improvements, we really appreciate your contributions.

Let us know if you see any issues with the current HEAD.

@enihcam enihcam deleted the fix/tpm2-unlock-dedup branch May 29, 2026 23:55

Development

Successfully merging this pull request may close these issues.

Unable to unlock root partition with tpm2 key

4 participants