init/plymouth: replace plymouth CLI with direct socket IPC by pilotstew · Pull Request #358 · anatol/booster

pilotstew · 2026-05-07T03:19:34Z

Summary

Talk to plymouthd directly over its abstract Unix socket
(\x00/org/freedesktop/plymouthd) instead of fork-exec'ing the
plymouth CLI client. The wire protocol is documented in Plymouth's
ply-boot-protocol.h.

Carved into 5 commits for review:

1. init/plymouth: replace plymouth CLI with direct socket IPC.
   Add init/plysocket.go (transport) and swap the 5 exec call sites for
   ping / show-splash / display-message / quit / update-root-fs. Drop
   /usr/bin/plymouth from the initramfs (~80KB binary plus
   glib + libply-boot-client transitive deps). No call-site
   signature changes; plymouthAskPassword stays on the exec path
   for now and converts in commit 5 when ctx threading lands.
2. init/plymouth: redirect plymouthd stderr to /dev/kmsg.
   Inherited fds are closed (FD_CLOEXEC) when booster exec's to
   systemd, so a plymouthd that inherits booster's stderr receives
   EPIPE on its next stderr write and dies before
   plymouth-start.service can attach. Routing to /dev/kmsg
   survives the handoff and lands the diagnostic output in the
   kernel ring buffer.
3. init/plymouth: make plymouthMessage fire-and-forget.
   plymouthd is single-threaded; a synchronous in-process
   display-message call would block on splash state, slowing
   concurrent unlock work. Wrap in a goroutine.
4. init/plymouth: thread ctx through waitForPlymouthInit.
   Small signature change so unlock paths can bail when a sibling
   token has already cleared the volume, instead of blocking on
   plymouthd init for a volume that's already being unlocked.
5. init/plymouth: ctx-aware password prompt with cancel-on-hangup
   (the headline). plymouthAskPassword(ctx, prompt) plus a
   serialization mutex. On ctx cancel the underlying socket is closed;
   the goroutine returns cleanly. plymouthd builds whose
   connection-hangup handler tears down pending prompts also dismiss
   the on-screen UI on close; older builds leave the UI visible
   until the splash is otherwise cleared, but boot proceeds correctly
   either way (matching upstream's prior behaviour minus the orphaned
   plymouth subprocess and leaked goroutine the exec path produced).
   askPasswordWithFallback skips the console fallback when ctx is
   already cancelled, so an already-unlocking volume doesn't flash a
   stray console prompt.

Why

- Drops /usr/bin/plymouth and its transitive deps from the
  initramfs.
- Cancellation hygiene: a sibling-token-wins ctx cancel now closes
  the socket and the prompt goroutine returns immediately. The prior
  exec path left both the subprocess and the goroutine blocked
  indefinitely.
- Ctx-aware Plymouth path completes the cancellation arc started by
  #354 (init/luks: convert done channel to context.Context) and
  #356 (init/luks: extend ctx cancellation to FIDO2-PIN and TPM2-PIN
  prompts): keyboard, FIDO2-PIN, and TPM2-PIN prompts already
  cancel on autounlock; this extends the same to the splash prompt.

No new dependencies. Tests for splash messaging (#357) still pass
unchanged.

Upstream Plymouth dependency

The cancel-on-hangup behavior in commit 5 — dismissing the on-screen
prompt when the booster client disconnects — depends on plymouthd
having a connection-hangup handler that tears down pending prompts.
That fix is Plymouth MR !393,
currently awaiting upstream review (closes Plymouth issues #125 and
#126).

This PR does not require !393 to land first. Without it, booster
still cleans up its end of the socket on cancel — boot proceeds
correctly, the goroutine returns, no orphaned subprocess. The only
visible difference is a stale prompt that lingers on the splash until
plymouth quits later in boot. Pure visual polish, not a regression vs.
the exec path.

Once !393 is in users' plymouthd, the splash prompt also dismisses
cleanly, completing the UX that #354 / #355 / #356 brought to the
console side.

Test plan

- go build ./init ./generator clean
- TestPromptVolumeUnlocked and TestTokenFriendlyName pass
- Image builds with the smaller plymouth payload (no
  /usr/bin/plymouth in the cpio)
- Boot test on host with concurrent FIDO2 + TPM2 + keyboard
  unlock paths
- Boot test on a plymouthd build without the connection-hangup
  handler: confirm no behavioural regression vs. exec path

Talk to plymouthd directly over its abstract Unix socket instead of fork-exec'ing the plymouth CLI client for ping, show-splash, display-message, quit, and update-root-fs. The wire protocol is documented in Plymouth's ply-boot-protocol.h.

plysocket.go encapsulates dial / send / recv with three helpers:

- plymouthSendRecv(frame): raw frame, 1-byte response
- plymouthCmd(typ, arg): NUL-terminated argument frame, expects ACK
- plymouthPingOnce(): single ping, returns true on ACK

Drop /usr/bin/plymouth from the initramfs (~80KB binary plus glib + libply-boot-client transitive deps). plymouthd alone is sufficient now that init speaks the protocol directly.

No call-site signature changes; plymouthAskPassword stays on the exec path for now and converts in a later commit when ctx threading lands.
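For reviewers unfamiliar with the protocol, the request/response shape described above can be sketched roughly as follows. This is an illustration, not the code in init/plysocket.go: the names buildFrame, sendRecv, dialPlymouthd, and demoRoundTrip are hypothetical, and the frame layout (verb byte, 0x02 marker, one length byte, NUL-terminated argument) together with the 'P', 'M', and 0x06 values are assumptions based on a reading of Plymouth's client code that should be checked against ply-boot-protocol.h. The round trip runs against an in-memory stand-in, not a real plymouthd.

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"net"
)

// Assumed protocol values; verify against ply-boot-protocol.h.
const (
	verbPing    = 'P'  // assumed ping request byte
	verbMessage = 'M'  // assumed display-message request byte
	respACK     = 0x06 // assumed ACK response byte
)

// buildFrame encodes a request (assumed layout). Without an argument the
// frame is the verb plus a NUL; with one it is verb, 0x02, length of the
// argument including its NUL, the argument, and the NUL.
func buildFrame(verb byte, arg string) ([]byte, error) {
	if arg == "" {
		return []byte{verb, 0}, nil
	}
	if len(arg)+1 > 255 {
		return nil, errors.New("argument too long for a one-byte length")
	}
	frame := []byte{verb, 0x02, byte(len(arg) + 1)}
	frame = append(frame, arg...)
	return append(frame, 0), nil
}

// sendRecv writes one frame and reads the single response byte.
func sendRecv(conn net.Conn, frame []byte) (byte, error) {
	if _, err := conn.Write(frame); err != nil {
		return 0, err
	}
	var resp [1]byte
	if _, err := io.ReadFull(conn, resp[:]); err != nil {
		return 0, err
	}
	return resp[0], nil
}

// dialPlymouthd connects to the abstract socket named in this PR. On Linux,
// Go maps a leading '@' to the NUL byte that marks an abstract address.
func dialPlymouthd() (net.Conn, error) {
	return net.Dial("unix", "@/org/freedesktop/plymouthd")
}

// demoRoundTrip sends frame to an in-memory stand-in that always ACKs.
func demoRoundTrip(frame []byte) (byte, error) {
	client, server := net.Pipe()
	defer client.Close()
	go func() {
		defer server.Close()
		buf := make([]byte, 512)
		if _, err := server.Read(buf); err == nil {
			server.Write([]byte{respACK})
		}
	}()
	return sendRecv(client, frame)
}

func main() {
	frame, err := buildFrame(verbMessage, "Unlocking root")
	if err != nil {
		fmt.Println("encode failed:", err)
		return
	}
	resp, err := demoRoundTrip(frame)
	fmt.Printf("frame=%q resp=%#x err=%v\n", frame, resp, err)
}
```

Against a real daemon, the connection returned by dialPlymouthd would take the place of the pipe; a ping would then be buildFrame(verbPing, "") followed by a check that the response byte equals respACK.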

Inherited file descriptors are closed (FD_CLOEXEC) when booster exec's to systemd, so a plymouthd that inherits booster's stderr will receive EPIPE on its next stderr write and die — before systemd's plymouth-start.service can attach to the existing session. Open /dev/kmsg explicitly and assign it to cmd.Stderr so plymouthd's diagnostic output ends up in the kernel ring buffer (visible in journalctl -k post-boot) and survives the handoff to systemd.
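A minimal sketch of the redirect described above, assuming a hypothetical helper named newDaemonCmd (not a function in booster). It only prepares the command; the plymouthd arguments booster actually passes are not shown in this PR and are left out here.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
)

// newDaemonCmd prepares a command whose stderr is written to logPath.
// The caller should close the returned file after cmd.Start(); the child
// keeps its own duplicate of the descriptor as fd 2.
func newDaemonCmd(logPath, bin string, args ...string) (*exec.Cmd, *os.File, error) {
	logFile, err := os.OpenFile(logPath, os.O_WRONLY, 0)
	if err != nil {
		return nil, nil, err
	}
	cmd := exec.Command(bin, args...)
	cmd.Stderr = logFile
	return cmd, logFile, nil
}

func main() {
	// Opening /dev/kmsg for writing normally requires root.
	cmd, logFile, err := newDaemonCmd("/dev/kmsg", "plymouthd")
	if err != nil {
		fmt.Println("cannot open log target:", err)
		return
	}
	defer logFile.Close()
	fmt.Println("prepared", cmd.Path, "with stderr on", logFile.Name())
}
```

Because the child's stderr is a descriptor on /dev/kmsg itself and not a pipe back to the parent, nothing on the parent side has to stay alive for writes to keep succeeding.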

Wrap the display-message socket call in a goroutine. plymouthd is single-threaded and can stall during render setup or while a password prompt is on screen; a synchronous in-process call would block the calling goroutine on splash state, slowing concurrent unlock work.
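The fire-and-forget shape is small enough to sketch. The helper name messageAsync and its send/logf parameters are hypothetical; the real plymouthMessage may be structured differently.

```go
package main

import (
	"fmt"
	"log"
	"time"
)

// messageAsync runs send in its own goroutine so the caller never waits on
// the daemon. Failures are only logged, since nobody is waiting for them.
func messageAsync(send func(string) error, text string, logf func(string, ...any)) {
	go func() {
		if err := send(text); err != nil {
			logf("display-message failed: %v", err)
		}
	}()
}

func main() {
	done := make(chan struct{})
	slowSend := func(text string) error {
		time.Sleep(50 * time.Millisecond) // stand-in for a stalled daemon
		close(done)
		return nil
	}
	start := time.Now()
	messageAsync(slowSend, "Unlocking root", log.Printf)
	fmt.Println("caller resumed after", time.Since(start))
	<-done // only so the demo does not exit before the send finishes
}
```

One consequence of this pattern worth keeping in mind: two messages sent back to back each get their own goroutine, so their arrival order at the daemon is not guaranteed.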

Change waitForPlymouthInit() to take a ctx and return an error so unlock paths can bail when a sibling token has already cleared the volume — instead of blocking on plymouthd startup for a volume that's already being unlocked. requestKeyboardPassword previously waited unconditionally on plymouth-init then re-checked ctx; merging the two into a single ctx-aware select removes the redundant check and lets cancellation propagate cleanly through the wait.
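The single ctx-aware select mentioned above looks roughly like this. It is a sketch under assumptions: waitForInit and the ready channel are hypothetical stand-ins, and how booster actually signals that plymouthd has finished starting is not shown here.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// waitForInit blocks until ready is closed or ctx is cancelled,
// whichever happens first.
func waitForInit(ctx context.Context, ready <-chan struct{}) error {
	select {
	case <-ready:
		return nil
	case <-ctx.Done():
		return ctx.Err()
	}
}

// runDemo exercises both outcomes: init completes, and a sibling cancels.
func runDemo() (readyErr, cancelErr error) {
	ready := make(chan struct{})
	close(ready)
	readyErr = waitForInit(context.Background(), ready)

	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()
	time.AfterFunc(10*time.Millisecond, cancel) // sibling token wins
	cancelErr = waitForInit(ctx, make(chan struct{}))
	return readyErr, cancelErr
}

func main() {
	readyErr, cancelErr := runDemo()
	fmt.Println("init finished:", readyErr)
	fmt.Println("cancelled while waiting:", cancelErr)
}
```

Callers can then treat a non-nil return as "stop, someone else already unlocked this volume" without a separate ctx check after the wait.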

plymouthAskPassword now takes a ctx. On cancellation the underlying socket is closed; the in-flight read returns and the goroutine exits cleanly, holding no resources from the dropped prompt.

In plymouthd builds whose connection-hangup handler tears down pending prompts, the daemon also dismisses the on-screen prompt UI on the same socket close, completing the UX. Older plymouthd builds leave the prompt UI visible on the splash until something else clears it; boot still proceeds correctly, matching upstream's prior behaviour minus the orphaned plymouth subprocess and leaked goroutine the exec path produced.

Serialize calls under plymouthPasswordMu so concurrent unlock goroutines don't stack two prompts on the splash. The mutex pairs with a re-check of ctx.Err() after acquire to skip our prompt entirely if the volume was unlocked while we were waiting on the lock. This avoids flashing a UI for an already-unlocking volume.

askPasswordWithFallback honors ctx cancellation by returning ctx.Err() without falling back to the console reader. Falling back would print a prompt to /dev/console that the LUKS unlock loop has already abandoned, which is at best confusing and at worst lets the user type a passphrase that gets discarded.
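The interplay of the mutex, the ctx re-check, close-on-cancel, and the skipped console fallback can be sketched as below. This is not the PR's implementation: askPassword, askWithFallback, ctxOr, promptMu, and runDemo are hypothetical names, the connection is passed in instead of dialed, the request bytes are an opaque placeholder, and the daemon is an in-memory stand-in that never answers.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"net"
	"sync"
	"time"
)

// promptMu keeps concurrent unlock goroutines from stacking prompts.
var promptMu sync.Mutex

// ctxOr prefers the context error when cancellation caused the I/O failure.
func ctxOr(ctx context.Context, err error) error {
	if ctxErr := ctx.Err(); ctxErr != nil {
		return ctxErr
	}
	return err
}

// askPassword sends request and waits for a reply, closing conn when ctx is
// cancelled so that the blocked Read returns.
func askPassword(ctx context.Context, conn net.Conn, request []byte) ([]byte, error) {
	promptMu.Lock()
	defer promptMu.Unlock()
	// The volume may have been unlocked while we waited for the mutex.
	if err := ctx.Err(); err != nil {
		return nil, err
	}
	stop := make(chan struct{})
	defer close(stop)
	go func() {
		select {
		case <-ctx.Done():
			conn.Close() // hang up; unblocks the Read below
		case <-stop:
		}
	}()
	if _, err := conn.Write(request); err != nil {
		return nil, ctxOr(ctx, err)
	}
	buf := make([]byte, 256)
	n, err := conn.Read(buf)
	if err != nil {
		return nil, ctxOr(ctx, err)
	}
	return buf[:n], nil
}

// askWithFallback uses the console only when the splash prompt failed for a
// reason other than cancellation.
func askWithFallback(ctx context.Context, conn net.Conn, request []byte,
	console func() ([]byte, error)) ([]byte, error) {
	pw, err := askPassword(ctx, conn, request)
	if err == nil {
		return pw, nil
	}
	if ctxErr := ctx.Err(); ctxErr != nil {
		return nil, ctxErr // abandoned prompt: leave the console alone
	}
	return console()
}

// runDemo cancels a prompt that a silent stand-in daemon never answers.
func runDemo() (consoleUsed bool, err error) {
	client, server := net.Pipe()
	defer client.Close()
	defer server.Close()
	go func() {
		buf := make([]byte, 64)
		server.Read(buf) // consume the request, never reply
	}()
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()
	time.AfterFunc(20*time.Millisecond, cancel) // sibling token wins
	_, err = askWithFallback(ctx, client, []byte("request"), func() ([]byte, error) {
		consoleUsed = true
		return nil, errors.New("console unavailable")
	})
	return consoleUsed, err
}

func main() {
	used, err := runDemo()
	fmt.Printf("err=%v consoleUsed=%v\n", err, used)
}
```

Closing the connection from a watcher goroutine is what turns a ctx cancel into an unblocked Read; the stop channel lets that watcher exit on the normal path so it does not outlive the call.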

Address review feedback on PR anatol#358: add a file-level reference covering the upstream protocol header, frame format, the full verb table (with the six booster uses called out), and server response bytes.

anatol · 2026-05-07T19:42:30Z

Thank you very much for this change!

Adds a new NOTES subsection covering the concurrent-unlock model that landed across PRs anatol#350, anatol#353, anatol#355, anatol#356, anatol#357, anatol#358, and anatol#362: PIN-token serialization in ascending LUKS2 token-ID order, cancel-on-win semantics for keyboard/FIDO2-PIN/TPM2-PIN prompts on both the console and the Plymouth splash (with the MR !393 caveat for older Plymouth builds), and the per-token 3-attempt PIN cap with empty-PIN skip. Trims two paragraphs from the existing 'Password entry' subsection (auto-dismiss and PIN attempts) now that the new section covers them in fuller context. 'Password entry' keeps the Ctrl+W / Ctrl+U / Tab edit-key reference.

pilotstew added 5 commits May 6, 2026 22:15

anatol reviewed May 7, 2026

Comment thread init/plysocket.go

init/plysocket: document the boot protocol surface

970172d

anatol merged commit 629d841 into anatol:master May 7, 2026

anatol mentioned this pull request May 7, 2026

Slow unlock and passphrase request remains during boot latchset/clevis#150

Open

pilotstew deleted the pr/plymouth-socket-ipc branch May 9, 2026 02:53

pilotstew mentioned this pull request May 14, 2026

docs/manpage: restructure root discovery/crypttab and add worked examples #363

Merged

anatol merged 6 commits into anatol:master from pilotstew:pr/plymouth-socket-ipc
