Skip to content

Replace eyre catch-alls with structured error types in application crates#1761

Open
Ericson2314 wants to merge 1 commit into
NixOS:masterfrom
obsidiansystems:error-types-2
Open

Replace eyre catch-alls with structured error types in application crates#1761
Ericson2314 wants to merge 1 commit into
NixOS:masterfrom
obsidiansystems:error-types-2

Conversation

@Ericson2314

@Ericson2314 Ericson2314 commented May 21, 2026

Copy link
Copy Markdown
Member

See commit message

Comment thread subprojects/hydra-builder/src/grpc.rs Outdated
.tls_config(tls)?
.connect()
.await
.context("Failed to establish connection with Channel")?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we loose error context in cases like this where you prune the error message? I found it quiet useful to have errors at every step.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I think the WithContext thing needs to be used in more places.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The solution is to divide errors by cause, not type. The information that would be put in context should be put in the message with #[error("message")], and the variant should include the relevant source with #[source].

current_bytes: fs_err::read_to_string(cgroups_path.join("memory.current"))?
.trim()
.parse()
.context("memory current parsing failed")?,

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same question here. I haven't read the whole error hierarchy to understand what impact this has on error messages.

@Mic92

Mic92 commented May 22, 2026

Copy link
Copy Markdown
Member

Every function now returns the narrowest error type that covers its actual failure modes, rather than a catch-all anyhow::Error. This makes it possible to distinguish infrastructure failures (DB down, store unreachable) from logic errors (step not found in queue, drv lookup miss) at the type level.

Among other benefits, this will assist the "fail-fast" PR in deciding what is fatal vs recoverable, and making sure those decisions are consistent and enforced.

Error types are co-located with the functions that produce them, using split impl State blocks where needed. The fine-grained sub-enums (e.g. ResolutionError, StepLookupError, MachineLookupError, DrvLookupError) might seem like overkill in a single 2500-line state/mod.rs, but they are designed so that a future file split (e.g. state/resolution.rs, state/step_completion.rs) would give each sub-module its own natural error type, with StateLogicError in mod.rs just re-exporting them. The error hierarchy becomes the module hierarchy.

Similarly, the builder's per-phase error types (ImportError, UploadError, BuildOutputParseError, ResultInfoError) are each defined next to the free functions they belong to, and JobFailure ties them together by build phase. A future split into import.rs, upload.rs, etc. would be straightforward.

New error-context crate provides WithContext<E>, a recursive enum (Root(E) | Context { context, source }) that adds context layers without changing the type — a typed replacement for anyhow::context(). StateError is a newtype around WithContext<StateErrorKind>.

anyhow is removed from workspace dependencies entirely. Examples use it as a non-workspace [dev-dependencies] only.

Could you write pull request messages yourself, if you expect me to review them?

@Ericson2314 Ericson2314 marked this pull request as draft May 27, 2026 18:10
@Ericson2314 Ericson2314 force-pushed the error-types-2 branch 3 times, most recently from bc36c74 to f4fb0fc Compare June 11, 2026 00:00
@Ericson2314 Ericson2314 changed the title Replace anyhow with structured error types in application crates Replace eyre catch-alls with structured error types in application crates Jun 11, 2026
…crates

The previous commits moved the application crates off `anyhow` onto
`eyre`/`color-eyre`, but only at the reporting boundary — most function
signatures still returned a catch-all `eyre::Result`. Now every function
returns the narrowest error type that covers its actual failure modes.
This makes it possible to distinguish infrastructure failures (DB down,
store unreachable) from logic errors (step not found in queue, drv
lookup miss) at the type level.

Among other benefits, this will assist the "fail-fast" PR in deciding
what is fatal vs recoverable, and making sure those decisions are
consistent and enforced.

Error types are co-located with the functions that produce them, using
split `impl State` blocks where needed. The fine-grained sub-enums (e.g.
`ResolutionError`, `StepLookupError`, `MachineLookupError`,
`DrvLookupError`) might seem like overkill in a single 2500-line
`state/mod.rs`, but they are designed so that a future file split (e.g.
`state/resolution.rs`, `state/step_completion.rs`) would give each
sub-module its own natural error type, with `StateLogicError` in
`mod.rs` just re-exporting them. The error hierarchy becomes the module
hierarchy.

Similarly, the builder's per-phase error types (`BuildError`,
`ImportError`, `UploadError`, `BuildOutputParseError`,
`ResultInfoError`) are each defined next to the free functions they
belong to, and `JobFailure` ties them together by build phase. A future
split into `import.rs`, `upload.rs`, etc. would be straightforward.

In the same spirit, `NarInfoError::InvalidValue` in `binary-cache` no
longer erases its cause behind a `Box<dyn Error>`; a new
`InvalidValueError` enum names the two concrete parse failures
(`ParseStorePathError`, `ParseIntError`) that can actually occur.

`color-eyre` remains only at the binary boundary: `main` still returns
`color_eyre::Result` so fatal errors render with the full source chain,
and a few `BuilderError` variants (`LoadNixConfig`,
`ReadingSystemInfo`, `HandlingRequest`) still carry `eyre::Report`
sources where the underlying code has not been typed yet. Examples
switch from `anyhow` to `eyre` (non-workspace `[dev-dependencies]`),
keeping `anyhow` out of the dependency tree.
@Ericson2314 Ericson2314 marked this pull request as ready for review June 16, 2026 17:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants