Implement drop-safe struct_extensions builder #42

cpubot · 2025-12-01T20:03:12Z

Background

The struct_extensions attribute, introduced in v0.1.0, provides convenience methods for manual SchemaRead implementations. In particular, it provides the following methods on the derived struct:

uninit_<field_name>_mut: Gets a mutable MaybeUninit projection to the <field_name> slot.
read_<field_name>: Reads into a MaybeUninit’s <field_name> slot from the given Reader.
write_uninit_<field_name>: Writes a MaybeUninit’s <field_name> slot with the given value.

While this does significantly reduce boilerplate, it doesn't do anything to help users with drop safety.

Consider the following example:

#[derive(SchemaRead, SchemaWrite)]
#[wincode(struct_extensions)]
struct Bytes {
    bytes: Vec<u8>,
    ar: Vec<[u8; 32]>,
}

struct Data {
    bytes: Bytes,
}

impl<'de> SchemaRead<'de> for Data {
    type Dst = Data;

    fn read(reader: &mut impl Reader<'de>, dst: &mut MaybeUninit<Self::Dst>) -> ReadResult<()> {
        let mut bytes = MaybeUninit::<Bytes>::uninit();
        Bytes::read_bytes(reader, &mut bytes)?;
        Bytes::read_ar(reader, &mut bytes)?;
        let init = unsafe { bytes.assume_init() };
        dst.write(Data { bytes: init });
        Ok(())
    }
}

Because there is no tracking of initialized fields, an error in the middle of deserializing will not drop initialized fields. Specifically these lines from the above example:

Bytes::read_bytes(reader, &mut bytes)?;
// If the following line errors, the above Vec<u8> leaks
Bytes::read_ar(reader, &mut bytes)?;

This isn't a problem with struct_extensions per se, but rather an implicit challenge when manually initializing MaybeUninit structs.

This PR

This PR reworks struct_extensions to provide drop-safe "Builder" structs that take care of field initialization bookkeeping. In particular, it generates a struct that takes a mutable reference to the MaybeUninit<T>, and provides the same functionality as the original, but with automatic initialization tracking and a Drop implementation that drops all initialized fields. The same code as above, with the builder:

#[derive(SchemaRead, SchemaWrite)]
#[wincode(struct_extensions)]
struct Bytes {
    bytes: Vec<u8>,
    ar: Vec<[u8; 32]>,
}

struct Data {
    bytes: Bytes,
}

impl<'de> SchemaRead<'de> for Data {
    type Dst = Data;

    fn read(reader: &mut impl Reader<'de>, dst: &mut MaybeUninit<Self::Dst>) -> ReadResult<()> {
        let mut bytes = MaybeUninit::<Bytes>::uninit();
        let mut bytes_builder = BytesUninitBuilder::from_maybe_uninit_mut(&mut bytes);
        bytes_builder.read_bytes(reader)?;
        bytes_builder.read_ar(reader)?;
        bytes_builder.finish()
        let init = unsafe { bytes.assume_init() };
        dst.write(Data { bytes: init });
        Ok(())
    }
}

This code is completely drop safe. If deserialization fails, the initialized fields are dropped when the builder is dropped. builder.assume_init_forget forgets the builder so that drop logic is no longer executed, so that we can subsequently assume_init on the underlying MaybeUninit.

There are additional affordances for more complicated cases, like multiple nested structs, for example.

#[derive(SchemaRead, SchemaWrite)]
#[wincode(struct_extensions)]
struct Bytes {
    bytes: Vec<u8>,
    ar: Vec<[u8; 32]>,
}

#[derive(SchemaWrite, SchemaRead)]
#[wincode(struct_extensions)]
struct Data {
    bytes: Bytes,
    other_byte_vec: Vec<u8>,
}

struct Outer {
    data: Data,
}

impl<'de> SchemaRead<'de> for Outer {
    type Dst = Outer;

    fn read(reader: &mut impl Reader<'de>, dst: &mut MaybeUninit<Self::Dst>) -> ReadResult<()> {
        let data = unsafe { &mut *(&raw mut (*dst.as_mut_ptr()).data).cast::<MaybeUninit<Data>>() };
        let mut data_builder = DataUninitBuilder::from_maybe_uninit_mut(data);
        unsafe {
            // Marks `bytes` field of `Data` as initialized if the function succeeds.
            data_builder.init_bytes_with(|bytes| {
                // Individual fields of `Bytes` are marked as initialized as reading progresses.
                // Initialized fields are dropped in reverse order on error or panic.
                let mut bytes_builder = BytesUninitBuilder::from_maybe_uninit_mut(bytes);
                bytes_builder.read_bytes(reader)?.read_ar(reader)?;
                bytes_builder.finish();
                Ok(())
            })?;
        }
        data_builder.read_other_byte_vec(reader)?;
        data_builder.finish();
        // Outer now fully initialized.
        Ok(())
    }
}

This could alternatively be written like the following (rather than init_<field>_with function), where nested struct is manually marked as initialized:

#[derive(SchemaRead, SchemaWrite)]
#[wincode(struct_extensions)]
struct Bytes {
    bytes: Vec<u8>,
    ar: Vec<[u8; 32]>,
}

#[derive(SchemaWrite, SchemaRead)]
#[wincode(struct_extensions)]
struct Data {
    bytes: Bytes,
    other_byte_vec: Vec<u8>,
}

struct Outer {
    data: Data,
}

impl<'de> SchemaRead<'de> for Outer {
    type Dst = Outer;

    fn read(reader: &mut impl Reader<'de>, dst: &mut MaybeUninit<Self::Dst>) -> ReadResult<()> {
        let data = unsafe { &mut *(&raw mut (*dst.as_mut_ptr()).data).cast::<MaybeUninit<Data>>() };
        let mut data_builder = DataUninitBuilder::from_maybe_uninit_mut(data);
        // Project `bytes` field of `Data` into a `&mut MaybeUninit<Bytes>`.
        let mut bytes_builder =
            BytesUninitBuilder::from_maybe_uninit_mut(data_builder.uninit_bytes_mut());
        // Individual fields of `Bytes` are marked as initialized as reading progresses.
        // Initialized fields are dropped in reverse order on error or panic.
        bytes_builder.read_bytes(reader)?.read_ar(reader)?;
        // `Bytes` is fully initialized, so we forget the builder.
        bytes_builder.finish();

        // `bytes` field of `Data` can now be marked as initialized
        // so that it's dropped if the next field fails to read.
        unsafe { data_builder.assume_init_bytes() };

        data_builder.read_other_byte_vec(reader)?;

        // `Data` is fully initialized, so we forget the builder.
        data_builder.finish();
        // Outer now fully initialized.
        Ok(())
    }
}

wincode/src/schema/mod.rs

wincode-derive/src/schema_read.rs

kskalski · 2025-12-02T02:06:35Z

wincode-derive/src/schema_read.rs

    let dst = get_src_dst(args);
    let impl_generics = append_de_lifetime(&args.generics);
    let (_, ty_generics, where_clause) = args.generics.split_for_impl();
+    let builder_ident = format_ident!("{struct_ident}UninitBuilder");


isn't polluting the namespace a concern here? should those builder be put into some wincode_gen mod-scope... just thinking, maybe it's not necessary

I don't think it's necessarily a problem because the struct name is suffixed with UninitBuilder, so unlikely to conflict. This feature is also opt-in, and the docs specify that it will be added to the same namespace, so I think it's okay personally.

wincode-derive/src/schema_read.rs

kskalski · 2025-12-03T00:00:33Z

wincode-derive/src/schema_read.rs

+            quote!(#builder_bit_set_ty::MAX)
+        } else {
+            let field_bits = LitInt::new(&fields.len().to_string(), Span::call_site());
+            quote!(((1 as #builder_bit_set_ty) << #field_bits) - 1)


I imagine this could be calculated as fully compile-time constant, right?

let field_init_bits = LitInt::new(((1u128 << fields.len() - 1).to_string(), Span::call_site()); quote!((#field_init_bits as #builder_bit_set_ty))

It'll be constant folded by the compiler https://rust.godbolt.org/z/qaPE5Krbj

ok, if you think this makes the library code simpler, though with explicit calculation we could get rid of the branch with quote!(#builder_bit_set_ty::MAX) and just have one code always calculating the constant

kskalski

Looks good overall, though I do still have some concerns about naming for generated code functions.

kskalski · 2025-12-03T00:13:50Z

wincode-derive/src/schema_read.rs

+                /// Assuming the source `MaybeUninit<T>` when the content is not yet fully initialized causes undefined behavior.
+                /// It is up to the caller to guarantee that the `MaybeUninit<T>` really is in an initialized state.
+                #[inline]
+                #vis const unsafe fn assume_init_forget(self) {


Not sure I like this name either... since it doesn't actually run the assume_init* on underlying MaybeUninit, which the name might suggest.
Maybe it should be something like finish_init_forget or init_done_forget ?

We could make it return the underlying &mut MaybeUnnit and call it into_inner_done, which might be convenient in some cases (this way the other into_assume_init_mut would basically be equivalent to into_inner_done().assume_init_mut()), but also be a bit confusing if user just calls it ignoring the result.

We could technically just call this function forget and remove unsafe as a simple convenience over mem::forget. The user still has to assume_init_mut on the original reference or assume_init on the original MaybeUninit<T> to get access to T -- the contract of both of those methods requires that they ensure the underlying T is fully initialized.

I would also be fine with just unsafe finish. Finish implies that we're done with the builder, and the fact that it's unsafe requires the user to understand the safety assumptions.

finish seems fine, not sure it really needs unsafe, since it doesn't do anything unsafe by itself and the user should be aware of need to initialize all parts of memory when starting to use the builder in the first place. It's also a call that always needs to be made unless into_assume_init_mut is used, so forcing to add another unsafe block seems off.

I would be in fact more concerned with how to ensure user do not forget to call finished.

+1 for finish and no unsafe.

I don't think we need to ensure it's called?

Cool, I'll call it finish. I like that

I don't think we need to ensure it's called?

We actually do -- otherwise the drop logic will execute, which will call drop on all the fields. UB if the user then calls assume_init or any of the assume_init* variants

kskalski · 2025-12-03T00:20:26Z

wincode-derive/src/schema_read.rs

+
+            /// Write a value to the maybe uninitialized field.
+            #[inline]
+            #vis const fn #write_uninit_field_ident(&mut self, val: #target) -> &mut Self {


I think the choice of read_field_name and write_field_name names is a bit confusing here (though I guess it's already inherited from previous design of struct_extensions), since read operates on Reader, while write doesn't have anything to do with wincode io Writer.

I would definitely prefer to change write into something like store_field_name or init_field_name, which would also convey the message that field is not manually initialized

I think write is fine, it matches the semantics of MaybeUninit::write

That's a fair argument, there are some issues with that though - MaybeUninit also has _read method, which means something completely different than read generated for builder.
I think because in context of wincode functions doing read and write have their specific meaning, we should try to avoid naming things with different purpose that way.

builder .read_a(reader)? .read_b(reader)? .write_c(test.c);

is basically an instant head-scratcher... are we reading a few things into the builder and then writing out the builder contents to Writer stored in test.c ?

I've thought about this too in the past and I think I've discussed it with zach.
I think write should definitely be write, it's read I'm not sure about.

That said this PR is fixing a concrete bug, and I think we should land it. We
can improve the naming later imo, this is 0.2.2 we have a long way to 1.0

I don't think it's confusing if you look at the type signature or read the function documentation. Would you prefer read_a_from or write_from_reader?

personally I would avoid the names that have overloaded meaning around the same API / snippets of code and in those cases would opt for init_a and init_a_from_reader.

alessandrod

I'm approving as this fixes a bug. I do think the naming can probably be improved, but I don't have good proposals at the moment, don't think that calling write store is right, and just generally think we can improve the naming as we keep iterating

alessandrod · 2025-12-04T02:06:03Z

wincode-derive/src/schema_read.rs

+                /// Assuming the source `MaybeUninit<T>` when the content is not yet fully initialized causes undefined behavior.
+                /// It is up to the caller to guarantee that the `MaybeUninit<T>` really is in an initialized state.
+                #[inline]
+                #vis const unsafe fn assume_init_forget(self) {


+1 for finish and no unsafe.

I don't think we need to ensure it's called?

alessandrod · 2025-12-04T02:09:35Z

wincode-derive/src/schema_read.rs

+
+            /// Write a value to the maybe uninitialized field.
+            #[inline]
+            #vis const fn #write_uninit_field_ident(&mut self, val: #target) -> &mut Self {


I've thought about this too in the past and I think I've discussed it with zach.
I think write should definitely be write, it's read I'm not sure about.

That said this PR is fixing a concrete bug, and I think we should land it. We
can improve the naming later imo, this is 0.2.2 we have a long way to 1.0

kskalski

Ok, we should be open to naming adjustments in the future then. Otherwise LGTM

Implement drop-safe struct_extensions builder

38a8131

cpubot requested a review from alessandrod December 1, 2025 20:03

kskalski reviewed Dec 2, 2025

View reviewed changes

Add is_init method to builder

38323b1

cpubot mentioned this pull request Dec 2, 2025

Implement drop guard for wincode VersionedMessage impl anza-xyz/agave#9366

Merged

assume_init_mut -> into_assume_init_mut

f3b6751

mergify bot mentioned this pull request Dec 2, 2025

v3.1: Implement drop guard for wincode VersionedMessage impl (backport of #9366) anza-xyz/agave#9369

Merged

kskalski reviewed Dec 3, 2025

View reviewed changes

alessandrod previously approved these changes Dec 4, 2025

View reviewed changes

assume_init_forget -> finish

d071082

cpubot dismissed alessandrod’s stale review via d071082 December 4, 2025 02:15

alessandrod previously approved these changes Dec 4, 2025

View reviewed changes

kskalski previously approved these changes Dec 4, 2025

View reviewed changes

Update docs to use updated method names

6975db9

cpubot dismissed stale reviews from kskalski and alessandrod via 6975db9 December 4, 2025 15:31

ensure we don't overflow with left shifts

1e0d30e

bw-solana approved these changes Dec 4, 2025

View reviewed changes

cpubot merged commit 2e64626 into master Dec 4, 2025
2 checks passed

cpubot deleted the struct_extensions_builder branch December 4, 2025 17:04

cpubot mentioned this pull request Dec 4, 2025

Add wincode support for Entry components anza-xyz/solana-sdk#461

Merged

Implement drop-safe struct_extensions builder #42

Implement drop-safe struct_extensions builder #42

Uh oh!

Conversation

cpubot commented Dec 1, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Background

This PR

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

kskalski left a comment

Choose a reason for hiding this comment

Uh oh!

kskalski Dec 3, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

kskalski Dec 4, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

cpubot Dec 4, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

alessandrod left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

kskalski left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants

cpubot commented Dec 1, 2025 •

edited

Loading

kskalski Dec 3, 2025 •

edited

Loading

kskalski Dec 4, 2025 •

edited

Loading

cpubot Dec 4, 2025 •

edited

Loading