-
Notifications
You must be signed in to change notification settings - Fork 13
Implement drop-safe struct_extensions builder #42
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
| let dst = get_src_dst(args); | ||
| let impl_generics = append_de_lifetime(&args.generics); | ||
| let (_, ty_generics, where_clause) = args.generics.split_for_impl(); | ||
| let builder_ident = format_ident!("{struct_ident}UninitBuilder"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
isn't polluting the namespace a concern here? should those builder be put into some wincode_gen mod-scope... just thinking, maybe it's not necessary
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think it's necessarily a problem because the struct name is suffixed with UninitBuilder, so unlikely to conflict. This feature is also opt-in, and the docs specify that it will be added to the same namespace, so I think it's okay personally.
| quote!(#builder_bit_set_ty::MAX) | ||
| } else { | ||
| let field_bits = LitInt::new(&fields.len().to_string(), Span::call_site()); | ||
| quote!(((1 as #builder_bit_set_ty) << #field_bits) - 1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I imagine this could be calculated as fully compile-time constant, right?
let field_init_bits = LitInt::new(((1u128 << fields.len() - 1).to_string(), Span::call_site());
quote!((#field_init_bits as #builder_bit_set_ty))There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It'll be constant folded by the compiler https://rust.godbolt.org/z/qaPE5Krbj
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok, if you think this makes the library code simpler, though with explicit calculation we could get rid of the branch with quote!(#builder_bit_set_ty::MAX) and just have one code always calculating the constant
kskalski
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good overall, though I do still have some concerns about naming for generated code functions.
wincode-derive/src/schema_read.rs
Outdated
| /// Assuming the source `MaybeUninit<T>` when the content is not yet fully initialized causes undefined behavior. | ||
| /// It is up to the caller to guarantee that the `MaybeUninit<T>` really is in an initialized state. | ||
| #[inline] | ||
| #vis const unsafe fn assume_init_forget(self) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure I like this name either... since it doesn't actually run the assume_init* on underlying MaybeUninit, which the name might suggest.
Maybe it should be something like finish_init_forget or init_done_forget ?
We could make it return the underlying &mut MaybeUnnit and call it into_inner_done, which might be convenient in some cases (this way the other into_assume_init_mut would basically be equivalent to into_inner_done().assume_init_mut()), but also be a bit confusing if user just calls it ignoring the result.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could technically just call this function forget and remove unsafe as a simple convenience over mem::forget. The user still has to assume_init_mut on the original reference or assume_init on the original MaybeUninit<T> to get access to T -- the contract of both of those methods requires that they ensure the underlying T is fully initialized.
I would also be fine with just unsafe finish. Finish implies that we're done with the builder, and the fact that it's unsafe requires the user to understand the safety assumptions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
finish seems fine, not sure it really needs unsafe, since it doesn't do anything unsafe by itself and the user should be aware of need to initialize all parts of memory when starting to use the builder in the first place. It's also a call that always needs to be made unless into_assume_init_mut is used, so forcing to add another unsafe block seems off.
I would be in fact more concerned with how to ensure user do not forget to call finished.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 for finish and no unsafe.
I don't think we need to ensure it's called?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cool, I'll call it finish. I like that
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we need to ensure it's called?
We actually do -- otherwise the drop logic will execute, which will call drop on all the fields. UB if the user then calls assume_init or any of the assume_init* variants
|
|
||
| /// Write a value to the maybe uninitialized field. | ||
| #[inline] | ||
| #vis const fn #write_uninit_field_ident(&mut self, val: #target) -> &mut Self { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the choice of read_field_name and write_field_name names is a bit confusing here (though I guess it's already inherited from previous design of struct_extensions), since read operates on Reader, while write doesn't have anything to do with wincode io Writer.
I would definitely prefer to change write into something like store_field_name or init_field_name, which would also convey the message that field is not manually initialized
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think write is fine, it matches the semantics of MaybeUninit::write
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a fair argument, there are some issues with that though - MaybeUninit also has _read method, which means something completely different than read generated for builder.
I think because in context of wincode functions doing read and write have their specific meaning, we should try to avoid naming things with different purpose that way.
builder
.read_a(reader)?
.read_b(reader)?
.write_c(test.c);is basically an instant head-scratcher... are we reading a few things into the builder and then writing out the builder contents to Writer stored in test.c ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've thought about this too in the past and I think I've discussed it with zach.
I think write should definitely be write, it's read I'm not sure about.
That said this PR is fixing a concrete bug, and I think we should land it. We
can improve the naming later imo, this is 0.2.2 we have a long way to 1.0
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think it's confusing if you look at the type signature or read the function documentation. Would you prefer read_a_from or write_from_reader?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
personally I would avoid the names that have overloaded meaning around the same API / snippets of code and in those cases would opt for init_a and init_a_from_reader.
alessandrod
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm approving as this fixes a bug. I do think the naming can probably be improved, but I don't have good proposals at the moment, don't think that calling write store is right, and just generally think we can improve the naming as we keep iterating
wincode-derive/src/schema_read.rs
Outdated
| /// Assuming the source `MaybeUninit<T>` when the content is not yet fully initialized causes undefined behavior. | ||
| /// It is up to the caller to guarantee that the `MaybeUninit<T>` really is in an initialized state. | ||
| #[inline] | ||
| #vis const unsafe fn assume_init_forget(self) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 for finish and no unsafe.
I don't think we need to ensure it's called?
|
|
||
| /// Write a value to the maybe uninitialized field. | ||
| #[inline] | ||
| #vis const fn #write_uninit_field_ident(&mut self, val: #target) -> &mut Self { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've thought about this too in the past and I think I've discussed it with zach.
I think write should definitely be write, it's read I'm not sure about.
That said this PR is fixing a concrete bug, and I think we should land it. We
can improve the naming later imo, this is 0.2.2 we have a long way to 1.0
kskalski
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, we should be open to naming adjustments in the future then. Otherwise LGTM
Background
The
struct_extensionsattribute, introduced inv0.1.0, provides convenience methods for manualSchemaReadimplementations. In particular, it provides the following methods on the derived struct:uninit_<field_name>_mut: Gets a mutableMaybeUninitprojection to the<field_name>slot.read_<field_name>: Reads into aMaybeUninit’s<field_name>slot from the given Reader.write_uninit_<field_name>: Writes aMaybeUninit’s<field_name>slot with the given value.While this does significantly reduce boilerplate, it doesn't do anything to help users with drop safety.
Consider the following example:
Because there is no tracking of initialized fields, an error in the middle of deserializing will not drop initialized fields. Specifically these lines from the above example:
This isn't a problem with
struct_extensionsper se, but rather an implicit challenge when manually initializingMaybeUninitstructs.This PR
This PR reworks
struct_extensionsto provide drop-safe "Builder" structs that take care of field initialization bookkeeping. In particular, it generates a struct that takes a mutable reference to theMaybeUninit<T>, and provides the same functionality as the original, but with automatic initialization tracking and aDropimplementation that drops all initialized fields. The same code as above, with the builder:This code is completely drop safe. If deserialization fails, the initialized fields are dropped when the builder is dropped.
builder.assume_init_forgetforgets the builder so that drop logic is no longer executed, so that we can subsequentlyassume_initon the underlyingMaybeUninit.There are additional affordances for more complicated cases, like multiple nested structs, for example.
This could alternatively be written like the following (rather than
init_<field>_withfunction), where nested struct is manually marked as initialized: