-
Notifications
You must be signed in to change notification settings - Fork 338
fix(go): adjustments to the sliceSerializer #2760
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
@lzaeh could you merge main branch first? there are some conflict |
go/fory/primitive_array.go
Outdated
| r = reflect.New(type_).Elem() | ||
|
|
||
| case reflect.Interface: | ||
| arrT := reflect.ArrayOf(length, reflect.TypeOf(int8(0))) // [length]int8 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we can't use [n]T for deserialization, the n used by user is fixed at compile-time, but the array length can be different. We should use slice []numeric for primitive array type in fory type system.
For golang serialization, we just don't allow [n]T being deserialized. Or only allow [n]T being serialized, but this seems useless. If you can't deserialzie it, serialize it seems unmeaningful too
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok, thanks for telling me this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we can't use
[n]Tfor deserialization, thenused by user is fixed at compile-time, but the array length can be different. We should use slice[]numericfor primitive array type in fory type system.For golang serialization, we just don't allow
[n]Tbeing deserialized. Or only allow[n]Tbeing serialized, but this seems useless. If you can't deserialzie it, serialize it seems unmeaningful too
Sorry for the delay—I just wrapped up my graduation stuff. If all arrays are deserialized into slices, should arrays inside structs also be deserialized as slices? If so, I think the tests will need to be updated as well. I was considering this while planning comprehensive tests for the contents of commonArray.@chaokunyang
i will try asap tomorrow, sorry for the delay. |
597f2ae to
390b623
Compare
Scope of this changeThis change removes array support only along the usage path, reflected in:
What I did not changeI’m unsure whether we should purge array-related constructs from the type system itself. Fully removing them would touch a lot of code. Keeping the definitions around but unused might serve as a transitional/contingency option, even if arrays are effectively deprecated right now. Cross-language test adjustmentsFor TestXLangSerializer: since other languages may interact with the same unit as Python, I removed the array-related code paths on both the Go and Python sides within this test. I then added a separate test_cross_language_serializer_go case in Python’s cross-language tests so that Go still covers this part. If this setup isn’t appropriate, please let me know. Remaining work (Relatively higher-priority tasks, Just my opinion)Update the struct field ordering logic accordingly. After updating the Python module, two struct tests failed due to hash mismatches. I haven’t changed the handling of complex structs, which means the code path for fields that are arrays within complex structs remains untouched—and this test includes arrays. If I tweak just one spot now, it’ll be hard to localize the issue. It’s better to make the corresponding changes after I spend time learning the new sorting/ordering optimization algorithm, so we can update everything in sync. |
Recently, we have updated the hash calculation algorithm. Maybe this PR(#2814) can solve this problem |
|
|
||
|
|
||
| @cross_language_test | ||
| def test_cross_language_serializer_go(data_file_path): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why need a new test for go specially?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should use same tests as java used. Otherwise, we still don't get full xlang serialization features
|
Could you refer to both java and rust implementation? In fory rust, both array and vec are supported, they can all be serialized as a list, and deserialize a list into target type. We should do similiar thing in go too. For fory rust, you can refer to |
Sorry—I was tied up with some real-life errands and didn’t see this message in time. I’ve implemented a version along the lines you suggested by removing the array serializers. Details: (1) Primitive arrays: unless the user provides a specific destination type (only array or slice are considered valid), deserialization checks the type and length and deserializes accordingly; otherwise (e.g., when the target is interface{}) we always deserialize to a slice. (2) I think the support in (1) is still necessary. While we don’t recommend arrays to users, if a struct field is defined as an array (rather than interface{}), deserialization will work fine because of (1). (3) One confusing point: the complex-struct tests had been failing for a while. For various reasons I kept pulling the latest code and reinstalling pyfory. I did the same this time, and before I had a chance to dig into the complex-struct issue, the tests started passing again. I only opened GitHub after finishing this part of the code—bad workflow on my side T_T . I’ll follow your suggestions going forward. Thanks! |
|
Some files only appeared as modified when I synced to the new version—I didn’t actually change them. They won't be treated as new code during the merge. I’ll keep studying those two files and figure out suitable approach for Go. This version is only based on earlier discussion and hasn’t yet incorporated learnings from the other language implementations. :( |
Why?
Previously, my understanding of the type system was insufficient, which led to confusion between array and slice usage. As a result, the deserialization step of the slice serializer did not work properly in cross-language scenarios due to missing alignment rules (for example, when the types were structurally identical and should have taken the fast path, the previous version failed to correctly interpret the byte stream).
What does this PR do?
Serializer-related changes:
primitive_array.go. If you think it’s not appropriate, I can revert the change.Test-related changes:
Rationale:
Known Issues
After pulling the latest code (which includes modifications related to field order hashing in structs), the complex struct tests that passed 10 mins ago now fail due to Python-side deserialization errors.
I will follow up on this issue to determine whether the cause lies in the list serializer or incomplete handling of struct field order consistency.
Cross-language list serialization tests (serializing in one language, deserializing in another—Python ↔ Go, Java ↔ Go) revealed that Go’s int and float64 types cannot be correctly deserialized in the other languages.
The slice serializer currently mimics Python’s fast-path logic, but the matching rules for integer and floating-point types are somewhat ambiguous. My understanding here may be incomplete, and this part likely needs further refinement.
Related issues
Does this PR introduce any user-facing change?
Benchmark