fix: use deterministic proto marshalling in SetResponse and RegisterTrigger#1960
fix: use deterministic proto marshalling in SetResponse and RegisterTrigger#1960
Conversation
…rigger
anypb.New() uses default proto.Marshal() which does not guarantee
deterministic byte output for messages containing map fields, because
Go iterates maps in random order. In distributed quorum-based systems,
this causes nodes to produce different request hashes for logically
identical data, preventing quorum from being reached.
Replace anypb.New() with anypb.MarshalFrom() using
proto.MarshalOptions{Deterministic: true} in both SetResponse and
RegisterTrigger, ensuring all nodes produce byte-identical
serializations.
Add a regression test that verifies SetResponse produces stable output
across 100 iterations for a proto message with map fields.
|
👋 vreff, thanks for creating this pull request! To help reviewers, please consider creating future PRs as drafts first. This allows you to self-review and make any final changes before notifying the team. Once you're ready, you can mark it as "Ready for review" to request feedback. Thanks! |
📊 API Diff Results
|
There was a problem hiding this comment.
Pull request overview
This PR addresses non-deterministic protobuf Any serialization caused by map field iteration order in Go, which can break hash-based quorum agreement in distributed workflows. It introduces deterministic marshalling for capability responses and trigger events to ensure byte-identical payloads across nodes.
Changes:
- Replaced
anypb.New()with deterministicanypb.MarshalFrom(..., proto.MarshalOptions{Deterministic: true})inSetResponseandRegisterTrigger. - Added a shared helper
marshalAnyDeterministic()to centralize deterministicAnywrapping logic. - Added a regression test to ensure
SetResponseproduces deterministic bytes for protos containing map fields.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| pkg/capabilities/utils.go | Uses deterministic protobuf marshalling when wrapping responses/triggers into anypb.Any. |
| pkg/capabilities/utils_test.go | Adds a determinism regression test for SetResponse using a message with map fields. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| t.Run("deterministic marshalling with map fields", func(t *testing.T) { | ||
| // Proto messages with map fields can serialize in different orders | ||
| // because Go map iteration order is randomized. SetResponse must | ||
| // produce identical bytes across calls so that distributed nodes | ||
| // reach quorum on the same hash. | ||
| msg := &pb.CapabilityConfig{ | ||
| MethodConfigs: map[string]*pb.CapabilityMethodConfig{ | ||
| "alpha": {}, | ||
| "bravo": {}, | ||
| "charlie": {}, | ||
| "delta": {}, | ||
| "echo": {}, | ||
| }, | ||
| } | ||
|
|
||
| var firstBytes []byte | ||
| for i := 0; i < 100; i++ { | ||
| resp := capabilities.CapabilityResponse{} | ||
| err := capabilities.SetResponse(&resp, true, msg, nil) | ||
| require.NoError(t, err) | ||
| require.NotNil(t, resp.Payload) | ||
|
|
||
| b, err := proto.Marshal(resp.Payload) | ||
| require.NoError(t, err) | ||
|
|
||
| if firstBytes == nil { | ||
| firstBytes = b | ||
| } else { | ||
| assert.Equal(t, firstBytes, b, "SetResponse produced non-deterministic bytes on iteration %d", i) | ||
| } | ||
| } |
There was a problem hiding this comment.
The determinism regression test only covers the SetResponse path. Since this PR also changes RegisterTrigger to use deterministic Any wrapping, it would be good to add a similar assertion for the migrated trigger-event path (e.g., emit a trigger message with map fields and verify the produced TriggerEvent.Payload bytes are identical across multiple events/calls) to prevent regressions specifically in RegisterTrigger.
| require.NoError(t, err) | ||
| require.NotNil(t, resp.Payload) | ||
|
|
||
| b, err := proto.Marshal(resp.Payload) |
There was a problem hiding this comment.
We use deterministic marshaling for CapabilityResponse as a whole, which should propagate to the nested any proto: https://github.com/smartcontractkit/chainlink-common/blob/main/pkg/capabilities/pb/capabilities_helpers.go#L35
I think this test is not representative and the problem doesn't exist. Am I wrong?
Problem
anypb.New()uses the defaultproto.Marshal()which does not guarantee deterministic byte output for messages containingmapfields, because Go iterates maps in random order. In distributed quorum-based systems, this causes nodes to produce different request hashes for logically identical data, preventing quorum from being reached.Fix
Replace
anypb.New()withanypb.MarshalFrom()usingproto.MarshalOptions{Deterministic: true}in both:SetResponse(capability response path)RegisterTrigger(trigger event path)A new helper
marshalAnyDeterministic()centralizes this logic.Testing
SetResponse100 times with a proto containing 5 map entries and asserts byte-identical output every time