Describe the issue
When re-registering a workflow with the same version, FlyteAdmin returns INTERNAL: RST_STREAM closed stream. HTTP/2 error code: INTERNAL_ERROR instead of ALREADY_EXISTS or INVALID_ARGUMENT. The failure is silent: no error is logged server-side.
Root cause (two bugs):
- Non-deterministic workflow digest: ValidateWorkflow in workflow_compiler.go iterates wf.Nodes (a Go map) without sorting keys. Go map iteration order is randomized, so the same workflow produces different CompiledWorkflowClosure outputs across compilations. FlyteAdmin's digest comparison (bytes.Equal) fails, incorrectly taking the "different structure" code path.
- Oversized gRPC error message: The "different structure" code path in errors.go (NewWorkflowExistsDifferentStructureError) computes a jsondiff of two large compiled workflow closures and includes the full diff in the gRPC status description. For large workflows (e.g. with 400+ JAR dependencies), this diff exceeds gRPC's default 4MB MaxSendMsgSize. gRPC-Go rejects the response at the transport layer with RST_STREAM INTERNAL_ERROR, and no server-side log is produced.
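To illustrate the first bug, here is a minimal sketch of iterating a map in sorted-key order so that any digest derived from it is stable. It is not the FlyteAdmin or FlytePropeller code: the map[string]string node type and the sortedNodeIDs and digest helpers are hypothetical stand-ins for wf.Nodes and the real compiler output.

```go
package main

import (
	"crypto/sha256"
	"fmt"
	"sort"
	"strings"
)

// sortedNodeIDs returns the map keys in lexical order.
// Ranging over the map directly would yield a different order on each run.
func sortedNodeIDs(nodes map[string]string) []string {
	ids := make([]string, 0, len(nodes))
	for id := range nodes {
		ids = append(ids, id)
	}
	sort.Strings(ids)
	return ids
}

// digest hashes the nodes in sorted-key order, so the result depends only
// on the map contents and not on Go's randomized iteration order.
// (Hypothetical helper; the real digest is computed over a compiled closure.)
func digest(nodes map[string]string) string {
	var b strings.Builder
	for _, id := range sortedNodeIDs(nodes) {
		b.WriteString(id)
		b.WriteByte(0) // separator so "ab"+"c" and "a"+"bc" hash differently
		b.WriteString(nodes[id])
		b.WriteByte(0)
	}
	return fmt.Sprintf("%x", sha256.Sum256([]byte(b.String())))
}

func main() {
	nodes := map[string]string{"n2": "task-b", "n0": "task-a", "n1": "task-c"}
	fmt.Println(sortedNodeIDs(nodes))           // prints [n0 n1 n2]
	fmt.Println(digest(nodes) == digest(nodes)) // prints true
}
```

The same pattern (collect keys, sort, then iterate the sorted slice) would apply to the two range loops over wf.Nodes, assuming node IDs are strings.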
Steps to reproduce
1. Register a workflow with version v1 (succeeds)
2. Re-register the same workflow with version v1 but with a trivially different template (e.g. add a metadata tag)
3. Client receives INTERNAL: RST_STREAM closed stream. HTTP/2 error code: INTERNAL_ERROR
4. FlyteAdmin logs nothing
Even without the modification in step 2, identical workflows can trigger this, because the digest is non-deterministic: it depends on Go map iteration order.
Expected behavior
- Identical workflow + same version → ALREADY_EXISTS
- Different workflow + same version → INVALID_ARGUMENT with a bounded error message
Additional context
- createTask is not affected because task compilation does not iterate Go maps non-deterministically
- The issue is particularly impactful for CI systems that use content-based versioning (e.g. Bazel) where the same version may be re-registered when only dependencies change
Relevant code
- Digest comparison: flyteadmin/pkg/manager/impl/workflow_manager.go (CreateWorkflow → bytes.Equal(workflowDigest, existingWorkflowModel.Digest))
- Non-deterministic map iteration: flytepropeller/pkg/compiler/workflow_compiler.go lines 220 and 235 (for nodeID, n := range wf.Nodes)
- Unbounded error message: flyteadmin/pkg/errors/errors.go (NewWorkflowExistsDifferentStructureError → jsondiff.Compare → strings.Join with no size limit)
Proposed fix
PR: #7211