Skip to content

fix(backend): inject user identity header in scheduled-workflow-controller for multi-user auth#13472

Open
jaewak wants to merge 14 commits into
kubeflow:masterfrom
jaewak:fix/scheduledworkflow-controller-auth-header
Open

fix(backend): inject user identity header in scheduled-workflow-controller for multi-user auth#13472
jaewak wants to merge 14 commits into
kubeflow:masterfrom
jaewak:fix/scheduledworkflow-controller-auth-header

Conversation

@jaewak
Copy link
Copy Markdown
Contributor

@jaewak jaewak commented Jun 1, 2026

Description of your changes:

Problem

In multi-user mode, the scheduled-workflow-controller calls CreateRun via gRPC to the API server when "Always use latest pipeline version" is enabled on a recurring run. The controller attaches an Authorization: Bearer <token> header (when an SA token volume is mounted), but it never sets the user identity header (e.g. kubeflow-userid).

The API server's HTTPHeaderAuthenticator requires this header and rejects the request with:

Unauthenticated: Request header error: there is no user identity header

This was masked in clusters that mount a service-account token volume (where TokenReviewAuthenticator succeeds), but customers without that volume see recurring runs silently fail to create child runs.

Solution

Add two new opt-in CLI flags to the scheduled-workflow-controller:

Flag Description Default
--userIdentityHeader User identity gRPC metadata key (e.g. kubeflow-userid) "" (disabled)
--userIdentityValue Value for the metadata key (e.g. the controller's SA identity) "" (disabled)

When both flags are set, the controller injects the key/value into the outgoing gRPC metadata alongside the existing Bearer token. When either is empty, nothing is injected — preserving backward compatibility for single-user deployments.

The configured key is normalized to lowercase and validated against gRPC metadata key rules at controller startup, so a malformed --userIdentityHeader fails fast with a clear error instead of causing failed requests during reconciliation.

The multi-user kustomize deployment patch is updated to pass these flags using the controller's own service account identity, derived from the pod's namespace via the Kubernetes downward API.

Changes

  • backend/src/crd/controller/scheduledworkflow/controller.go — Added userIdentityHeader/userIdentityValue fields to the Controller struct; wired into NewController (which now normalizes and validates the key, failing fast on an invalid one); inject the key/value into outgoing gRPC metadata in submitNewWorkflowIfNotAlreadySubmitted after the existing Bearer token block.
  • backend/src/crd/controller/scheduledworkflow/main.go — Added flag.StringVar registrations for the two new flags (help text clarifies the value is injected as outgoing gRPC metadata); passed to NewController.
  • manifests/kustomize/base/installs/multi-user/scheduled-workflow/deployment-patch.yaml — Added NAMESPACE_SELF env var (downward API) and args: with --userIdentityHeader=kubeflow-userid and --userIdentityValue=system:serviceaccount:$(NAMESPACE_SELF):ml-pipeline-scheduledworkflow.
  • backend/src/crd/controller/scheduledworkflow/controller_auth_header_test.go (new) — Unit tests covering:
    • Injection when both fields are set (key/value present in outgoing metadata).
    • No injection when both fields are empty.
    • No injection when only one field is set.
    • Coexistence with the existing Authorization: Bearer token.
    • Metadata key normalization/validation (lowercasing, accepted and rejected characters).
    • NewController rejecting a malformed key.

Checklist:

In multi-user mode, the scheduled-workflow-controller calls CreateRun via
gRPC to the API server. Previously, only an Authorization: Bearer <token>
header was attached (when a service-account token file was present). The
API server's HTTPHeaderAuthenticator also requires a user identity header
(e.g. kubeflow-userid), and rejects the request with 'Unauthenticated:
Request header error: there is no user identity header' when the token
volume is not mounted.

This commit adds two new CLI flags to the controller:
  --userIdentityHeader: the header name (e.g. kubeflow-userid)
  --userIdentityValue:  the value to set (e.g. the controller SA identity)

When both are set, the controller injects the header into the gRPC
outgoing metadata alongside the existing Bearer token. Both flags default
to empty strings, making the feature fully opt-in -- existing single-user
deployments are unaffected.

The multi-user kustomize deployment patch is updated to pass these flags
using the controller's own service account identity, derived from the
pod namespace via the downward API.

Signed-off-by: jaewak <jaewan.0907@gmail.com>
Copilot AI review requested due to automatic review settings June 1, 2026 20:55
@google-oss-prow google-oss-prow Bot requested a review from droctothorpe June 1, 2026 20:55
@google-oss-prow
Copy link
Copy Markdown

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign chensun for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@google-oss-prow
Copy link
Copy Markdown

Hi @jaewak. Thanks for your PR.

I'm waiting for a kubeflow member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

This PR adds configurable user-identity metadata injection to the ScheduledWorkflow controller so it can authenticate to the Pipelines API in multi-user mode using a specified identity header/value.

Changes:

  • Add --userIdentityHeader / --userIdentityValue flags and pass them into the controller.
  • Inject the configured identity metadata into outgoing gRPC requests when creating runs.
  • Update the multi-user scheduled-workflow deployment patch to set the flags using the controller pod namespace, and add unit tests covering header injection behavior.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.

File Description
manifests/kustomize/base/installs/multi-user/scheduled-workflow/deployment-patch.yaml Configures the controller pod to derive its namespace and pass identity args for multi-user authorization.
backend/src/crd/controller/scheduledworkflow/main.go Adds CLI flags for identity header/value and wires them into controller construction.
backend/src/crd/controller/scheduledworkflow/controller.go Stores identity config on the controller and appends it to outgoing gRPC metadata.
backend/src/crd/controller/scheduledworkflow/controller_auth_header_test.go Adds tests validating identity metadata injection and coexistence with bearer token auth.

Comment thread backend/src/crd/controller/scheduledworkflow/controller.go
Comment thread backend/src/crd/controller/scheduledworkflow/main.go Outdated
jaewak and others added 5 commits June 1, 2026 17:03
Address review feedback on the user identity injection:
- Normalize the configured userIdentityHeader to lowercase and validate it
  against gRPC metadata key rules in NewController, failing fast at startup
  instead of risking failed requests during reconciliation.
- Clarify the --userIdentityHeader/--userIdentityValue flag help to state
  that the value is injected as outgoing gRPC metadata (not an HTTP header).
- Add unit tests for the key normalization/validation helper and for
  NewController rejecting a malformed key.

Signed-off-by: jaewak <jaewan.0907@gmail.com>
The multi-user deployment patch set the user identity flags via the pod
spec 'args' field. The controller image has no ENTRYPOINT and invokes the
binary through its CMD (a shell that maps env vars to flags), so setting
'args' replaced the entire CMD and the container tried to exec
'--userIdentityHeader=...' as the binary, crash-looping with:

  exec: "--userIdentityHeader=kubeflow-userid": executable file not found in $PATH

Follow the existing convention instead: expose the configuration as the
USER_IDENTITY_HEADER / USER_IDENTITY_VALUE env vars and map them to the
--userIdentityHeader / --userIdentityValue flags in the Dockerfile CMD.
The multi-user patch now sets those env vars (defaulting to empty, so
single-user deployments remain unaffected).

Signed-off-by: jaewak <jaewan.0907@gmail.com>
@google-oss-prow
Copy link
Copy Markdown

@jaewak: Cannot trigger testing until a trusted user reviews the PR and leaves an /ok-to-test message.

Details

In response to this:

/retest

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

jaewak and others added 8 commits June 3, 2026 09:22
Signed-off-by: jaewak <jaewan.0907@gmail.com>
Signed-off-by: jaewak <jaewan.0907@gmail.com>
Signed-off-by: jaewak <jaewan.0907@gmail.com>
Signed-off-by: jaewak <jaewan.0907@gmail.com>
Signed-off-by: jaewak <jaewan.0907@gmail.com>
Signed-off-by: jaewak <jaewan.0907@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants