contrib: add Google Cloud Storage driver for external payload storage #2366

Open
brucearctor wants to merge 1 commit into temporalio:main from brucearctor:feat/gcs-storage-driver

@brucearctor
Contributor

Summary

Add a GCS-backed StorageDriver implementation under contrib/google/gcsdriver, mirroring the existing S3 driver architecture (contrib/aws/s3driver).

Resolves #2364

Two-module structure

| Module | Purpose |
| --- | --- |
| gcsdriver | SDK-agnostic driver + Client interface (zero GCP dependency) |
| gcsdriver/gcssdk | Concrete Client backed by cloud.google.com/go/storage |

Key features

  • Content-addressable keys (SHA-256) with namespace/workflow/activity scoping
  • Concurrent uploads via errgroup with two-phase validation
  • Dynamic bucket selection via BucketFunc
  • Integrity verification on retrieval
  • Bucket-existence caching (sync.Map) to avoid redundant RPCs

ObjectExists behavior note

The GCS Go client returns storage.ErrObjectNotExist both for a missing object in a valid bucket and for any object in a missing bucket. To distinguish the two cases, this implementation performs an explicit Bucket.Attrs() check (with caching) and returns (false, error) for a missing bucket.

This differs from the S3 driver, which returns (false, nil) for a missing bucket because the AWS SDK maps HeadObject on a missing bucket to a generic NotFound. We believe returning an error is more correct per the Client interface contract ("a non-nil error only when the existence of the object cannot be determined"), but we can align with the S3 behavior if the maintainers prefer consistency.

Tests

  • 38 driver-level unit tests (bucket selection, content addressing, round-trips, error paths)
  • 10 GCS SDK integration tests using fake-gcs-server (put/get, existence, bucket-not-found, caching, large objects, full driver round-trip)
  • All tests pass with Go 1.24.0

@brucearctor requested a review from a team as a code owner on May 24, 2026, 02:01
Development

Successfully merging this pull request may close these issues.

Proposal: Google Cloud Storage external storage driver in contrib (mirror of aws/s3driver)
