Refresh Iceberg partition specs periodically #38408

Merged
ahmedabu98 merged 1 commit into apache:master from AtharvUrunkar:fix-spec-refresh-cache on May 20, 2026

Conversation

@AtharvUrunkar
Contributor

Summary

Fixes #38337

This change updates AssignDestinationsAndPartitions to periodically refresh cached Iceberg partition specs instead of loading them only once per worker instance.

Previously, PartitionKey and BeamRowWrapper instances were cached indefinitely, which could result in stale partition specs being used after table spec updates.

Changes

  • Added periodic refresh logic using a refresh interval
  • Added refresh timestamp tracking per table
  • Reloaded partition specs after refresh interval expiration
  • Preserved existing caching behavior between refreshes
  • Added explicit non-null validation for cached objects before usage
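The refresh logic listed above can be sketched roughly as follows. This is a minimal illustration, not the code merged in this PR: `RefreshingSpecCache`, its `loader` function, and the injected `clockMillis` are hypothetical names, and the generic value `V` stands in for whatever is cached per table (in the PR, the `PartitionKey` and `BeamRowWrapper` instances derived from the table's partition spec).

```java
import java.time.Duration;
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;
import java.util.function.Function;
import java.util.function.LongSupplier;

/**
 * Hypothetical helper: caches one value per table and reloads it once the
 * refresh interval has elapsed. Not thread-safe; the sketch assumes each
 * instance is used by a single thread at a time.
 */
final class RefreshingSpecCache<V> {
  private final Map<String, V> cachedValues = new HashMap<>();
  // Timestamp (millis) of the last successful load, tracked per table.
  private final Map<String, Long> lastRefreshMillis = new HashMap<>();
  private final long refreshIntervalMillis;
  // Assumption: stands in for "load the table and read its current spec".
  private final Function<String, V> loader;
  // Injected clock so the refresh behavior can be tested deterministically.
  private final LongSupplier clockMillis;

  RefreshingSpecCache(
      Duration refreshInterval, Function<String, V> loader, LongSupplier clockMillis) {
    this.refreshIntervalMillis = refreshInterval.toMillis();
    this.loader = Objects.requireNonNull(loader);
    this.clockMillis = Objects.requireNonNull(clockMillis);
  }

  V get(String tableId) {
    long now = clockMillis.getAsLong();
    Long last = lastRefreshMillis.get(tableId);
    // Load on first access, or reload once the interval has expired.
    if (last == null || now - last >= refreshIntervalMillis) {
      V fresh =
          Objects.requireNonNull(loader.apply(tableId), "loader returned null for " + tableId);
      cachedValues.put(tableId, fresh);
      lastRefreshMillis.put(tableId, now);
    }
    // Between refreshes the cached value is reused; validate it before use.
    V cached = cachedValues.get(tableId);
    if (cached == null) {
      throw new IllegalStateException("No cached value for " + tableId);
    }
    return cached;
  }
}
```

In production code the clock would typically be `System::currentTimeMillis`. Passing it in as a parameter is a design choice of this sketch that makes it possible to test interval expiry without sleeping.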

Testing

  • ./gradlew spotlessApply
  • ./gradlew :sdks:java:io:iceberg:test

Note: Running the Iceberg tests locally on Windows failed with environment/runtime errors (URISyntaxException, Unsafe initialization errors) that are likely tied to Windows + Java toolchain compatibility rather than to this isolated change.

@github-actions
Contributor

github-actions Bot commented May 7, 2026

Assigning reviewers:

R: @kennknowles for label java.

Note: If you would like to opt out of this review, comment assign to next reviewer.

Available commands:

  • stop reviewer notifications - opt out of the automated review tooling
  • remind me after tests pass - tag the comment author after tests pass
  • waiting on author - shift the attention set back to the author (any comment or push by the author will return the attention set to the reviewers)

The PR bot will only process comments in the main thread (not review comments).

@github-actions
Contributor

Reminder: please take a look at this PR, @kennknowles

@github-actions
Contributor

Assigning a new set of reviewers because the PR has gone too long without review. If you would like to opt out of this review, comment assign to next reviewer:

R: @Abacn for label java.

@damccorm
Contributor

@ahmedabu98 would you mind taking a look at this one?

Contributor

@ahmedabu98 left a comment

Good stuff, LGTM

@ahmedabu98 merged commit 2375fed into apache:master on May 20, 2026
9 checks passed
Development

Successfully merging this pull request may close these issues.

[Feature Request]: [IcebergIO] Fetch a fresh spec periodically in hash distribution