-
Notifications
You must be signed in to change notification settings - Fork 1.4k
feat: Add Iceberg partition name generation utility #15443
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
feat: Add Iceberg partition name generation utility #15443
Conversation
✅ Deploy Preview for meta-velox canceled.
|
| std::string toPartitionString(int32_t value, const TypePtr& type) | ||
| const override; | ||
|
|
||
| /// Converts a Timestamp partition key to its string representation. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Converts a Timestamp partition key to its string representation.
This sentence is redundant. Consider dropping.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks
|
|
||
| namespace facebook::velox::connector::hive::iceberg { | ||
|
|
||
| constexpr int32_t kEpochYear = 1970; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
move this into the only function where it is used.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks.
| if (type->isDate()) { | ||
| return DATE()->toString(value); | ||
| } | ||
| return folly::to<std::string>(value); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: fmt::to_string
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks.
| } | ||
| } | ||
|
|
||
| std::string IcebergPartitionPath::toPartitionString( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
perhaps, add a check for transform type. I assume it must be kIdentity.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, you are right
| EXPECT_EQ( | ||
| test( | ||
| TransformType::kIdentity, | ||
| StringView("\x48\x65\x6c\x6c\x6f"), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
consider adding helper method testVarbinary that takes a single std::string argument. This will make code shorter and easier to read and allow testing long strings.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for the suggestion.
|
|
||
| TEST(IcebergPartitionPathTest, identityTimestamp) { | ||
| EXPECT_EQ( | ||
| test(TransformType::kIdentity, Timestamp(0, 0), TIMESTAMP()), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
add testTimestamp helper method that takes a single Timestamp argument
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks.
|
|
||
| TEST(IcebergPartitionPathTest, truncateString) { | ||
| EXPECT_EQ( | ||
| test(TransformType::kTruncate, StringView("abc"), VARCHAR()), "abc"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I assume tranform type doesn't matter here. The results would be the same for kIdentity and kTruncate. If so, consider adding a helper method testString(std::string) which converts string using kIdentity and kTruncate, assert that results are the same and returns that result.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks.
I added a helper function testString.
| namespace { | ||
|
|
||
| template <typename T> | ||
| std::string test(TransformType transform, T value, const TypePtr& type) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since we are not testing anything here, let's rename to something like toPath.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks.
219ec2a to
2dee873
Compare
Add Iceberg partition path utility for partition path naming.
Introduces
IcebergPartitionPath, a utility class that converts partition keys totheir string representations for use in Iceberg partition directory paths.
The implementation follows the behavior of the Apache Iceberg Java library.