Add SliceUtf8.toTitleCase() by wendigo · Pull Request #177 · airlift/slice

wendigo · 2025-02-10T08:27:19Z

This is based on the logic of toLowerCase and toUpperCase.

wendigo · 2025-03-25T13:25:55Z

@martint @dain PTAL

wendigo · 2025-04-14T13:32:27Z

@martint PTAL

PPL-TBSK · 2025-08-14T07:03:51Z

@wendigo @martint any news, please?

dain

Looks good to me

wendigo · 2026-06-08T09:09:24Z

@dain I've pushed a new implementation. PTAL

coderabbitai · 2026-06-08T09:12:55Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 7f93a8ab-e52f-4c0c-a036-ee7ce4a8b291

📥 Commits

Reviewing files that changed from the base of the PR and between a0818bd and 53d58d9.

📒 Files selected for processing (2)

src/main/java/io/airlift/slice/SliceUtf8.java
src/test/java/io/airlift/slice/TestSliceUtf8.java

🚧 Files skipped from review as they are similar to previous changes (2)

src/main/java/io/airlift/slice/SliceUtf8.java
src/test/java/io/airlift/slice/TestSliceUtf8.java

📝 Walkthrough

This pull request adds UTF-8 title-casing support to SliceUtf8. It introduces a TITLE_CODE_POINTS lookup table in static initialization, adds public toTitleCase methods for Slice and byte-array ranges, and implements a code-point-based transformation that title-cases the first cased code point of each whitespace-delimited word and lowercases subsequent code points. Invalid UTF-8 sequences are copied verbatim and do not reset word-start state. Output allocation is lazy; unchanged input is returned as a wrapped view. Tests validate behavior, invalid-UTF-8 handling, and no-op wrapping.
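The word-start rule and lazy allocation described in the walkthrough can be illustrated with a minimal sketch. It works on a Java String instead of UTF-8 bytes, so it does not model invalid-sequence handling. The names TitleCaseSketch and isCased, and the use of Character.toTitleCase in place of the PR's TITLE_CODE_POINTS table, are assumptions made for illustration and are not the code from this pull request.

```java
final class TitleCaseSketch {
    private TitleCaseSketch() {}

    // Assumption: "cased" means upper-, lower-, or title-case per java.lang.Character.
    static boolean isCased(int codePoint) {
        return Character.isUpperCase(codePoint)
                || Character.isLowerCase(codePoint)
                || Character.isTitleCase(codePoint);
    }

    static String toTitleCase(String input) {
        StringBuilder output = null; // allocated only when the first change is seen
        boolean atWordStart = true;
        int i = 0;
        while (i < input.length()) {
            int codePoint = input.codePointAt(i);
            int mapped = codePoint;
            if (Character.isWhitespace(codePoint)) {
                // whitespace ends the current word
                atWordStart = true;
            }
            else if (isCased(codePoint)) {
                // first cased code point of a word is title-cased, the rest lowercased
                mapped = atWordStart
                        ? Character.toTitleCase(codePoint)
                        : Character.toLowerCase(codePoint);
                atWordStart = false;
            }
            // uncased, non-whitespace code points pass through and keep the state
            if (mapped != codePoint && output == null) {
                output = new StringBuilder(input.length());
                output.append(input, 0, i); // copy the unchanged prefix
            }
            if (output != null) {
                output.appendCodePoint(mapped);
            }
            i += Character.charCount(codePoint);
        }
        // unchanged input is returned as-is, without copying
        return output == null ? input : output.toString();
    }
}
```

In this sketch, an input that is already title-cased comes back as the same String instance, which mirrors the "unchanged input is returned as a wrapped view" behavior only loosely.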

🚥 Pre-merge checks | ✅ 2

✅ Passed checks (2 passed)

Check name	Status	Explanation
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

coderabbitai

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

src/test/java/io/airlift/slice/TestSliceUtf8.java (1)
756-775: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Consider adding test coverage for toTitleCase with invalid bytes at start and middle positions.

The assertions for toLowerCase/toUpperCase test invalid sequences at three positions: end only, start + end ("FOO"), and surrounding ("FOO" + invalid + "BAR"). However, toTitleCase only covers the end position. This matters for title-casing because the start-of-word state should persist across invalid sequences.
🧪 Suggested additional assertions
+        // invalid sequence at start should be copied verbatim and leave the start-of-word state intact
+        assertThat(toTitleCase(wrappedBuffer(concat(invalidSequence, new byte[] {'F', 'O', 'O'}))))
+                .isEqualTo(wrappedBuffer(concat(invalidSequence, new byte[] {'F', 'o', 'o'})));
+
+        // invalid sequence in middle should not start a new word
+        assertThat(toTitleCase(wrappedBuffer(concat(new byte[] {'F', 'O', 'O'}, invalidSequence, new byte[] {'B', 'A', 'R'}))))
+                .isEqualTo(wrappedBuffer(concat(new byte[] {'F', 'o', 'o'}, invalidSequence, new byte[] {'b', 'a', 'r'})));
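Why the suggested assertions expect "Foo" after a leading invalid sequence and "bar" after a mid-word one can be seen in a deliberately simplified sketch. It handles ASCII only and treats every byte with the high bit set as an opaque byte that is copied verbatim without touching the word-start flag. The class InvalidByteStateSketch and its method are hypothetical, and this is not how SliceUtf8 decodes UTF-8.

```java
final class InvalidByteStateSketch {
    private InvalidByteStateSketch() {}

    // Simplified model (assumption): ASCII letters are cased, bytes >= 0x80 are opaque.
    static byte[] titleCaseAscii(byte[] input) {
        byte[] output = input.clone();
        boolean atWordStart = true;
        for (int i = 0; i < input.length; i++) {
            int b = input[i] & 0xFF;
            if (b >= 0x80) {
                // opaque byte: already copied by clone(), word-start state is unchanged
                continue;
            }
            if (Character.isWhitespace(b)) {
                atWordStart = true;
            }
            else if (Character.isLetter(b)) {
                output[i] = (byte) (atWordStart
                        ? Character.toUpperCase(b)
                        : Character.toLowerCase(b));
                atWordStart = false;
            }
        }
        return output;
    }
}
```

Here the single byte 0xFF, which never appears in well-formed UTF-8, can stand in for the test's invalidSequence.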

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 227aa4ea-f292-45cf-a25b-5bbce119c3a5

📥 Commits

Reviewing files that changed from the base of the PR and between f72c712 and d8d6530.

📒 Files selected for processing (2)

src/main/java/io/airlift/slice/SliceUtf8.java
src/test/java/io/airlift/slice/TestSliceUtf8.java

dain

Looks good to me, but maybe squash to one commit

wendigo mentioned this pull request Feb 10, 2025

Add function to convert strings to title case. trinodb/trino#2942

Open

dain approved these changes Oct 22, 2025

Comment thread src/main/java/io/airlift/slice/SliceUtf8.java

Comment thread src/test/java/io/airlift/slice/TestSlice.java Outdated

wendigo force-pushed the serafin/tile-case branch from 3e1949e to d8d6530 · June 8, 2026 09:07

wendigo requested a review from dain June 8, 2026 09:07

coderabbitai Bot reviewed Jun 8, 2026

wendigo force-pushed the serafin/tile-case branch from d8d6530 to a0818bd · June 8, 2026 09:25

dain reviewed Jun 9, 2026

Add support for SliceUtf8.toTitleCase

53d58d9

wendigo force-pushed the serafin/tile-case branch from a0818bd to 53d58d9 · June 9, 2026 01:42

wendigo merged commit 12ace0b into airlift:master Jun 9, 2026
1 check passed

wendigo merged 1 commit into airlift:master from wendigo:serafin/tile-case
