Skip to content

fix: don't ICE on LineOverflow when the line contains multibyte characters#6902

Open
shulaoda wants to merge 1 commit into
rust-lang:mainfrom
shulaoda:05-22-fix_don_t_ice_on_lineoverflow_when_the_line_contains_multibyte_characters
Open

fix: don't ICE on LineOverflow when the line contains multibyte characters#6902
shulaoda wants to merge 1 commit into
rust-lang:mainfrom
shulaoda:05-22-fix_don_t_ice_on_lineoverflow_when_the_line_contains_multibyte_characters

Conversation

@shulaoda
Copy link
Copy Markdown
Contributor

@shulaoda shulaoda commented May 22, 2026

Summary

Fixes the rustfmt half of #6850: an ICE on lines containing multibyte UTF-8 characters when error_on_line_overflow triggers a LineOverflow diagnostic.

What was wrong

FormatLines::char (in src/formatting.rs) tallies visual columns:

self.line_len += if c == '\t' { self.config.tab_spaces() } else { 1 };

ErrorKind::LineOverflow(found, max) carries those column counts and is used by the Display impl to render the user message (maximum: 100, found: 126). However FormattingError::format_len returned the same (max, found - max) tuple to the snippet renderer, which passed it to annotate_snippets::Level::Error.span() — which interprets Range<usize> as byte offsets into the snippet source string.

On lines with multibyte characters, the byte index lands inside a codepoint and annotate-snippets panics slicing &line_buffer[range]:

end byte index 100 is not a char boundary; it is inside '☃' (bytes 98..101)

Required follow-up: bump annotate-snippets 0.11.4 → 0.11.5 (#6903)

This PR fixes the rustfmt half of #6850 and adds a regression test using a 95-snowman line — large enough to overflow max_width = 100, small enough to stay within annotate-snippets' default truncation threshold and exercise only the rustfmt-side bug.

The issue's full 120-snowman reproducer trips two independent col-vs-byte bugs: the first is the rustfmt-side one fixed above; the second hides in annotate-snippets 0.11.4 itself, in its long-line truncation path at src/renderer/display_list.rs:351–373:

let mut taken = 0;
let code: String = text.chars().skip(left).take_while(|ch| {
    let next = unicode_width::UnicodeWidthChar::width(*ch).unwrap_or(1);
    if taken + next > right - left { return false; }
    taken += next;          // taken is visual width
    true
}).collect();
if self.margin.was_cut_right(line_len) {
    code[..taken.saturating_sub(3)].fmt(f)?;  // ← taken used as byte index
    "...".fmt(f)?;
}

Same class of bug: taken accumulates unicode_width (columns) but code[..taken.saturating_sub(3)] is a byte slice on a UTF-8 String. On multibyte content the boundary falls inside a codepoint and panics at display_list.rs:369. Before this PR, the rustfmt-side bug fired first and masked this one; with the rustfmt-side bug fixed, the issue's full reproducer hits this upstream one instead.

annotate-snippets fixed this in 0.11.5, so the follow-up bump is just a Cargo.lock change — Cargo.toml already declares annotate-snippets = "0.11", semver-compatible with 0.11.5. No API adaptation, no breaking change. Tracked in a separate PR to keep this one focused on the rustfmt-side fix.

@rustbot rustbot added the S-waiting-on-author Status: awaiting some action (such as code changes or more information) from the author. label May 22, 2026
Comment thread src/formatting.rs
if col >= target_col {
return idx;
}
col += if ch == '\t' { tab_spaces } else { 1 };
Copy link
Copy Markdown
Contributor

@ytmimi ytmimi May 22, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not necessarily something we need to address in this PR since you're trying to follow the existing logic, but the rendered width of any \t characters isn't going to be fixed to config.tab_spaces(). It'll vary depending on where the \t is in the line.

View changes since the review

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Made a PR #6904 to track it

Comment thread src/lib.rs Outdated
@shulaoda shulaoda marked this pull request as ready for review May 22, 2026 19:48
@rustbot rustbot added S-waiting-on-review Status: awaiting review from the assignee but also interested parties. and removed S-waiting-on-author Status: awaiting some action (such as code changes or more information) from the author. labels May 22, 2026
@shulaoda shulaoda requested a review from ytmimi May 22, 2026 19:49
@ytmimi
Copy link
Copy Markdown
Contributor

ytmimi commented May 22, 2026

@shulaoda we prefer not to include merge commits. Please rebase your changes on the latest main or squash these commits.

@ytmimi
Copy link
Copy Markdown
Contributor

ytmimi commented May 22, 2026

@rustbot author

@rustbot rustbot added S-waiting-on-author Status: awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: awaiting review from the assignee but also interested parties. labels May 22, 2026
@rustbot
Copy link
Copy Markdown
Collaborator

rustbot commented May 22, 2026

Reminder, once the PR becomes ready for a review, use @rustbot ready.

@shulaoda shulaoda force-pushed the 05-22-fix_don_t_ice_on_lineoverflow_when_the_line_contains_multibyte_characters branch from 94dfb73 to fdc6240 Compare May 22, 2026 22:45
@shulaoda
Copy link
Copy Markdown
Contributor Author

@rustbot ready

@rustbot rustbot added S-waiting-on-review Status: awaiting review from the assignee but also interested parties. and removed S-waiting-on-author Status: awaiting some action (such as code changes or more information) from the author. labels May 22, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

S-waiting-on-review Status: awaiting review from the assignee but also interested parties.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants