tests: add whitespace tests for vertical tab behavior #155028
rust-bors[bot] merged 1 commit into rust-lang:main
rustbot has assigned @dingxiangfei2009.
let x = 5;
let y = 10;
let z = x + y;
Since vertical tab doesn't show up in GitHub's PR review rendering, please put a comment above each line containing the whitespace.
You might want to add lines with each of the 11 permitted whitespace characters:
https://doc.rust-lang.org/reference/whitespace.html
And then some lines with the other 14 disallowed whitespace characters (the ones from this list marked White_Space, that aren't in the first list):
https://www.unicode.org/Public/UCD/latest/ucd/PropList.txt
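The two lists mentioned above can be compared mechanically. The following is a minimal sketch, not the test added in this PR: `PATTERN_WHITE_SPACE` and `white_space_outside_pattern` are hypothetical names, and the sketch assumes `char::is_whitespace` tracks the Unicode `White_Space` property. Note that U+200E and U+200F appear in the lexer's list but are not `White_Space`, so the number of characters printed need not equal 25 minus 11.

```rust
/// The 11 code points the Rust Reference lists as whitespace
/// (Unicode Pattern_White_Space). Hypothetical constant for illustration.
const PATTERN_WHITE_SPACE: [char; 11] = [
    '\u{0009}', // horizontal tab
    '\u{000A}', // line feed
    '\u{000B}', // vertical tab
    '\u{000C}', // form feed
    '\u{000D}', // carriage return
    '\u{0020}', // space
    '\u{0085}', // next line
    '\u{200E}', // left-to-right mark
    '\u{200F}', // right-to-left mark
    '\u{2028}', // line separator
    '\u{2029}', // paragraph separator
];

/// Hypothetical helper: characters that `char::is_whitespace` accepts
/// but that are not in the lexer's list above.
fn white_space_outside_pattern() -> Vec<char> {
    (0..=0x10FFFF_u32)
        .filter_map(char::from_u32)
        .filter(|c| c.is_whitespace() && !PATTERN_WHITE_SPACE.contains(c))
        .collect()
}

fn main() {
    // Print each candidate for a "disallowed whitespace" test line.
    for c in white_space_outside_pattern() {
        println!("U+{:04X}", c as u32);
    }
}
```

Each printed code point is a candidate for a line in a failing test, since the lexer should reject it between tokens.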
Done! Added a comment above each line with an invisible whitespace character so they are visible in the diff. Also expanded the test to cover all 11 permitted Pattern_White_Space characters inline, and listed the 14 disallowed Unicode White_Space characters in comments since placing them between tokens would cause a compile error.
You can create a separate test that fails, and put //@ check-fail at the top of the file:
https://rustc-dev-guide.rust-lang.org/tests/directives.html#controlling-outcome-expectations
This is how to test whitespace that is not allowed in Rust source code.
Thanks for the feedback! I’ve added the failing UI test and updated it to match the full stderr output, including the help message. I’ll go through it again and make the remaining tweaks.
@@ -0,0 +1,22 @@
// This test checks that split_ascii_whitespace does NOT split on
I'm not sure if this test is relevant to the compiler?
Fair point. The test documents the gap between what the lexer accepts and what the stdlib gives you. Happy to remove it if you think it doesn't belong here.
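The gap described here can be shown in a few lines. This is a minimal sketch rather than the test file from the PR; `ascii_pieces` and `unicode_pieces` are hypothetical helper names, and the input string is made up for illustration.

```rust
/// Hypothetical helper: split on ASCII whitespace (WhatWG Infra Standard),
/// which does not include vertical tab.
fn ascii_pieces(input: &str) -> Vec<&str> {
    input.split_ascii_whitespace().collect()
}

/// Hypothetical helper: split on Unicode White_Space,
/// which does include vertical tab.
fn unicode_pieces(input: &str) -> Vec<&str> {
    input.split_whitespace().collect()
}

fn main() {
    // Two words separated only by a vertical tab (U+000B).
    let input = "foo\x0Bbar";

    // The ASCII variant leaves the string in one piece.
    assert_eq!(ascii_pieces(input), ["foo\x0Bbar"]);

    // The Unicode variant splits it in two.
    assert_eq!(unicode_pieces(input), ["foo", "bar"]);

    // The per-character predicates disagree in the same way.
    assert!(!b'\x0B'.is_ascii_whitespace());
    assert!('\x0B'.is_whitespace());
}
```

The lexer would accept the same vertical tab as a token separator, so source text that compiles can still come back unsplit from `split_ascii_whitespace`.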
Tests on `where` clauses. See [Where clauses | Reference](https://doc.rust-lang.org/reference/items/generics.html#where-clauses).

## `whitespace`
This will need an explanation of why the whitespace tests are needed. It's a good place to mention that is_ascii_whitespace and is_whitespace in the standard library don't match the Rust language's definition of whitespace.
Added! The README now explains that the Rust lexer uses Unicode Pattern_White_Space, which differs from both is_ascii_whitespace (WhatWG, skips vertical tab) and is_whitespace (Unicode White_Space, broader set). That context makes it clearer why these tests exist.
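The three definitions can be set side by side for individual characters. The sketch below is illustrative only: `is_pattern_white_space` is a hypothetical helper that mirrors the Reference's whitespace list and is not the compiler's own function, and `classify` is likewise made up for this example.

```rust
/// Hypothetical helper mirroring the Reference's whitespace list
/// (Unicode Pattern_White_Space); not the compiler's own function.
fn is_pattern_white_space(c: char) -> bool {
    matches!(
        c,
        '\u{0009}'..='\u{000D}'
            | '\u{0020}'
            | '\u{0085}'
            | '\u{200E}'
            | '\u{200F}'
            | '\u{2028}'
            | '\u{2029}'
    )
}

/// Returns (lexer, is_ascii_whitespace, is_whitespace) for one character.
fn classify(c: char) -> (bool, bool, bool) {
    (
        is_pattern_white_space(c),
        c.is_ascii_whitespace(),
        c.is_whitespace(),
    )
}

fn main() {
    // Vertical tab, no-break space, and left-to-right mark each land
    // in a different combination of the three definitions.
    for c in ['\u{000B}', '\u{00A0}', '\u{200E}'] {
        let (lexer, ascii, unicode) = classify(c);
        println!(
            "U+{:04X}: lexer={lexer} ascii={ascii} unicode={unicode}",
            c as u32
        );
    }
}
```

The left-to-right mark is the case worth noticing: the lexer's list accepts it while `is_whitespace` does not, so neither standard library predicate is a superset of the language's definition.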
// the standard library's is_ascii_whitespace does NOT include vertical
// tab, following the WhatWG Infra Standard instead.
//
// See: https://github.com/rust-lang/rust-project-goals/issues/53
Where did you get this link? It's not the Outreachy tracking issue.
Got it. Fixed it to point to the correct Outreachy tracking issue.
1561721 to 47fb045
This PR was rebased onto a different main commit. Here's a range-diff highlighting what actually changed. Rebasing is a normal part of keeping PRs up to date, so no action is needed; this note is just to help reviewers.
You can put links to incomplete PRs in your final application, and you can continue to work on those PRs. What happens to your PR is up to the maintainers of the tool you're modifying or testing. Sometimes PRs (or parts of PRs) don't get accepted for good reasons, and that's ok. Sometimes there are review delays because reviewers are busy, and that's also ok.
Hi @teor2345,
The next step is a review by someone with Rust merge rights. Please be patient; it is normal for reviewers to take 2 weeks or more to review your PR. If it has been more than 2 weeks, you can tag the assigned reviewer, or re-assign another reviewer using the instructions at the top of this PR.
Got it, thank you for the clarification. I’ll be patient and follow up if needed after some time.
Thanks! I updated the PR description to link to the correct place: rustfoundation/interop-initiative#53
…yukang tests: add whitespace tests for vertical tab behavior This PR adds two small tests to highlight how vertical tab (\x0B) is handled differently across Rust's whitespace definitions. The Rust lexer treats vertical tab as whitespace (Unicode Pattern_White_Space), while `split_ascii_whitespace` follows the WhatWG Infra Standard and does not include vertical tab. These tests make that difference visible and easier to understand. See: rustfoundation/interop-initiative#53
@bors squash
* tests: add whitespace tests for vertical tab behavior
  Add two small tests to highlight how vertical tab is handled differently.
  - vertical_tab_lexer.rs checks that the lexer treats vertical tab as whitespace
  - ascii_whitespace_excludes_vertical_tab.rs shows that split_ascii_whitespace does not split on it
  This helps document the difference between the Rust parser (which accepts vertical tab) and the standard library’s ASCII whitespace handling.
  See: rust-lang/rust-project-goals#53
* tests: add ignore-tidy-tab directive to whitespace tests
* tests: expand vertical tab lexer test to cover all Pattern_White_Space chars
* tests: add whitespace/ README entry explaining lexer vs stdlib mismatch
* Update ascii_whitespace_excludes_vertical_tab.rs
* Update ascii_whitespace_excludes_vertical_tab.rs
  make sure tabs and spaces are well checked
* Update ascii_whitespace_excludes_vertical_tab.rs
* fix tidy: add whitespace README entry
* Update README.md with missing full stop
* Update ascii_whitespace_excludes_vertical_tab.rs
* fix tidy: use full path format for whitespace README entry
* fix tidy: README order, trailing newlines in whitespace tests
* fix: add run-pass directive and restore embedded whitespace bytes
* fix tidy: remove duplicate whitespace README entry
* Add failing UI test for invalid whitespace (zero width space)
  This adds a //@ check-fail test to ensure that disallowed whitespace characters like ZERO WIDTH SPACE are rejected by the Rust lexer.
* git add tests/ui/whitespace/invalid_whitespace.rs git commit -m "Fix tidy: add trailing newline" git push
* Fix tidy: add trailing newline
* Update invalid_whitespace.rs
* Update invalid_whitespace.rs
* Clean up whitespace in invalid_whitespace.rs
  Remove unnecessary blank lines in invalid_whitespace.rs
* Update invalid_whitespace.rs
* Clarify ZERO WIDTH SPACE usage in test
  Update comment to clarify usage of ZERO WIDTH SPACE.
* Improve error messages for invalid whitespace
  Updated error messages to clarify the issue with invisible characters.
* Modify invalid_whitespace test for clarity
  Update test to check for invalid whitespace characters.
* Resolve unknown token error in invalid_whitespace.rs
  Fix whitespace issue causing unknown token error.
* Remove invisible character from variable assignment
  Fix invisible character issue in variable assignment.
* Improve error message for invalid whitespace
  Updated error message to clarify invisible characters.
* Improve error handling for invisible characters
  Updated error message for invisible characters in code.
* Document error for unknown token due to whitespace
  Add error message for invalid whitespace in code
* Update error message for invalid whitespace handling
* Modify invalid_whitespace.rs for whitespace checks
  Updated the test to check for invalid whitespace handling.
* Correct whitespace in variable declaration
  Fix formatting issue by adding space around '=' in variable declaration.
* Update error message for invalid whitespace
* Update invalid_whitespace.stderr
* Refine error handling for invalid whitespace test
  Update the error messages for invalid whitespace in the test.
* Update invalid_whitespace.rs
* Fix whitespace issues in invalid_whitespace.rs
* Update invalid_whitespace.stderr file
* Clean up whitespace in invalid_whitespace.rs
  Removed unnecessary blank lines from the test file.
* Update invalid_whitespace.stderr
43f045c to c2c486a
This PR was contained in a rollup (#155593), which was closed.
@bors r=chenyukang
Rollup of 6 pull requests

Successful merges:
- #155028 (tests: add whitespace tests for vertical tab behavior)
- #155582 (Rewrite `FlatMapInPlace`.)
- #151194 (Fix wrong suggestion for returning async closure)
- #154377 (Fix `#[expect(dead_code)]` liveness propagation)
- #155572 (Move diagnostic attribute target checks from check_attr)
- #155586 (Ensure we don't feed owners from ast lowering if we ever make that query tracked)
Rollup merge of #155028 - Brace1000:whitespace-tests, r=chenyukang

tests: add whitespace tests for vertical tab behavior

This PR adds two small tests to highlight how vertical tab (\x0B) is handled differently across Rust's whitespace definitions. The Rust lexer treats vertical tab as whitespace (Unicode Pattern_White_Space), while `split_ascii_whitespace` follows the WhatWG Infra Standard and does not include vertical tab. These tests make that difference visible and easier to understand.

See: rustfoundation/interop-initiative#53
This PR adds two small tests to highlight how vertical tab (\x0B)
is handled differently across Rust's whitespace definitions.
The Rust lexer treats vertical tab as whitespace (Unicode
Pattern_White_Space), while `split_ascii_whitespace` follows the
WhatWG Infra Standard and does not include vertical tab.
These tests make that difference visible and easier to understand.
See: rustfoundation/interop-initiative#53