Skip to content

Conversation

@lordeji
Copy link

@lordeji lordeji commented Dec 27, 2025

TL;DR

PR #8604 was incomplete. It was indeed embedding the locales but they weren't used and get_locales_dir() implementation was still looking for a path which throw an error in setup_localization() which force to fallback to english, even if the locales were really embedded.


Implementation :

I mainly reformated setup_localization() and init_localization() to have implementations specific to debug and release.
From what I gathered from the codebase, in debug we look for the .ftl files (and eventually fallback to embedded english) and in release it seems that there was a sort of legacy version that was looking for a folder relative to the executable (which was the source of the error).
The debug implementations is mainly the same, the release implementations follows the same structure except that we get the localized text from get_embedded_locale().
The implementation comes from the preexisting function create_english_bundle_from_embedded() where I replaced the "en-US" identifier to a parameter one.


Testing :

For consistency, I tested on 5 packages for each configuration (cp, split, truncate, mv and ls).
I was also using the devcontainer provided.
I compiled :

cargo build -p <packages...>
cargo build --release -p <packages...>
cargo build
cargo build --release

With no error/warning and then ran :

export LANG=fr_FR.UTF-8
./target/debug/<packages...> --help
./target/debug/coreutils <packages...> --help
./target/release/<packages...> --help
./target/release/coreutils <packages...> --help

Which would print French help text for all the packages in all mode.
I then ran the tests :

cargo nextest run --all-features -p uucore --no-fail-fast

With result : 313 tests run: 312 passed, 1 failed, 2 skipped

The only failed test is uucore features::proc_info::tests::test_pid_entry but it also fails when testing in main. Maybe the error is related to the devcontainer.


Edits :

  • While looking at the workflows, I saw that the script for "Test Make installation" step (in l10n_installation_test) was not set properly. Locale should be set before building or the embedding won't be done.
  • Since l10n_installation_test was still throwing an error, I looked back and saw I forgot to update the "Test Cargo installation" step that did not even have locales set.

@github-actions
Copy link

GNU testsuite comparison:

GNU test failed: tests/cp/cp-mv-enotsup-xattr. tests/cp/cp-mv-enotsup-xattr is passing on 'main'. Maybe you have to rebase?
Note: The gnu test tests/csplit/csplit-io-err was skipped on 'main' but is now failing.

@codspeed-hq
Copy link

codspeed-hq bot commented Dec 27, 2025

CodSpeed Performance Report

Merging #9879 will degrade performance by 4.47%

Comparing lordeji:actually-use-embedded-locale (2e2305e) with main (8c10759)1

Summary

⚡ 13 improvements
❌ 1 regression
✅ 113 untouched
⏩ 6 skipped2

⚠️ Please fix the performance issues or acknowledge them on CodSpeed.

Benchmarks breakdown

Benchmark BASE HEAD Efficiency
wc_bytes_synthetic[500] 185.5 µs 168.9 µs +9.85%
cksum_blake3 209 µs 192.7 µs +8.45%
b64_decode_ignore_garbage_synthetic 163.5 µs 148.3 µs +10.27%
b64_decode_synthetic 166.6 µs 150.7 µs +10.51%
b64_encode_synthetic 162 µs 146.9 µs +10.31%
factor_multiple_u64s[2] 211.7 ms 187.4 ms +13%
rm_single_file 119.1 ms 109.5 ms +8.81%
sort_ascii_c_locale 21.5 ms 22.5 ms -4.47%
split_number_chunks 294.6 µs 277 µs +6.36%
mv_force_overwrite 144 ms 127.2 ms +13.28%
mv_single_file 135.3 ms 127.1 ms +6.41%
cp_large_file[16] 361.3 µs 347.3 µs +4.03%
shuf_repeat_sampling[50000] 5.1 ms 4.6 ms +10.45%
shuf_lines[100000] 31.6 ms 26.9 ms +17.58%

Footnotes

  1. No successful run was found on main (b3d05b6) during the generation of this report, so 8c10759 was used instead as the comparison base. There might be some changes unrelated to this pull request in this report.

  2. 6 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, click here and archive them to remove them from the performance reports.

@oech3
Copy link
Contributor

oech3 commented Dec 28, 2025

Update makefile to not install locales in release because they're useless.

People installing from source does not require all locales, but distibutor still needs all locales.

@github-actions
Copy link

GNU testsuite comparison:

GNU test failed: tests/cp/cp-mv-enotsup-xattr. tests/cp/cp-mv-enotsup-xattr is passing on 'main'. Maybe you have to rebase?
Note: The gnu test tests/csplit/csplit-io-err was skipped on 'main' but is now failing.

@lordeji lordeji force-pushed the actually-use-embedded-locale branch from 7b12e54 to 8b1d9f6 Compare December 28, 2025 12:52
@lordeji
Copy link
Author

lordeji commented Dec 28, 2025

@oech3 Sorry, didn't know this. Rebased without the GNUmakefile commit.

@oech3
Copy link
Contributor

oech3 commented Dec 28, 2025

Is non-embedded .ftl still supported at here?

@github-actions
Copy link

GNU testsuite comparison:

Congrats! The gnu test tests/tail/assert is no longer failing!

@lordeji
Copy link
Author

lordeji commented Dec 28, 2025

@oech3 Not in release mode.
I may have misinterpreted the code intention to only use embedded locales in release mode.
If needed, i can fix the original get_locales_dir() function that was throwing an error in release and then add fallback to .ftl files.

@oech3
Copy link
Contributor

oech3 commented Dec 28, 2025 via email

@lordeji
Copy link
Author

lordeji commented Dec 28, 2025

@oech3 I'm truly sorry for the confusion but it seems that there is conflicting ideas between you and PR #8604.
It says :

It embeds the corresponding locale file if it exists, in addition to the mandatory English (en-US) fallback.

But embedded (non english) locales were not programmed to be used, making me think there was a bug.
There is surely something I'm missing but why embedding system locale if it's meant only for English ?

@oech3
Copy link
Contributor

oech3 commented Dec 28, 2025

why embedding system locale if it's meant only for English ?

For performance ? cc: @sylvestre @WaterWhisperer

@oech3
Copy link
Contributor

oech3 commented Dec 28, 2025

If fallback is not supported, I think it is bug (at least for packager).

@oech3
Copy link
Contributor

oech3 commented Dec 28, 2025

$ strace -e trace=openat -- /bin/true
openat(AT_FDCWD, "/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
openat(AT_FDCWD, "/usr/lib/libonig.so.5", O_RDONLY|O_CLOEXEC) = 3
openat(AT_FDCWD, "/usr/lib/libgcc_s.so.1", O_RDONLY|O_CLOEXEC) = 3
openat(AT_FDCWD, "/usr/lib/libm.so.6", O_RDONLY|O_CLOEXEC) = 3
openat(AT_FDCWD, "/usr/lib/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
openat(AT_FDCWD, "/proc/self/maps", O_RDONLY|O_CLOEXEC) = 3
openat(AT_FDCWD, "/usr/bin/true/en-US.ftl", O_RDONLY|O_CLOEXEC) = -1 ENOTDIR (ディレクトリではありません)
openat(AT_FDCWD, "/usr/bin/true/ja-JP.ftl", O_RDONLY|O_CLOEXEC) = -1 ENOTDIR (ディレクトリではありません)
openat(AT_FDCWD, "/usr/bin/true/en-US.ftl", O_RDONLY|O_CLOEXEC) = -1 ENOTDIR (ディレクトリではありません)
openat(AT_FDCWD, "/usr/bin/true/ja-JP.ftl", O_RDONLY|O_CLOEXEC) = -1 ENOTDIR (ディレクトリではありません)
+++ exited with 0 +++

@github-actions
Copy link

GNU testsuite comparison:

Skipping an intermittent issue tests/misc/tee (passes in this run but fails in the 'main' branch)
Skipping an intermittent issue tests/timeout/timeout (passes in this run but fails in the 'main' branch)
Congrats! The gnu test tests/tty/tty-eof is no longer failing!

@lordeji lordeji force-pushed the actually-use-embedded-locale branch from 396f6b3 to b4b55d0 Compare December 29, 2025 01:35
@github-actions
Copy link

GNU testsuite comparison:

Note: The gnu test tests/id/smack was skipped on 'main' but is now failing.
Note: The gnu test tests/mkdir/smack-no-root was skipped on 'main' but is now failing.
Note: The gnu test tests/mkdir/smack-root was skipped on 'main' but is now failing.

- Update setup_localization() to have implementation specific to debug or release.
- Update create_english_bundle_from_embedded() to not be specific to english.
- Update init_localization() to have implementations specific to debug or release.
- Delete get_locales_dir() release mode implementation because it was useless.
In "Test Make installation" step, set french locale
BEFORE building with make.
In "Test Cargo installation" step, add french locale
env var before installation.
If not, the binary is built with no embedded french locale.
@lordeji lordeji force-pushed the actually-use-embedded-locale branch from 497bd5b to 2e2305e Compare December 29, 2025 19:09
@sylvestre
Copy link
Contributor

would it be possible to add a github action check to verify that it works correctly? thanks

@sylvestre
Copy link
Contributor

also, i worried about the binary size with all locales, did you look at this ?

@oech3
Copy link
Contributor

oech3 commented Dec 31, 2025

In my understanding, we fallbacks to nun-embedded ftl instead of embedding all langs to the binary. No?

@sylvestre
Copy link
Contributor

looking at comment #0 it isn't clear to me :)

@lordeji
Copy link
Author

lordeji commented Jan 1, 2026

@sylvestre @oech3 For the binary size, it shouldn't change anything because PR #8604 was already embedding them. It was the first thing I checked when debugging. OUT_DIR/embedded_locales.rs had both english and french locales, but only english were used.
I did not explicitely compare, I will check size difference when I can.
I will also add a github action, just need to learn of it works before so it may takes a bit more time.

I did not fallback to .ftl files because I thought that, for performance, in release the goal was to embed all the locales. It's my fault, I misinterpreted the PR and the documentation was only talking about english which was contradicting the embedding of french locales and I thought it wasn't updated.
I will update my PR with fallback to .ftl files but this time try embedded before so that it doesn't error in get_locales_dir() and always fallback to embedded english.

Before making the changes, if that's not too much to ask, could someone write my the EXACT intent of how you want it to works ? There is no real documentation and when I tried cargo install --path . --root ./build no locales were built so I assume it's only a GNUmakefile option (with install-locales). I need these informations :

  1. Looking at the GNUmakefile, locales are installed in $(DATAROOTDIR)/locales/<pkg> with a default of $(PREFIX)/share. I see that resolve_locales_dir_from_exe_dir() is trying to handle 3 cases : <bindir>/locales/<prog> / <prefix>/share/locales/<prog> / <bindir>/<prog>. Are these 3 cases enough and I should just revert the deletion or should I check more edgecases ? I was thinking that we could call cargo with DATAROOTDIR as an local env variable and then get it at compile time with const DATAROOTDIR: &str = env!("DATAROOTDIR");
  2. In install-locales target I see that we do not install english locales (if [ "$$(basename "$$locale_file")" != "en-US.ftl" ];) should this behavior be extended to detected system locale ?

Sorry for the long message but I'm on holidays and I won't have much time to code or answer so the quicker we can resolve misunderstandings, the more efficient I will be.

Happy new years by the way !

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants