fix: replace UTF-8 em dashes with ASCII dashes in partition CSV files #133

Merged: Keralots merged 1 commit into Keralots:main from VintageCar:fix/gbk-encoding-partition-csv on Jun 29, 2026

@VintageCar

PlatformIO's espressif32 platform reads partition CSV files using Python's default encoding, which follows the system locale: GBK on Chinese Windows systems (and CP932/CP949 on Japanese/Korean Windows). The UTF-8 em dash character (U+2014, encoded as 0xE2 0x80 0x94) in comment lines cannot be decoded by these codecs, causing a UnicodeDecodeError during the build process:

UnicodeDecodeError: 'gbk' codec can't decode byte 0x94

This error occurs at the checkprogsize step, after linking has already succeeded, which makes it particularly frustrating: the build appears to succeed until the final verification stage.
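The decoding failure can be reproduced outside PlatformIO with a few lines of Python. This is a minimal sketch, not the platform's actual code: the comment line is a made-up example, and `try_decode` is a hypothetical helper. It assumes the em dash is followed by a space, as is typical in a comment.

```python
# Made-up comment line in the style of a partition CSV; only the em dash matters.
line = "# factory app \u2014 1 MB slot\n"
raw = line.encode("utf-8")  # the em dash becomes the bytes E2 80 94


def try_decode(data: bytes, encoding: str) -> str:
    """Return the decoded text, or a description of the decode error."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError as exc:
        return f"{type(exc).__name__}: {exc}"


# GBK consumes E2 80 as one double-byte character, then fails on 0x94
# because the following space (0x20) is not a valid trail byte.
gbk_result = try_decode(raw, "gbk")
print(gbk_result)

# After replacing the em dash with "--" the line is pure ASCII,
# which GBK (an ASCII-compatible codec) decodes unchanged.
ascii_raw = line.replace("\u2014", "--").encode("ascii")
ascii_roundtrip = try_decode(ascii_raw, "gbk")
print(ascii_roundtrip, end="")
```

Passing an explicit `encoding="utf-8"` when opening the file would also avoid the error, but that change would have to be made in the code that reads the CSV, not in this repository's files.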

Fix: Replace all UTF-8 em dashes (—) with ASCII double dashes (--) in partition CSV comments, making these files pure ASCII. This ensures they can be read under any locale encoding without errors, and the comment meaning remains intact.

Files changed:

  • partitions_4mb.csv: 1 occurrence
  • partitions_16mb.csv: 2 occurrences
  • partitions_8mb.csv: no change needed (already pure ASCII)
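The replacement itself could be scripted rather than done by hand. The sketch below is an illustration under stated assumptions, not the method used in this PR: `asciify_em_dashes` and `is_pure_ascii` are hypothetical helpers, and the sample file content is invented. It works on raw bytes so that line endings are preserved and the result does not depend on the locale encoding.

```python
import tempfile
from pathlib import Path

EM_DASH_UTF8 = "\u2014".encode("utf-8")  # b"\xe2\x80\x94"


def asciify_em_dashes(path: Path) -> int:
    """Replace each UTF-8 em dash in the file with "--"; return the count."""
    raw = path.read_bytes()
    count = raw.count(EM_DASH_UTF8)
    if count:
        path.write_bytes(raw.replace(EM_DASH_UTF8, b"--"))
    return count


def is_pure_ascii(path: Path) -> bool:
    """True if every byte in the file is in the ASCII range."""
    return path.read_bytes().isascii()


# Demonstration on a temporary file with invented content.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "partitions_sample.csv"
    sample.write_text(
        "# Name, Type, SubType, Offset, Size \u2014 example header\n"
        "# app0 \u2014 example firmware slot\n"
        "nvs, data, nvs, 0x9000, 0x5000,\n",
        encoding="utf-8",
    )
    ascii_before = is_pure_ascii(sample)
    replaced = asciify_em_dashes(sample)
    ascii_after = is_pure_ascii(sample)

print(f"replaced {replaced} em dash(es); pure ASCII: {ascii_before} -> {ascii_after}")
```

A check like `is_pure_ascii` could also be run over all partition CSV files to catch any other non-ASCII characters that would trigger the same error.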

@Keralots Keralots merged commit 3b45902 into Keralots:main Jun 29, 2026
@VintageCar VintageCar deleted the fix/gbk-encoding-partition-csv branch June 30, 2026 00:43