
Fix: Solana memo toBytes() corrupts hex-looking memos #33

Merged
mrtnetwork merged 2 commits into mrtnetwork:main from KabaDH:fix/solana-memo-tobytes-utf8
Jun 19, 2026

KabaDH commented Jun 9, 2026

Contributor

Fix: Solana memo toBytes() corrupts hex-looking memos

Summary

MemoLayout.toBytes() encoded the memo with the hybrid StringUtils.toBytes()
(hex-or-UTF-8 by content sniffing), while fromBuffer and toJson treat a memo
as a plain UTF-8 string. Memos whose text happens to look like hex were silently
corrupted, and some produced invalid UTF-8 that the SPL Memo program rejects.

This PR switches the write path to a plain UTF-8 encode, making it symmetric with
the read path and consistent with the Memo program semantics.
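
To make the failure mode concrete, here is a minimal sketch of what content sniffing does to a hex-looking memo. `sniffToBytes` is a hypothetical stand-in written for this illustration; it only mimics the "hex-decode if the text matches the pattern, otherwise UTF-8" rule described above and is not the library's `StringUtils.toBytes`.

```dart
import 'dart:convert';

// Pattern quoted in this PR for memos that get reinterpreted as hex.
final RegExp _hexLike = RegExp(r'^(0x|0X)?([0-9A-Fa-f]{2})+$');

// Hypothetical helper: imitates hex-or-UTF-8 sniffing for illustration only.
List<int> sniffToBytes(String text) {
  if (!_hexLike.hasMatch(text)) return utf8.encode(text);
  final hasPrefix = text.startsWith('0x') || text.startsWith('0X');
  final digits = hasPrefix ? text.substring(2) : text;
  return [
    for (var i = 0; i < digits.length; i += 2)
      int.parse(digits.substring(i, i + 2), radix: 16),
  ];
}

void main() {
  const memo = 'deadbeef';
  final sniffed = sniffToBytes(memo); // raw bytes de ad be ef
  final encoded = utf8.encode(memo); // the eight ASCII characters

  print('sniffed: ${sniffed.length} bytes, UTF-8: ${encoded.length} bytes');

  // The sniffed bytes are not valid UTF-8, so a strict decode throws.
  try {
    utf8.decode(sniffed);
  } on FormatException {
    print('sniffed bytes are not valid UTF-8');
  }

  // The plain UTF-8 path round-trips the literal text.
  print(utf8.decode(encoded) == memo);
}
```

With plain UTF-8 encoding the memo text survives the write and read paths unchanged, which is the symmetry this PR is after.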

Changes

  • lib/solana/src/instructions/memo/layouts/memo.dart
   @override
   List<int> toBytes() {
-    return StringUtils.toBytes(memo);
+    return StringUtils.encode(memo); // UTF-8, symmetric with fromBuffer
   }

Impact

  • Normal text memos (URLs, sentences, anything with non-hex chars or odd length)
    are unaffected — they already used the UTF-8 branch.
  • Behavior changes only for memos matching ^(0x|0X)?([0-9A-Fa-f]{2})+$: the
    literal text is now preserved instead of being reinterpreted as raw bytes.
  • Callers that relied on passing a hex string to embed raw bytes should pass the
    decoded bytes explicitly; that was undocumented behavior that could yield
    invalid UTF-8.
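
As a quick way to see which memos fall into the changed category, the pattern above can be applied directly. This is only a sketch: `isHexLooking` is a name made up for this example, and the sample memos are arbitrary.

```dart
// Pattern quoted in the Impact list above.
final RegExp _hexLooking = RegExp(r'^(0x|0X)?([0-9A-Fa-f]{2})+$');

// Hypothetical helper: true when a memo would previously have been
// reinterpreted as raw bytes instead of being encoded as text.
bool isHexLooking(String memo) => _hexLooking.hasMatch(memo);

void main() {
  const samples = ['hello', 'https://example.com', 'abc', 'cafe', '0xDEADBEEF', ''];
  for (final memo in samples) {
    final verdict = isHexLooking(memo) ? 'behavior changes' : 'unaffected';
    print('"$memo": $verdict');
  }
}
```

Odd-length strings such as `abc` and the empty memo do not match, so they were already on the UTF-8 branch.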

Testing

Added test/solana/tests/memo/layout_test.dart covering round-trip, byte-level
UTF-8, multi-byte characters, empty memo, and toJson consistency. All pass; the
suite fails on the pre-fix code.

StringUtils.toBytes sniffs content and hex-decodes memos matching
^(0x|0X)?([0-9A-Fa-f]{2})+$, breaking the round-trip with fromBuffer
(UTF-8) and producing invalid UTF-8 the SPL Memo program rejects.
Switch to StringUtils.encode (UTF-8) so the write path is symmetric
with the read path and matches Memo program semantics.

Add layout tests covering hex-looking memos, byte-level UTF-8,
multi-byte characters, empty memo, and toJson consistency.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

mrtnetwork commented Jun 18, 2026

Owner

@KabaDH Hi, I think we should change the memo variable to bytes (or a hex string) instead of a UTF-8 string. When non-UTF-8 data is decoded with allowInvalidOrMalformed, re-encoding the decoded value can produce different bytes, which isn't acceptable in situations such as Web3, where the exact byte representation must be preserved.
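
The concern can be reproduced with `dart:convert` alone. The sketch below assumes that the library's `allowInvalidOrMalformed` option behaves like the standard `allowMalformed` flag on `utf8.decode`; `lossyRoundTrip` and `sameBytes` are helper names invented for this example.

```dart
import 'dart:convert';

// Decode leniently, then re-encode: the path a String-backed memo takes.
List<int> lossyRoundTrip(List<int> bytes) {
  final text = utf8.decode(bytes, allowMalformed: true);
  return utf8.encode(text);
}

// Element-wise comparison, since List == in Dart compares identity.
bool sameBytes(List<int> a, List<int> b) {
  if (a.length != b.length) return false;
  for (var i = 0; i < a.length; i++) {
    if (a[i] != b[i]) return false;
  }
  return true;
}

void main() {
  final valid = utf8.encode('gm');
  final invalid = <int>[0xff, 0xfe, 0x41]; // 0xff and 0xfe never occur in UTF-8

  print(sameBytes(valid, lossyRoundTrip(valid)));
  print(sameBytes(invalid, lossyRoundTrip(invalid)));
  print(lossyRoundTrip(invalid));
}
```

Valid UTF-8 survives the round trip, while the malformed input comes back as the encoding of replacement characters, so the original bytes cannot be recovered from the string.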

MemoLayout kept the memo as a String and converted on both sides, which
was lossy: toBytes() hex-sniffed via StringUtils.toBytes (corrupting
hex-looking memos) and fromBuffer decoded with allowInvalidOrMalformed
(replacing non-UTF-8 bytes with U+FFFD). Either way the exact on-chain
byte representation was not preserved, which is unacceptable for Web3
signing/hashing.

Make raw bytes the source of truth:
- store List<int> memoBytes (asImmutableBytes, the codebase convention),
  so neither the input list nor the toBytes() result can mutate it
- MemoLayout.fromString for UTF-8 text, memo getter for best-effort read
- fromBuffer keeps bytes as-is; fromBuffer(toBytes(x)) == x is lossless
- toJson emits best-effort UTF-8 text with a 0x hex fallback, matching
  the ed25519 layout convention, so the view never disagrees with bytes

Add layout tests: string round-trip, byte-level UTF-8, multi-byte chars,
lossless non-UTF-8 round-trip, immutability, memo getter, empty memo,
and toJson text/hex behavior.
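
The shape described above might look roughly like the following. This is a simplified sketch under stated assumptions, not the merged `MemoLayout`: the class name `MemoSketch`, the `'memo'` JSON key, and the use of `List.unmodifiable` in place of the codebase's `asImmutableBytes` are all choices made for this illustration.

```dart
import 'dart:convert';

// Hypothetical bytes-backed memo; raw bytes are the source of truth.
class MemoSketch {
  // Copies the input into an unmodifiable list so later changes to the
  // caller's list cannot alter the stored bytes.
  MemoSketch(List<int> memoBytes)
      : memoBytes = List<int>.unmodifiable(memoBytes);

  // Text input is encoded as plain UTF-8, with no hex sniffing.
  factory MemoSketch.fromString(String text) => MemoSketch(utf8.encode(text));

  // Bytes read from a buffer are kept exactly as they are.
  factory MemoSketch.fromBuffer(List<int> data) => MemoSketch(data);

  final List<int> memoBytes;

  // Best-effort text view; malformed sequences become U+FFFD.
  String get memo => utf8.decode(memoBytes, allowMalformed: true);

  // Returns a fresh copy, so callers cannot mutate the stored bytes.
  List<int> toBytes() => List<int>.of(memoBytes);

  // Text when the bytes are valid UTF-8, otherwise a 0x-prefixed hex string.
  Map<String, dynamic> toJson() {
    try {
      return {'memo': utf8.decode(memoBytes)};
    } on FormatException {
      final hex =
          memoBytes.map((b) => b.toRadixString(16).padLeft(2, '0')).join();
      return {'memo': '0x$hex'};
    }
  }
}

void main() {
  final text = MemoSketch.fromString('deadbeef');
  print(text.toBytes().length); // eight ASCII bytes, not four raw bytes
  print(text.toJson());

  final raw = MemoSketch.fromBuffer([0xff, 0x00, 0x80]);
  print(raw.toBytes());
  print(raw.toJson());
}
```

Because the text view is derived from the bytes rather than stored alongside them, `fromBuffer(toBytes(x))` keeps the same bytes even when they are not valid UTF-8.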

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

KabaDH commented Jun 18, 2026

Contributor Author

@mrtnetwork Done.
MemoLayout now stores raw List<int> memoBytes instead of a String, so the exact bytes are preserved. Added fromString and a memo getter for text; toJson falls back to 0x hex, like the ed25519 layout.
Note: this is a breaking change, since the constructor now takes memoBytes. Tests cover the non-UTF-8 round-trip.

@mrtnetwork mrtnetwork merged commit 486242d into mrtnetwork:main Jun 19, 2026
1 check passed

2 participants