Try to find if a file is corrupt and throw if writing more than 4 GiB #29
Open
See iLCSoft/LCIO#218: the SIO format stores record and block data lengths as 32-bit unsigned integers (4 bytes, max ~4 GiB). These changes try to detect the problem when reading a file and prevent it from happening when writing.
tmadlener
approved these changes
Mar 5, 2026
Contributor
I think this is a good first step. Given that we already have some files with excessive lengths written, I suspect that we have to potentially come back to "properly" fix this, once people get the new error when writing files.
The issue in iLCSoft/LCIO#218 (at least one of them) is that the SIO format stores record and block data lengths as 32-bit unsigned integers (4 bytes, max ~4 GiB). When writing a record bigger than that, the `std::size_t` length is narrowed to `unsigned int` before being written to disk, producing a wrapped (modulo 2^32) value. The data itself is written correctly, but the length fields are wrong. I'm not brave enough to change the format, so I have added some guards to detect the problem and prevent it from happening. If the file format can be changed easily, the lengths could either be widened to 64 bits or be complemented by an additional 32-bit unsigned integer.

BEGINRELEASENOTES
ENDRELEASENOTES
Possibly fix (by not letting it happen) iLCSoft/LCIO#218
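The narrowing described above, and the kind of write-side guard that prevents it, can be sketched as follows. This is a minimal illustration, not the code in this PR: `narrowed_length` and `checked_length` are hypothetical names, and the length is taken as `std::uint64_t` (standing in for a 64-bit `std::size_t`) so the example behaves the same on every platform.

```cpp
#include <cstddef>
#include <cstdint>
#include <limits>
#include <stdexcept>
#include <string>

// Hypothetical helper (not part of SIO): shows what an unchecked narrowing does.
// Converting to an unsigned 32-bit type reduces the value modulo 2^32.
inline std::uint32_t narrowed_length(std::uint64_t length) {
  return static_cast<std::uint32_t>(length);
}

// Hypothetical guard (not the PR's actual code): refuse any length that does
// not fit into a 32-bit length field instead of silently wrapping it.
inline std::uint32_t checked_length(std::uint64_t length) {
  constexpr std::uint64_t max_len = std::numeric_limits<std::uint32_t>::max();
  if (length > max_len) {
    throw std::length_error("length of " + std::to_string(length) +
                            " bytes does not fit into a 32-bit length field (max " +
                            std::to_string(max_len) + ")");
  }
  return static_cast<std::uint32_t>(length);
}
```

For a 5 GiB record (`5ULL << 30` bytes), `narrowed_length` returns 1 GiB (`1u << 30`), which is the kind of wrong length field that ends up on disk, whereas `checked_length` throws `std::length_error` before anything is written.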
Now, when dumping the file, a better error message is printed: