Contributor
Here's a simpler version using the new bytestring builder:

    import qualified Data.ByteString as B
    import qualified Data.ByteString.Lazy as BL
    import qualified Data.ByteString.Builder as B
    import qualified Data.ByteString.Builder.Prim as BP
    import Data.ByteString.Builder.Prim ((>$<), (>*<))
    import Data.Word (Word8)

    -- Escape every NUL byte as the four characters \000; copy all other bytes.
    escapeBSNulls :: B.ByteString -> BL.ByteString
    escapeBSNulls = B.toLazyByteString . BP.primMapByteStringBounded conv
      where
        -- Per-byte encoder: zero bytes take the replacement, the rest pass through.
        conv :: BP.BoundedPrim Word8
        conv = BP.condB (==0) (BP.liftFixedToBounded replacement)
                              (BP.liftFixedToBounded BP.word8)

        -- Ignores its input and always emits the four characters \000.
        replacement :: BP.FixedPrim a
        replacement = const ('\\', ('0', ('0', '0')))
                      >$< BP.char8 >*< BP.char8 >*< BP.char8 >*< BP.char8

In my benchmark on a 13M data file
Contributor
Note that the large factor is likely an effect of using test data with a lot of \0s (an executable file), not of using such a large input (the factor is actually even bigger for a 10k prefix of the same binary test file).
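To judge how strongly a given test input exercises the escaping path, one could measure its share of zero bytes. This is a minimal sketch; `nullFraction` is a hypothetical helper and not part of the patch:

```haskell
import qualified Data.ByteString as B

-- Hypothetical helper (not part of the patch): the fraction of bytes in the
-- input that are 0, i.e. the bytes that take the escaping branch.
nullFraction :: B.ByteString -> Double
nullFraction bs
  | B.null bs = 0  -- avoid dividing by zero on empty input
  | otherwise = fromIntegral (B.count 0 bs) / fromIntegral (B.length bs)
```

Applying it to both the whole file and `B.take 10240` of the same file would show whether the 10k prefix is denser in \0s than the file overall.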
This uses a lot less memory than the concatMap one. In our app there are queries with a length of 10k chars (inserting files into the db), and the concatMap one was creating huge amounts of small bytestrings, which would consume an additional 1 GB of memory and a few extra seconds for GC.
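The concatMap version is not quoted in this thread, so the following is only an assumed reconstruction of its general shape (the name `escapeNullsConcatMap` is hypothetical). It illustrates where the many small bytestrings come from:

```haskell
import qualified Data.ByteString as B
import qualified Data.ByteString.Char8 as BC
import Data.Word (Word8)

-- Assumed shape of a concatMap-based escaper (not the actual patch code).
-- The mapping function returns a separate strict ByteString for every
-- input byte, and B.concatMap then concatenates all of those pieces.
escapeNullsConcatMap :: B.ByteString -> B.ByteString
escapeNullsConcatMap = B.concatMap f
  where
    f :: Word8 -> B.ByteString
    f 0 = BC.pack "\\000"  -- NUL becomes the four characters \000
    f w = B.singleton w    -- every other byte becomes a 1-byte ByteString
```

The builder version avoids these per-byte intermediate values by writing the output bytes directly into larger buffers.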