    keys(LazyJSON.value(json_file))

The call above is asymptotically problematic when json_file contains a large array of strings.
I ran this code:

    using JSON, LazyJSON

    for i in 1:5
        items = i * 10000000
        # Write a JSON object with one short key and one very large array of strings
        open("/tmp/json", "w") do io
            write(io, JSON.json(Dict("a" => "a", "b" => repeat(["test"], items))))
        end
        json_file = open("/tmp/json")
        t = @elapsed collect(keys(LazyJSON.value(json_file)))
        close(json_file)
        println("$items $t")
    end

As the results show, the running time is far from linear (the first column is the number of items in the array, the second column is the time in seconds):
10000000 75.786150509
20000000 317.985342906
30000000 724.489721802
40000000 1305.421886045
50000000 2040.987945434
60000000 2977.542937743
For comparison, JSON.parse on the same files gives:
10000000 8.384795834
20000000 18.123253007
30000000 27.854969659
40000000 38.360378806
50000000 51.391322248
60000000 73.577127605
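
To put a number on "far from linear", here is a rough sketch that estimates the growth exponent from the first and last rows of the two result tables above. It assumes the time follows t ≈ c * n^k and uses only two points, so it is an approximation rather than a proper fit; `growth_exponent` is a helper name introduced here for illustration.

```julia
# Timings copied from the two result tables above (seconds).
items     = [1, 2, 3, 4, 5, 6] .* 10_000_000
lazyjson  = [75.786150509, 317.985342906, 724.489721802,
             1305.421886045, 2040.987945434, 2977.542937743]
jsonparse = [8.384795834, 18.123253007, 27.854969659,
             38.360378806, 51.391322248, 73.577127605]

# If t ≈ c * n^k, then k ≈ log(t_last / t_first) / log(n_last / n_first).
growth_exponent(n, t) = log(t[end] / t[1]) / log(n[end] / n[1])

println("LazyJSON   exponent ≈ ", round(growth_exponent(items, lazyjson), digits=2))
println("JSON.parse exponent ≈ ", round(growth_exponent(items, jsonparse), digits=2))
```

On these numbers the estimate comes out at roughly 2 for LazyJSON and a little above 1 for JSON.parse, which is consistent with quadratic versus near-linear behaviour.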
We did some profiling, and it seems that most of the time is spent in scan_string (lines 478 to 496 in 53c63f0):

    function scan_string(s, i)
        i, c = next_ic(s, i)
        has_escape = false
        while c != '"'
            if isnull(c) || c == IOStrings.ASCII_ETB
                throw(JSON.ParseError(s, i, c, "input incomplete"))
            end
            escape = c == '\\'
            i, c = next_ic(s, i)
            if escape && !(isnull(c) || c == IOStrings.ASCII_ETB)
                has_escape = true
                i, c = next_ic(s, i)
            end
        end
        return i, has_escape
    end
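
For readers who have not looked at this function before, here is a minimal self-contained sketch of the same scanning logic on a plain byte vector. `scan_string_bytes` is a hypothetical stand-in and not part of LazyJSON: it replaces `next_ic` and the `IOStrings.ASCII_ETB` sentinel check with direct indexing and a bounds check, so it only illustrates the control flow, not the library's actual I/O behaviour.

```julia
# Hypothetical stand-in for scan_string, operating on raw bytes.
# `i` is the index of the opening quote. Returns the index of the closing
# quote and whether any backslash escape was seen along the way.
function scan_string_bytes(bytes::AbstractVector{UInt8}, i::Int)
    n = length(bytes)
    has_escape = false
    i += 1                          # step past the opening quote
    while true
        i > n && error("input incomplete")
        c = bytes[i]
        c == UInt8('"') && break    # unescaped quote ends the string
        if c == UInt8('\\')
            has_escape = true
            i += 1                  # skip the escaped character
        end
        i += 1
    end
    return i, has_escape
end

# Scan the JSON string "te\"st" followed by trailing text.
s = codeunits("\"te\\\"st\" tail")
println(scan_string_bytes(s, 1))
```

Each byte of the string is visited once in this sketch, so the cost of a single call grows with the length of the string being scanned.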
