 

Time complexity of JSON serialization/parsing

I have a project that uses JSON as a cross-language serialization format to pass data around. Recently the data has grown fairly large (a list of about 10k objects), and Python's standard json library takes around 20 seconds to serialize it.

I am working on optimizing this. While switching to another JSON serializer (cjson, simplejson, ujson) speeds things up quite a bit, it got me wondering about the time complexity of JSON serialization itself. If the relationship is not linear (say, if it is n^2), I could easily chop the data into chunks and reduce the time significantly.
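To illustrate what I mean by chunking, here is a rough sketch (serialize_in_chunks is a hypothetical helper, assuming the top level of the data is a flat list; the chunks are stitched back into one valid JSON document):

import json

def serialize_in_chunks(items, chunk_size=1000):
    # Hypothetical helper: serialize a flat list in fixed-size chunks.
    # Only worthwhile if dumps() were super-linear in the input size.
    parts = []
    for start in range(0, len(items), chunk_size):
        # Drop each chunk's surrounding brackets so the pieces can be joined.
        parts.append(json.dumps(items[start:start + chunk_size])[1:-1])
    return "[" + ",".join(p for p in parts if p) + "]"

# Round-trip check on a small sample:
data = [{"id": i, "value": i * 0.5} for i in range(25)]
assert json.loads(serialize_in_chunks(data, chunk_size=10)) == data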

My guess is that the complexity really depends on the input data, but is there a worst-case/average-case estimate available? A link to a reference would be highly appreciated too.

Thanks.

asked Oct 15 '25 by xbtsw


1 Answer

I've benchmarked the time complexity with this code:

import json
import random
import time

# Serialize lists of random floats of increasing length and time each run.
Ns = 10, 100, 1000, 10000, 100000, 200000, 300000, 600000, 1000000
for N in Ns:
    data = [random.random() for _ in range(N)]
    t0 = time.time()
    s = json.dumps(data)
    t1 = time.time()
    print("%s %s" % (N, t1 - t0))

On my machine, the outcome is:

10 7.20024108887e-05
100 0.000385999679565
1000 0.00362801551819
10000 0.036504983902
100000 0.366562128067
200000 0.73614192009
300000 1.09785795212
600000 2.20272803307
1000000 3.6590487957

First column: list length; second column: serialization time in seconds. Plotting the data (with, for example, xmgrace) reveals a near-perfectly linear relationship.
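If you don't have a plotting tool handy, you can also check linearity numerically: for a linear cost, the per-item time dt/N should settle to a roughly constant value. Here is a minimal sketch along the lines of the benchmark above, which also covers json.loads for the parsing direction mentioned in the title (the printed numbers will of course depend on your machine):

import json
import random
import time

# If the cost is linear, time/N settles to a constant per-item cost,
# for both dumps() and loads().
for N in (10000, 100000, 1000000):
    data = [random.random() for _ in range(N)]

    t0 = time.perf_counter()
    s = json.dumps(data)
    t_dump = time.perf_counter() - t0

    t0 = time.perf_counter()
    json.loads(s)
    t_load = time.perf_counter() - t0

    print("N=%8d  dumps: %.3e s/item  loads: %.3e s/item"
          % (N, t_dump / N, t_load / N))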

answered Oct 17 '25 by Dr. Jan-Philip Gehrcke


