
Create nested JSON from flat csv

I'm trying to create a four-level nested JSON structure from a CSV file like this example:

Region,Company,Department,Expense,Cost
Gondwanaland,Bobs Bits,Operations,nuts,332
Gondwanaland,Bobs Bits,Operations,bolts,254
Gondwanaland,Maureens Melons,Operations,nuts,123

At each level I would like to sum the costs and include the total in the output JSON at that level.

The structure of the outputted JSON should look something like this:

    {
          "id": "aUniqueIdentifier", 
          "name": "usually a nodes name", 
          "data": [
                {
                      "key": "some key", 
                      "value": "some value"
                }, 
                {
                      "key": "some other key", 
                      "value": "some other value"
                }
          ], 
          "children": [/* other nodes or empty */ ]
    }

(REF: http://blog.thejit.org/2008/04/27/feeding-json-tree-structures-to-the-jit/)

I'm thinking along the lines of a recursive function in Python, but have not had much success with this approach so far. Any suggestions for a quick and easy solution would be greatly appreciated.

UPDATE: I'm gradually giving up on the idea of the summarised costs because I just can't figure it out (I'm not much of a Python coder yet). Simply being able to generate the formatted JSON would be good enough, and I can plug in the numbers later if I have to.

I have been reading and googling for a solution and have learnt a lot along the way, but still have had no success in creating my nested JSON files from the above CSV structure. There must be a simple solution somewhere on the web. Maybe somebody else has had more luck with their search terms?

spadeisaspade asked Jan 22 '26 07:01


1 Answer

Here are some hints.

Parse the input to a list of lists with csv.reader:

>>> rows = list(csv.reader(source.splitlines()))
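As a minimal sketch of that parsing step, assuming `source` simply holds the CSV text from the question (the name is a placeholder, not something the question defines), the header can be split off from the data rows like this:

```python
import csv

# Assumption: `source` stands in for the CSV text shown in the question.
source = """\
Region,Company,Department,Expense,Cost
Gondwanaland,Bobs Bits,Operations,nuts,332
Gondwanaland,Bobs Bits,Operations,bolts,254
Gondwanaland,Maureens Melons,Operations,nuts,123
"""

# csv.reader accepts any iterable of lines, so splitlines() is enough here.
rows = list(csv.reader(source.splitlines()))

# The first row is the header; everything after it is data.
header, records = rows[0], rows[1:]

print(header)
print(records[0])
```

Every field comes back as a string, so the cost column still needs converting before it can be summed.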

Loop over the list to build up your dictionary and summarize the costs. Depending on the structure you're looking to create, the build-up might look something like this:

>>> summary = {}
>>> for region, company, department, expense, cost in rows[1:]:
...     departments = summary.setdefault(region, {}).setdefault(company, {})
...     departments.setdefault(department, []).append((expense, cost))
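For the per-level totals the question asks about, one possible approach is to add each cost to every prefix of its row (region, then region plus company, and so on). This is only a sketch: `records` and `totals` are illustrative names, and it assumes the costs are whole numbers, as they are in the sample.

```python
from collections import defaultdict

# Assumption: `records` mirrors the data rows from the question, header removed.
records = [
    ["Gondwanaland", "Bobs Bits", "Operations", "nuts", "332"],
    ["Gondwanaland", "Bobs Bits", "Operations", "bolts", "254"],
    ["Gondwanaland", "Maureens Melons", "Operations", "nuts", "123"],
]

totals = defaultdict(int)
for *path, cost in records:
    # Add the cost to every prefix of the path, one prefix per nesting level.
    for depth in range(1, len(path) + 1):
        totals[tuple(path[:depth])] += int(cost)  # assumes integer costs

for key, total in sorted(totals.items()):
    print(" / ".join(key), total)
```

The tuple keys are convenient for lookups but cannot be written out with json.dump directly, so they would need to be folded into a nested structure first.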

Write the result out with json.dump:

>>> with open('dest.json', 'w') as f:
...     json.dump(summary, f)

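Hopefully, the recursive function below will help get you started. It builds a tree from the input. Note that itertools.groupby only groups adjacent rows, so rows sharing a key must sit next to each other, as they do in the sample. Also decide what type you want the leaves (the "cost" values) to be; csv.reader returns them as strings. You'll need to elaborate on the function to build up the exact structure you intend: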

import csv, itertools, json

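def cluster(rows):
    # Group adjacent rows by their first column, then recurse on the remaining columns.
    result = []
    for key, group in itertools.groupby(rows, key=lambda r: r[0]):
        group_rows = [row[1:] for row in group]
        if len(group_rows[0]) == 2:
            # Only (expense, cost) pairs remain: store them as a leaf dict.
            result.append({key: dict(group_rows)})
        else:
            result.append({key: cluster(group_rows)})
    return result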

if __name__ == '__main__':
    s = '''\
Gondwanaland,Bobs Bits,Operations,nuts,332
Gondwanaland,Bobs Bits,Operations,bolts,254
Gondwanaland,Maureens Melons,Operations,nuts,123
'''
    rows = list(csv.reader(s.splitlines()))
    r = cluster(rows)
    print(json.dumps(r, indent=4))
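
One way to elaborate on this towards the id/name/data/children layout from the question, with the summed cost at each level, is sketched below. Treat it as a starting point rather than a finished solution: `build_tree`, the "node1", "node2", ... id scheme, the "root" top node and the "cost" key are all assumptions made for illustration, and the costs are assumed to be whole numbers. It groups with a plain dict instead of itertools.groupby, so the rows do not have to be sorted.

```python
import csv
import json
from itertools import count

def build_tree(records, name="root", ids=None):
    """Return one node for `records`; the last column of each row is the cost."""
    ids = count(1) if ids is None else ids
    # Hypothetical id scheme: a running counter shared by the whole tree.
    node = {"id": "node%d" % next(ids), "name": name, "data": [], "children": []}
    if len(records[0]) == 1:
        # Only the cost column is left, so this node is a leaf.
        total = sum(int(row[0]) for row in records)  # assumes integer costs
    else:
        # Group rows by their first column, keeping first-seen order.
        groups = {}
        for first, *rest in records:
            groups.setdefault(first, []).append(rest)
        total = 0
        for key, group in groups.items():
            child = build_tree(group, key, ids)
            node["children"].append(child)
            total += child["data"][0]["value"]
    node["data"].append({"key": "cost", "value": total})
    return node

source = """\
Gondwanaland,Bobs Bits,Operations,nuts,332
Gondwanaland,Bobs Bits,Operations,bolts,254
Gondwanaland,Maureens Melons,Operations,nuts,123
"""
records = list(csv.reader(source.splitlines()))
tree = build_tree(records)
print(json.dumps(tree, indent=4))
```

Each node's total is computed from its children, so the sums stay consistent however deep the nesting goes; if the CSV has a single top-level region you may prefer to use `tree["children"][0]` as the root.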
Raymond Hettinger answered Jan 24 '26 19:01


