Python idiom for creating dict of dict of list

Question

Given this data:

foo kk type1 1 2 3
bar kk type2 3 5 1

I would like to create a dictionary of dictionaries of lists.

In Perl this is called a hash of hash of array. It can be achieved with the following line (executable here: https://eval.in/118535):

push @{$hohoa{$name}{$type}},($v1,$v2,$v3);

Output of $hohoa in Perl:

$VAR1 = {
          'bar' => {
                     'type2' => [
                                  '3',
                                  '5',
                                  '1'
                                ]
                   },
          'foo' => {
                     'type1' => [
                                  '1',
                                  '2',
                                  '3'
                                ]
                   }
        };

What's the way to do it in Python?

Update: Why doesn't the following for-loop variation store all the values?

#!/usr/bin/env python

import sys
import pprint
from collections import defaultdict

outerdict = defaultdict(dict)
with open('data.txt') as infh:
    for line in infh:
        name, _, type_, values = line.split(None, 3)

        valist = values.split();
        for i in range(len(valist)):
            thval = valist[i];
            outerdict[name][type] = thval

pp = pprint.PrettyPrinter(indent=4)
pp.pprint(outerdict)

It prints this:

defaultdict(<type 'dict'>, {'foo': {<type 'type'>: '3'}, 'bar': {<type 'type'>: '1'}})

Update 2: The output seems problematic when the data looks like this:

foo kk type1 1.2 2.10 3.3
bar kk type2 3.2 5.2 1.0

Martijn Pieters · Accepted Answer

It depends on what you are trying to achieve; how many keys should be added to the inner dict?

The simplest way is to just create new dict literals for the inner dict:

outerdict = {}
outerdict[name] = {type_: [v1, v2, v3]}

or you could use dict.setdefault() to materialize the inner dict as needed:

outerdict.setdefault(name, {})[type_] = [v1, v2, v3]
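
The difference between the first two options only shows up when a name repeats. A minimal sketch, assuming a third row for 'foo' that is not in the question's data (it is invented here purely for illustration):

```python
# Rows based on the question's sample data; the last row is hypothetical
# and only added to show what happens when a name repeats.
rows = [
    ("foo", "type1", [1, 2, 3]),
    ("bar", "type2", [3, 5, 1]),
    ("foo", "type9", [7, 8, 9]),  # hypothetical extra row
]

# Plain assignment replaces the whole inner dict for a repeated name.
replaced = {}
for name, type_, values in rows:
    replaced[name] = {type_: values}

# setdefault() creates the inner dict only when it is missing,
# so inner keys accumulate across rows.
merged = {}
for name, type_, values in rows:
    merged.setdefault(name, {})[type_] = values

print(replaced["foo"])  # only the 'type9' entry survives
print(merged["foo"])    # both 'type1' and 'type9' are present
```

If each name appears only once in the data, both forms give the same result.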

or you could use collections.defaultdict() to have it handle new values for you:

from collections import defaultdict

outerdict = defaultdict(dict)
outerdict[name][type_] = [v1, v2, v3]

When parsing a file line by line, I'd use the last option, slightly simplified:

from collections import defaultdict

outerdict = defaultdict(dict)
with open(filename) as infh:
    for line in infh:
        name, _, type_, *values = line.split()
        outerdict[name][type_] = [int(i) for i in values]

This uses Python 3 syntax to capture the remaining whitespace-delimited values on the line past the first 3 into values.
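
To try the Python 3 version without a file on disk, here is a small sketch that feeds the question's two sample lines through io.StringIO. The in-memory file is an assumption for the sake of a self-contained example; a real script would keep the open(filename) call.

```python
import io
from collections import defaultdict

# In-memory stand-in for the data file (the two sample lines from the question).
sample = io.StringIO("foo kk type1 1 2 3\nbar kk type2 3 5 1\n")

outerdict = defaultdict(dict)
for line in sample:
    # Star-unpacking collects every field after the third into a list.
    name, _, type_, *values = line.split()
    outerdict[name][type_] = [int(v) for v in values]

# Convert to a plain dict so the printed form is easy to read.
result = dict(outerdict)
print(result)
```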

The Python 2 version would be:

with open(filename) as infh:
    for line in infh:
        name, _, type_, values = line.split(None, 3)
        outerdict[name][type_] = map(int, values.split())

Here the whitespace split is limited to 3 splits (giving you 4 values), and the values string is then split separately.
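
For decimal data like that in the question's second update, one likely stumbling block is the int() conversion, which raises ValueError for strings such as '1.2'. A minimal sketch, assuming that is the problem, converts with float() instead (written for Python 3, with an in-memory file standing in for the real one):

```python
import io
from collections import defaultdict

# In-memory stand-in for the data file (sample lines from the second update).
sample = io.StringIO("foo kk type1 1.2 2.10 3.3\nbar kk type2 3.2 5.2 1.0\n")

# int() cannot parse a decimal string directly.
try:
    int("1.2")
except ValueError as exc:
    print("int() rejects decimal strings:", exc)

outerdict = defaultdict(dict)
for line in sample:
    name, _, type_, values = line.split(None, 3)
    # float() accepts both '3' and '3.2', so it also covers integer data.
    outerdict[name][type_] = [float(v) for v in values.split()]

print(dict(outerdict))
```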

To have the inner-most list accumulate all values for repeated (name, type_) key combinations, you'll need to use a slightly more complex defaultdict setup; one that produces an inner defaultdict() set to produce list values:

outerdict = defaultdict(lambda: defaultdict(list))
with open(filename) as infh:
    for line in infh:
        name, _, type_, values = line.split(None, 3)
        outerdict[name][type_].extend(map(int, values.split()))
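
The loop from the question's update can be repaired along the same lines. It used the built-in type as the key instead of the parsed type_ field, and it assigned each value rather than appending it, so only the last value survived. A sketch of the explicit-loop form, assuming a repeated (name, type_) row that is not in the original data:

```python
import io
from collections import defaultdict

# Sample lines from the question plus one hypothetical repeated row.
sample = io.StringIO(
    "foo kk type1 1 2 3\n"
    "bar kk type2 3 5 1\n"
    "foo kk type1 4 5\n"  # hypothetical repeat of (foo, type1)
)

outerdict = defaultdict(lambda: defaultdict(list))
for line in sample:
    name, _, type_, values = line.split(None, 3)
    for value in values.split():
        # Key on type_ (the parsed field), not the built-in type, and
        # append so earlier values are kept instead of overwritten.
        outerdict[name][type_].append(int(value))

# Convert the nested defaultdicts to plain dicts for display.
plain = {name: dict(inner) for name, inner in outerdict.items()}
print(plain)
```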

For the file you actually posted, I'd use a different approach altogether:

import csv
from collections import defaultdict
from itertools import islice

outerdict = defaultdict(lambda: defaultdict(list))

with open('ImmgenCons_all_celltypes_MicroarrayExp.csv', 'rb') as infh:
    reader = csv.reader(infh, skipinitialspace=True)
    # first row contains metadata we need
    celltypes = next(reader, [])[3:]

    # next two rows can be skipped
    next(islice(infh, 2, 2), None)

    for row in reader:
        name = row[1]
        for celltype, value in zip(celltypes, row[3:]):
            outerdict[name][celltype].append(float(value))
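
The code above opens the file in 'rb' mode, which is the Python 2 convention for the csv module. A rough Python 3 sketch of the same idea follows. The sample data is entirely invented and only mimics the layout the code assumes (a header row with cell types from the fourth column on, two rows to skip, the name in the second column). With a real file you would use open(path, newline='') instead of io.StringIO.

```python
import csv
import io
from collections import defaultdict
from itertools import islice

# Hypothetical stand-in for the CSV file; names and numbers are invented.
sample = io.StringIO(
    "id, gene, desc, cellA, cellB\n"
    "skip, this, row, x, y\n"
    "skip, this, too, x, y\n"
    "1, Abc1, first, 1.5, 2.5\n"
    "2, Abc1, second, 3.0, 4.0\n"
    "3, Xyz2, third, 0.5, 0.25\n"
)

outerdict = defaultdict(lambda: defaultdict(list))
reader = csv.reader(sample, skipinitialspace=True)

# First row holds the cell type names from the fourth column onward.
celltypes = next(reader, [])[3:]

# Skip the next two rows by consuming them from the reader itself.
for _ in islice(reader, 2):
    pass

for row in reader:
    name = row[1]
    for celltype, value in zip(celltypes, row[3:]):
        outerdict[name][celltype].append(float(value))

plain = {name: dict(inner) for name, inner in outerdict.items()}
print(plain)
```

Skipping rows through the reader, rather than through the underlying file object, keeps all reads going through one iterator.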

Tags:

python

pdubois