python multidimensional boolean array?

Question

It would contain at most 1000 x 1000 x 1000 elements, which is too big for a Python dictionary.

With a dict holding around 30 x 1000 x 1000 elements, my machine had already used 2 GB of memory and ground to a halt.

Are there any modules that can handle a 3-dimensional array whose values are only True/False? I checked bitarray (http://pypi.python.org/pypi/bitarray), which seems reasonable and is written in C, but it looks more like a bit stream than an array, since it supports only one dimension.

EnricoGiampieri · Accepted Answer

numpy is your friend:

import numpy as np
a = np.zeros((1000,1000,1000), dtype=bool)
a[1,10,100] = True

This keeps the memory footprint small: NumPy stores each bool in a single byte, so the full array takes about 1 GB.
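If you want to check the footprint yourself, or shrink it further, something like the following should work. It is only a sketch on a 100 x 100 x 100 array (the shape and variable names are illustrative, not from the question); np.packbits stores eight flags per byte, at the cost of less convenient indexing.

```python
import numpy as np

shape = (100, 100, 100)          # illustrative; scale up to (1000, 1000, 1000)
a = np.zeros(shape, dtype=bool)
a[1, 10, 99] = True

# nbytes reports the size of the array's data buffer in bytes.
print(a.nbytes)

# Pack eight booleans into each byte along the last axis.
packed = np.packbits(a, axis=-1)
print(packed.shape, packed.nbytes)

# Unpack again, trimming the padding bits, to recover the original array.
restored = np.unpackbits(packed, axis=-1, count=shape[-1]).astype(bool)
print(np.array_equal(a, restored))
```

The packed form is mainly useful for storage; for everyday indexing and slicing the plain bool array is much easier to work with.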

EDIT:

If you really need a dictionary, you can also look at the defaultdict container in the collections module: missing keys read as the default value, so only the cells you actually access need to be stored. That only helps if the True cells are sparse. Unless a dict is really a must, use numpy.
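One thing to watch with defaultdict is that reading a missing key with square brackets inserts it. A minimal sketch (the name cells is just for illustration):

```python
from collections import defaultdict

cells = defaultdict(bool)        # missing keys read as False
cells[1, 10, 100] = True

print(len(cells))                # only the cell set so far is stored

# Reading with [] inserts the default, so the dict grows on every miss.
print(cells[0, 0, 0], len(cells))

# .get() and "in" look a key up without inserting it.
print(cells.get((5, 5, 5), False), (5, 5, 5) in cells, len(cells))
```

If most lookups are misses, using .get() or "in" keeps the dict from filling up with False entries.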

abarnert · Answer

numpy has already been suggested by EnricoGiampieri, and if you can use this, you should.

Otherwise, there are two choices:

A jagged array, as suggested by NPE, would be a list of lists of bitarrays. This allows you to have jagged bounds: each row could be a different width, or even independently resizable:

import bitarray

bits3d = [[bitarray.bitarray(1000) for y in range(1000)] for x in range(1000)]
myvalue = bits3d[x][y][z]

Alternatively, as suggested by Xymostech, do your own indexing on a 1-D array:

bits3d = bitarray.bitarray(1000*1000*1000)
myvalue = bits3d[x + y*1000 + z*1000*1000]
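With all three sizes equal to 1000 it is easy to miss which size multiplies which coordinate: y is scaled by the x size, and z by the x size times the y size. A small round-trip check on deliberately unequal dimensions makes that visible (NX, NY, NZ and the function names are made up for this sketch):

```python
NX, NY, NZ = 4, 3, 2             # small illustrative dimensions

def flat_index(x, y, z):
    """Map (x, y, z) to a position in a 1-D sequence of NX * NY * NZ items."""
    return x + y * NX + z * NX * NY

def coords(i):
    """Inverse of flat_index."""
    z, rest = divmod(i, NX * NY)
    y, x = divmod(rest, NX)
    return x, y, z

# Walk the grid with x varying fastest and collect the flat positions.
indices = [flat_index(x, y, z)
           for z in range(NZ) for y in range(NY) for x in range(NX)]
print(indices == list(range(NX * NY * NZ)))   # each slot is hit exactly once
print(coords(flat_index(3, 2, 1)))
```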

Either way, you'd probably want to wrap this up in a class, so you can do this:

bits3d = Fixed3DBitArray(1000, 1000, 1000)  # or Jagged3DBitArray
myvalue = bits3d[x, y, z]

That's as easy as:

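from bitarray import bitarray

def zeros(n):
    # bitarray(n) is not guaranteed to start zeroed in every release,
    # so clear it explicitly.
    b = bitarray(n)
    b.setall(False)
    return b

class Jagged3DBitArray(object):
    """3-D bit array stored as a list of lists of bitarrays."""
    def __init__(self, xsize, ysize, zsize):
        self.lll = [[zeros(zsize) for y in range(ysize)]
                    for x in range(xsize)]
    def __getitem__(self, key):
        x, y, z = key
        return self.lll[x][y][z]
    def __setitem__(self, key, value):
        x, y, z = key
        self.lll[x][y][z] = value

class Fixed3DBitArray(object):
    """3-D bit array stored in one flat bitarray, with x varying fastest."""
    def __init__(self, xsize, ysize, zsize):
        self.xsize, self.ysize, self.zsize = xsize, ysize, zsize
        self.b = zeros(xsize * ysize * zsize)
    def __getitem__(self, key):
        x, y, z = key
        # y is scaled by the x size, z by the size of a full x-y plane.
        return self.b[x + y * self.xsize + z * self.xsize * self.ysize]
    def __setitem__(self, key, value):
        x, y, z = key
        self.b[x + y * self.xsize + z * self.xsize * self.ysize] = value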

Of course if you want more functionality (like slicing), you have to write a bit more.
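Bounds checking is one example of that extra functionality. With a flat layout, an out-of-range x does not raise an error; it silently lands on a different cell. Here is a rough sketch of what the check could look like. To keep it runnable without bitarray installed, it packs the bits into a stdlib bytearray; the class name BitGrid3D and its attributes are hypothetical.

```python
class BitGrid3D:
    """Hypothetical stdlib-only 3-D bit grid: bits packed into a bytearray."""

    def __init__(self, xsize, ysize, zsize):
        self.shape = (xsize, ysize, zsize)
        nbits = xsize * ysize * zsize
        self.data = bytearray((nbits + 7) // 8)   # zero-initialised

    def _locate(self, key):
        """Return (byte offset, bit offset) for key, checking every axis."""
        x, y, z = key
        for value, size in zip((x, y, z), self.shape):
            if not 0 <= value < size:
                raise IndexError("index %r out of range for shape %r"
                                 % (key, self.shape))
        xsize, ysize, _ = self.shape
        return divmod(x + y * xsize + z * xsize * ysize, 8)

    def __getitem__(self, key):
        byte, bit = self._locate(key)
        return bool(self.data[byte] >> bit & 1)

    def __setitem__(self, key, value):
        byte, bit = self._locate(key)
        if value:
            self.data[byte] |= 1 << bit
        else:
            self.data[byte] &= ~(1 << bit) & 0xFF

grid = BitGrid3D(10, 20, 30)
grid[9, 19, 29] = True
print(grid[9, 19, 29], grid[0, 0, 0], len(grid.data))
```

This version rejects negative indices as well; supporting them, or slices, would take more code in the same spirit.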

The jagged array will use a bit more memory (after all, you have the overhead of 1M bitarray objects and 1K list objects), and may be a bit slower, but this usually won't make much difference.

The important deciding factor should be whether it's inherently an error for your data to have jagged rows. If so, use the second solution; if it might be useful to have jagged or resizable rows, use the former. (Keeping in mind that I'd use numpy over either solution, if at all possible.)

Tags:

python

arrays

boolean

thkang
