it would contain at most 1000 x 1000 x 1000 elements, which is too big for python dictionary.
with dict, around 30 x 1000 x 1000 elements, on my machine it already consumed 2gb of memory and everything got stoned.
any modules that can handle 3-dimension array whose value would be only True/False? I check bitarray http://pypi.python.org/pypi/bitarray, which seems reasonable and coded in C, however it seems more like a bit-stream instead of an array, since it supports only 1 dimension.
numpy is your friend:
import numpy as np
a = np.zeros((1000,1000,1000), dtype=bool)
a[1,10,100] = True
Has a memory footprint as little as possible.
If you really need you can also look at the defaultdict class container in the collections module, which doesn't store the values that are of the default value. But if it's not really a must, use numpy.
numpy has already been suggested by EnricoGiampieri, and if you can use this, you should.
Otherwise, there are two choices:
A jagged array, as suggested by NPE, would be a list of list of bitarrays. This allows you to have jagged bounds—e.g., each row could be a different width, or even independently resizable:
bits3d = [[bitarray.bitarray(1000) for y in range(1000)] for x in range(1000)]
myvalue = bits3d[x][y][z]
Alternatively, as suggested by Xymostech, do your own indexing on a 1-D array:
bits3d = bitarray.bitarray(1000*1000*1000)
myvalue = bits3d[x + y*1000 + z*1000*1000]
Either way, you'd probably want to wrap this up in a class, so you can do this:
bits3d = BitArray(1000, 1000, 1000)
myvalue = bits3d[x, y, z]
That's as easy as:
class Jagged3DBitArray(object):
def __init__(self, xsize, ysize, zsize):
self.lll = [[bitarray(zsize) for y in range(ysize)]
for x in range(xsize)]
def __getitem__(self, key):
x, y, z = key
return self.lll[x][y][z]
def __setitem__(self, key, value):
x, y, z = key
self.lll[x][y][z] = value
class Fixed3DBitArray(object):
def __init__(self, xsize, ysize, zsize):
self.xsize, self.ysize, self.zsize = xsize, ysize, zsize
self.b = bitarray(xsize * ysize * zsize)
def __getitem__(self, key):
x, y, z = key
return self.b[x + y * self.ysize + z * self.ysize * self.zsize]
def __setitem__(self, key, value):
x, y, z = key
self.b[x + y * self.ysize + z * self.ysize * self.zsize] = value
Of course if you want more functionality (like slicing), you have to write a bit more.
The jagged array will use a bit more memory (after all, you have the overhead of 1M bitarray objects and 1K list objects), and may be a bit slower, but this usually won't make much difference.
The important deciding factor should be whether it's inherently an error for your data to have jagged rows. If so, use the second solution; if it might be useful to have jagged or resizable rows, use the former. (Keeping in mind that I'd use numpy over either solution, if at all possible.)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With