I am trying to get the difference between each pair of elements of two different numpy arrays (size: 1 x 150000). Here is the code I used:
# the input numpy arrays are a and b
c = a - b.reshape((-1,1))
# For example, with a = np.array([1, 3, 7, 6]) and b = np.array([1, 3, 7, 6]):
# c = array([[ 0,  2,  6,  5],
#            [-2,  0,  4,  3],
#            [-6, -4,  0, -1],
#            [-5, -3,  1,  0]])
I understand why this code raises a MemoryError. How can I change the code so that I don't get the error?
I tried using itertools.combinations_with_replacement too, but I still could not get the desired result.
You're trying to create a 150000 x 150000 array. I'm not sure which dtype you used, but assuming int32 (4 bytes per number) and neglecting the overhead of the array, you are trying to allocate:
>>> 150000 * 150000 * 4 # bytes
90000000000
which translates to
>>> 150000 * 150000 * 4 / 1024 / 1024 / 1024 # Gigabytes
83.82
So unless you have about 84 GB of free RAM, that operation raises a MemoryError.
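The same estimate can be made for any pair of arrays before attempting the subtraction. This is only a minimal sketch: the helper name pairwise_nbytes is made up for illustration, and it assumes both inputs are one-dimensional NumPy arrays.

```python
import numpy as np

def pairwise_nbytes(a, b):
    """Bytes needed for the full result of a - b.reshape((-1, 1))."""
    # dtype NumPy would pick for the result of a - b
    result_dtype = np.result_type(a, b)
    # One entry per (element of b, element of a) pair
    return a.size * b.size * result_dtype.itemsize

# Assumed inputs: two int32 arrays with 150000 elements each
a = np.zeros(150000, dtype=np.int32)
b = np.zeros(150000, dtype=np.int32)

print(pairwise_nbytes(a, b))            # bytes
print(pairwise_nbytes(a, b) / 1024**3)  # same value divided down as above
```

With a 64-bit dtype such as int64 or float64 the requirement doubles, because each element takes 8 bytes instead of 4.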
With itertools it's even worse, because you need a list that holds one pointer per element (8 bytes each on a 64-bit machine), and depending on your Python version and platform each integer object takes roughly 20-30 bytes:
>>> import sys
>>> sys.getsizeof(1)
28
Essentially this would lead to a RAM requirement of:
>>> (28 + 8) * 150000 * 150000 # bytes
810000000000
>>> (28 + 8) * 150000 * 150000 / 1024 / 1024 / 1024 # Gigabytes
754.37
If you have enough hard disk storage, you could compute the differences between one element of the first array and all elements of the second array, save that result to disk, then move on to the next element of the first array, and so on. However, depending on how you store the values (for example as text), the result may take up even more space on disk than it would in memory. And this only lets you calculate the values: you still won't be able to hold them all in RAM at once.
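The disk-based idea can be sketched with a disk-backed array that is filled a block of rows at a time. This is one possible way to do it, not the only one: np.memmap stores the raw binary values, and the function name pairwise_diff_to_disk, the rows_per_block parameter and the file name are all made up for illustration.

```python
import os
import tempfile
import numpy as np

def pairwise_diff_to_disk(a, b, path, rows_per_block=1000):
    """Write a - b.reshape((-1, 1)) to a disk-backed array, block by block."""
    dtype = np.result_type(a, b)
    # Creates (or overwrites) a binary file large enough for the full result
    out = np.memmap(path, dtype=dtype, mode="w+", shape=(b.size, a.size))
    for start in range(0, b.size, rows_per_block):
        stop = min(start + rows_per_block, b.size)
        # Only a (stop - start) x a.size block is computed at a time
        out[start:stop] = a - b[start:stop].reshape((-1, 1))
    out.flush()  # make sure everything is written to the file
    return out

# Small demonstration with hypothetical inputs
a = np.array([1, 3, 7, 6])
b = np.array([1, 3, 7, 6])
path = os.path.join(tempfile.mkdtemp(), "diff.dat")
c = pairwise_diff_to_disk(a, b, path, rows_per_block=2)
print(np.array(c))
```

For two int32 arrays of 150000 elements the file itself would still be about 84 GB, so this only moves the storage problem from RAM to disk.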
The most straightforward solution for the MemoryError is to buy more RAM (you can calculate how much you need based on the above numbers). If that's not an option, you need to change your approach.
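What a different approach looks like depends on what the differences are needed for, which the question doesn't say. If only a summary of them is needed, one option is to generate the differences in blocks and reduce each block right away, so the full matrix never exists. The sketch below assumes, purely for illustration, that the goal is to count pairs whose absolute difference is at most 1; iter_diff_blocks and rows_per_block are made-up names.

```python
import numpy as np

def iter_diff_blocks(a, b, rows_per_block=1000):
    """Yield a - b.reshape((-1, 1)) as consecutive blocks of rows."""
    for start in range(0, b.size, rows_per_block):
        yield a - b[start:start + rows_per_block].reshape((-1, 1))

# Small demonstration with hypothetical inputs
a = np.array([1, 3, 7, 6])
b = np.array([1, 3, 7, 6])

close_pairs = 0
for block in iter_diff_blocks(a, b, rows_per_block=2):
    # Reduce the block immediately; it can be discarded afterwards
    close_pairs += int(np.count_nonzero(np.abs(block) <= 1))

print(close_pairs)
```

Peak memory is then governed by rows_per_block times the length of a, not by the full 150000 x 150000 result.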