Given the following code:
import numpy as np
c = [
[1, 2, 3],
[4, 5, 6],
[7, 8, 9]
]
c = np.array(c)
print((c * c.transpose()).prod())
On my Windows machine it returns "-1462091776" (Not sure how it got a negative from all those positives). On Ubuntu it returns "131681894400".
Anyone know what's going on here?
Edit: Apparently this is an overflow problem (thanks @rafaelc!), and it is reproducible (thanks also to @richardec for testing that).
So now the question becomes: is this a bug I should report? Who do I report it to?
I have enough comments that I think an "answer" is warranted.
Not sure how it got a negative from all those positives
As @rafaelc points out, you ran into an integer overflow. You can read more details at the Wikipedia link that was provided.
According to this thread, numpy uses the operating system's C long type as the default dtype for integers. So when you write this line of code:
c = np.array(c)
The dtype defaults to numpy's default integer data type, which is the operating system's C long. The size of a long in Microsoft's C implementation for Windows is 4 bytes (x 8 bits/byte = 32 bits), so your dtype defaults to a 32-bit integer.
In [1]: import numpy as np
In [2]: np.iinfo(np.int32)
Out[2]: iinfo(min=-2147483648, max=2147483647, dtype=int32)
The largest number a 32-bit signed integer data type can represent is 2147483647. If you take a look at your product across just one axis:
In [5]: c * c.T
Out[5]:
array([[ 1, 8, 21],
[ 8, 25, 48],
[21, 48, 81]])
In [6]: (c * c.T).prod(axis=0)
Out[6]: array([ 168, 9600, 81648])
In [7]: 168 * 9600 * 81648
Out[7]: 131681894400
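You can see that 131681894400 >> 2147483647 (in mathematics, the notation >> means "is much greater than"). Since 131681894400 is much larger than the maximum integer the 32-bit long can represent, an overflow occurs: the result wraps around modulo 2^32 and is reinterpreted as a signed value, which is how a product of positive numbers ends up negative.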
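To see the wraparound concretely, here is a minimal sketch that forces 32-bit arithmetic regardless of platform. Passing dtype=np.int32 to both np.array and prod is my own choice for the illustration (it is not in the original code); it is there so the overflow reproduces on Linux as well as Windows.

```python
import numpy as np

# Build the matrix explicitly as 32-bit integers (what the question got by default on Windows)
c = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]], dtype=np.int32)

# Force the accumulator to 32 bits as well, so the result is the same on any platform
wrapped = (c * c.T).prod(dtype=np.int32)

# The true product, computed with Python's arbitrary-precision integers
exact = 168 * 9600 * 81648

# Two's-complement wraparound of the true product into the signed 32-bit range
expected = (exact + 2**31) % 2**32 - 2**31

print(int(wrapped))  # -1462091776
print(exact)         # 131681894400
print(expected)      # -1462091776
```

The wrapped value matches the number reported on the Windows machine, so the negative result is just the low 32 bits of the true product read as a signed integer.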
On 64-bit Linux, a long is 8 bytes (x 8 bits/byte = 64 bits). Why? Here's an SO thread that discusses this in the comments.
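If you want to check what your own machine does rather than take it on faith, something like the following should work. It is only a sketch: it compares the size of the platform's C long (via the standard-library ctypes module) with the dtype numpy picks when you don't specify one. The two may not agree on every numpy version, so it is worth looking at both.

```python
import ctypes
import numpy as np

# Size of the platform's C long, in bits
long_bits = ctypes.sizeof(ctypes.c_long) * 8

# The dtype numpy chooses when none is given
default_dtype = np.array([1, 2, 3]).dtype

print(f"C long: {long_bits} bits")
print(f"numpy default integer dtype: {default_dtype}")
print(f"largest representable value: {np.iinfo(default_dtype).max}")
```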
is this a bug I should report?
No, although it's pretty annoying, I'll admit.
For what it's worth, it's usually a good idea to be explicit about your data types, so next time:
c = np.array(c, dtype='int64')
# or
c = np.array(c, dtype=np.int64)
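As a quick sanity check, here is the original computation with the dtype spelled out. This is a sketch using the same 3x3 matrix from the question; the variable names result and int64_max are just mine.

```python
import numpy as np

# Same matrix as in the question, but with an explicit 64-bit dtype
c = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]], dtype=np.int64)

result = (c * c.T).prod()

# The true product fits comfortably inside the signed 64-bit range
int64_max = np.iinfo(np.int64).max

print(int(result))     # 131681894400
print(int(int64_max))  # 9223372036854775807
```

With int64 the product is the same on Windows and Linux. Keep in mind that int64 can still overflow for larger inputs; it just has a lot more headroom.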
Again, this isn't a bug, but if it were, you'd open an issue on the numpy GitHub (where you can also peruse the source code). Somewhere in there is proof of how numpy uses the operating system's default C long, but I don't have it in me to go digging around to find it.