
Printing the minimum 16-bit float looks inconsistent?

Can someone explain why printing the float16 minimum produces the different results below? Is it by design or a bug?

    In [87]: x=np.finfo(np.float16).min
    
    In [88]: x_array_single=np.array([x])
    
    In [89]: x
    Out[89]: -65500.0
    
    In [90]: x_array_single
    Out[90]: array([-65504.], dtype=float16)
zell asked Jan 31 '26 00:01


2 Answers

EDIT:

Note that you would also get this issue if you printed the first value of the array:

>>> x_array_single[0]
-65500.0

The NumPy 1.14.0 release notes state:

Floating-point arrays and scalars use a new algorithm for decimal representations, giving the shortest unique representation. This will usually shorten float16 fractional output, and sometimes float32 and float128 output. float64 should be unaffected. See the new floatmode option to np.set_printoptions.

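That's why the outputs are different: -65500.0 is the shortest decimal string that still identifies this float16 value uniquely, while -65504 is the value actually stored.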
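As a rough illustration of the "shortest unique representation" idea, the sketch below formats the same float16 value twice with `np.format_float_positional`: once in unique mode and once with a fixed number of decimals. The variable names are only for illustration, and the exact strings may vary slightly between NumPy versions.

```python
import numpy as np

x = np.finfo(np.float16).min  # most negative finite float16

# unique=True: fewest digits that still identify this float16 value
shortest = np.format_float_positional(x, unique=True)

# unique=False: print the stored value with a fixed number of decimals
exact = np.format_float_positional(x, unique=False, precision=1)

print(shortest, exact)

# Both decimal strings convert back to the very same float16
assert np.float16(float(shortest)) == x
assert np.float16(float(exact)) == x
```

The two strings differ as decimal numbers, but both round-trip to the same 16-bit value, so neither is wrong.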

When you convert it to float32 or float64 (or just use the built-in float, which is 64 bits), you get the more precise output:

>>> float(x)
-65504.0

>>> np.float32('-65504')
-65504.0

and:

>>> float(x_array_single[0])
-65504.0

You can also see the change in the precision here:

>>> np.float16('65500') == np.float16('65504')
True

>>> np.float32('65500') == np.float32('65504')
False
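To make the comparison above more concrete, here is a small sketch that looks at the raw bit patterns instead of the printed values. It assumes nothing beyond NumPy itself; the names `bits16` and `bits32` are made up for this example.

```python
import numpy as np

values = [65500, 65504]

# Reinterpret the stored bytes as unsigned integers of the same width
bits16 = np.array(values, dtype=np.float16).view(np.uint16)
bits32 = np.array(values, dtype=np.float32).view(np.uint32)

print([hex(b) for b in bits16])  # the two float16 patterns are identical
print([hex(b) for b in bits32])  # the two float32 patterns differ
```

In float16 both decimal inputs collapse onto one bit pattern, whereas float32 has enough mantissa bits to keep them apart.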
nokla answered Feb 01 '26 14:02


The internal fp16 representation of -65500 is the bytes 255 and 251. See

import struct
struct.unpack('BB', struct.pack('e', -65500))
# (255, 251)

And so is the internal representation of -65504

import struct
struct.unpack('BB', struct.pack('e', -65504))
# (255, 251)

In binary, and taking into account that my machine is little endian (so it should be read 251 then 255), that is

1 11110 1111111111

Which is sign -, exponent 30-15=15, and then 1 (implicit) + 1/2 + 1/4 + ... + 1/1024 (ten bits) =

- (2**15) * (1.0 + sum(1/2**k for k in range(1,11)))

-65504 (just to state, floating point representation is an exact science ;-))
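The same decoding can be sketched as a small helper. `decode_fp16` is a hypothetical name introduced here, it forces little-endian packing so the result does not depend on the machine, and the value formula only covers normal numbers (exponent field between 1 and 30).

```python
import struct

def decode_fp16(value):
    """Round value to float16 and split it into its binary16 fields."""
    # '<e' packs a little-endian float16; '<H' reads the same 2 bytes as an int
    (bits,) = struct.unpack('<H', struct.pack('<e', value))
    sign = bits >> 15               # 1 bit
    exponent = (bits >> 10) & 0x1F  # 5 bits, biased by 15
    fraction = bits & 0x3FF         # 10 bits
    # Valid for normal numbers only: (-1)^sign * 2^(e-15) * (1 + f/1024)
    exact = (-1) ** sign * 2.0 ** (exponent - 15) * (1 + fraction / 1024)
    return sign, exponent, fraction, exact

print(decode_fp16(-65500))  # (1, 30, 1023, -65504.0)
print(decode_fp16(-65504))  # (1, 30, 1023, -65504.0)
```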

And the next possible number with this exponent is therefore

1 11110 1111111110

Whose value is

struct.unpack('e', b'\xfe\xfb')
# (-65472.0,)

The midpoint between -65504 and -65472 is -65488. And you can see that, indeed, numbers just below -65488 share the same fp16 representation as -65504, whereas -65488 itself (a tie, which rounds to the even mantissa) and everything above it do not.

struct.unpack('BB', struct.pack('e', -65488.01))
# (255, 251)
struct.unpack('BB', struct.pack('e', -65488))
# (254, 251)

Or to use nokla's method (whose answer appeared while I was typing this one)

np.float16(-65488.01)==np.float16(-65504)
# True
np.float16(-65488)==np.float16(-65504)
# False
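The midpoint argument can also be checked by stepping through the bit patterns directly. This is only a sketch: `fp16_from_bits` and `fp16_bits` are hypothetical helpers written for this example, built on the standard `struct` module with explicit little-endian formats.

```python
import struct

def fp16_from_bits(bits):
    """Interpret a 16-bit integer as a float16, returned as a Python float."""
    return struct.unpack('<e', struct.pack('<H', bits))[0]

def fp16_bits(value):
    """Round value to float16 and return its 16-bit pattern."""
    return struct.unpack('<H', struct.pack('<e', value))[0]

lowest = fp16_from_bits(0xFBFF)      # most negative finite float16
neighbour = fp16_from_bits(0xFBFE)   # one step toward zero
midpoint = (lowest + neighbour) / 2  # boundary between the two

print(lowest, neighbour, midpoint)

# Values on either side of the midpoint land on different bit patterns
for v in (-65500, -65488.01, midpoint, -65480):
    print(v, hex(fp16_bits(v)))
```

The first two test values map to 0xfbff and the last two to 0xfbfe, which matches the byte pairs shown earlier.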

As for your initial question (I realize both nokla and I answered "why is it not an error", or "why is it possible", but not really "why is it so"): well, I guess some displays (and only displays; from the value point of view, it is all the same thing) favor the "roundest" decimal representation when they can choose among many equivalent decimal representations, whereas others favor the most central value. -65500.0 is the roundest decimal value among all values represented by the bytes (255, 251), whereas -65504 is the exact stored value, which sits in the middle of the range of values that round to those bytes.

chrslg answered Feb 01 '26 13:02