Pandas DataFrame returns index with inaccurate decimals

Question

I have a Pandas Dataframe like this:

                0         1         2         3         4         5       \
    event_at                                                               
    0.00      1.000000  1.000000  1.000000  1.000000  1.000000  1.000000   
    0.01      0.975381  0.959061  0.979856  0.985625  0.986080  0.976601   
    0.02      0.959103  0.932374  0.966486  0.976037  0.976791  0.961114   
    0.03      0.946154  0.911362  0.955820  0.968362  0.969353  0.948785   
    0.04      0.935378  0.894024  0.946924  0.961940  0.963129  0.938518   
    0.05      0.926099  0.879201  0.939248  0.956385  0.957744  0.929672   
    0.06      0.917608  0.865726  0.932212  0.951282  0.952796  0.921574 
    ......
    0.96      0.072472  0.012264  0.117352  0.217737  0.228561  0.082670   
    0.97      0.066553  0.010632  0.109468  0.207225  0.217870  0.076244   
    0.98      0.060532  0.009069  0.101313  0.196119  0.206555  0.069677   
    0.99      0.054657  0.007642  0.093212  0.184828  0.195031  0.063237   
    1.00      0.019128  0.001314  0.039558  0.100442  0.108064  0.023328

I want to get all the index values:

>>> df.index
[0.0, 0.01, 0.02, 0.029999999999999999, 0.040000000000000001, 0.050000000000000003, 0.059999999999999998,
...
0.95999999999999996, 0.96999999999999997, 0.97999999999999998, 0.98999999999999999, 1.0]


What I expect is something like this:

    [0.0, 0.01, 0.02, 0.03, 0.04, 0.05, 0.06,
        ...
        0.96, 0.97, 0.98, 0.99, 1.0]

This floating-point problem causes the following exception:

>>> df.loc[0.35].values
Traceback (most recent call last):
  File "I:\Anaconda3\lib\site-packages\pandas\core\indexing.py", line 1395, in _has_valid_type
    error()
  File "I:\Anaconda3\lib\site-packages\pandas\core\indexing.py", line 1390, in error
    (key, self.obj._get_axis_name(axis)))
KeyError: 'the label [0.35] is not in the [index]'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "J:\Workspace\dataset_loader.py", line 171, in <module>
    print(y_pred_cox_alldep.loc[0.35].values)
  File "I:\Anaconda3\lib\site-packages\pandas\core\indexing.py", line 1296, in __getitem__
    return self._getitem_axis(key, axis=0)
  File "I:\Anaconda3\lib\site-packages\pandas\core\indexing.py", line 1466, in _getitem_axis
    self._has_valid_type(key, axis)
  File "I:\Anaconda3\lib\site-packages\pandas\core\indexing.py", line 1403, in _has_valid_type
    error()
  File "I:\Anaconda3\lib\site-packages\pandas\core\indexing.py", line 1390, in error
    (key, self.obj._get_axis_name(axis)))
KeyError: 'the label [0.35] is not in the [index]'

MaxU - stop WAR against UA · Accepted Answer

You can do it this way, comparing with a tolerance instead of testing for exact equality (assuming we want the row with index 0.96, which is internally represented as 0.95999999999):

In [466]: df.index
Out[466]: Float64Index([0.0, 0.01, 0.02, 0.03, 0.04, 0.05, 0.06, 0.95999999999, 0.97, 0.98, 0.99, 1.0], dtype='float64')

In [467]: df.ix[df.index[np.abs(df.index - 0.96) < 1e-6]]
Out[467]:
             0         1         2         3         4        5
0.96  0.072472  0.012264  0.117352  0.217737  0.228561  0.08267
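
The same tolerance-based lookup can be sketched for current pandas versions, where `.ix` is no longer available. This is a minimal sketch, not the answerer's code: the small frame, the column name `a`, and the helper `rows_near` are hypothetical and exist only for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical frame: the 0.96 label is stored slightly off, as in the answer above.
df = pd.DataFrame(
    {"a": [1.0, 0.072472, 0.066553]},
    index=[0.0, 0.95999999999, 0.97],
)

def rows_near(frame, label, atol=1e-6):
    """Return the rows whose float index label lies within atol of label."""
    # rtol=0.0 makes this a purely absolute comparison, like abs(index - label) < atol.
    mask = np.isclose(frame.index.to_numpy(), label, rtol=0.0, atol=atol)
    return frame.loc[mask]

match = rows_near(df, 0.96)
print(match)
```

A boolean mask returns every row inside the tolerance, so the result is a DataFrame that may hold zero, one, or several rows depending on `atol`.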

Alternatively, if you are free to change your index, you can round it. First, let's reproduce the problem:

In [430]: df.index = [0.0, 0.01, 0.02, 0.03, 0.04, 0.05, 0.06, 0.95999999999, 0.97, 0.98, 0.99, 1.0]

In [431]: df
Out[431]:
             0         1         2         3         4         5
0.00  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000
0.01  0.975381  0.959061  0.979856  0.985625  0.986080  0.976601
0.02  0.959103  0.932374  0.966486  0.976037  0.976791  0.961114
0.03  0.946154  0.911362  0.955820  0.968362  0.969353  0.948785
0.04  0.935378  0.894024  0.946924  0.961940  0.963129  0.938518
0.05  0.926099  0.879201  0.939248  0.956385  0.957744  0.929672
0.06  0.917608  0.865726  0.932212  0.951282  0.952796  0.921574
0.96  0.072472  0.012264  0.117352  0.217737  0.228561  0.082670
0.97  0.066553  0.010632  0.109468  0.207225  0.217870  0.076244
0.98  0.060532  0.009069  0.101313  0.196119  0.206555  0.069677
0.99  0.054657  0.007642  0.093212  0.184828  0.195031  0.063237
1.00  0.019128  0.001314  0.039558  0.100442  0.108064  0.023328

In [432]: df.index
Out[432]: Float64Index([0.0, 0.01, 0.02, 0.03, 0.04, 0.05, 0.06, 0.95999999999, 0.97, 0.98, 0.99, 1.0], dtype='float64')

In [433]: df.ix[.96]
... skipped ...
KeyError: 0.96

Now let's round the index to two decimal places:

In [434]: df.index = df.index.values.round(2)

In [435]: df.index
Out[435]: Float64Index([0.0, 0.01, 0.02, 0.03, 0.04, 0.05, 0.06, 0.96, 0.97, 0.98, 0.99, 1.0], dtype='float64')

In [436]: df.ix[.96]
Out[436]:
0    0.072472
1    0.012264
2    0.117352
3    0.217737
4    0.228561
5    0.082670
Name: 0.96, dtype: float64
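
For reference, the rounding approach can be written as a self-contained script that uses `.loc` for the lookup. This is only a sketch under assumptions: the three-row frame and the column name `a` are made up for illustration, and two decimals is assumed to be the intended precision of the labels.

```python
import pandas as pd

# Hypothetical frame whose 0.96 label carries a small floating-point error.
df = pd.DataFrame(
    {"a": [1.0, 0.072472, 0.066553]},
    index=[0.0, 0.95999999999, 0.97],
)

# Round the labels to the precision they are meant to have (two decimals here).
df.index = df.index.to_numpy().round(2)

# Exact label lookup now succeeds.
row = df.loc[0.96]
print(row)
```

Rounding is only safe when no two labels collapse to the same rounded value; checking `df.index.is_unique` afterwards is a cheap way to confirm that.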

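UPDATE: starting from Pandas 0.20.1 the .ix indexer is deprecated in favor of the stricter .iloc and .loc indexers, and it was removed entirely in Pandas 1.0. In the examples above, df.loc can be used in place of df.ix.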
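
As a possible alternative that is not part of the original answer, a nearest-label lookup with a tolerance can be sketched using `Index.get_indexer`. The frame, the column name `a`, and the helper `loc_nearest` are hypothetical; the sketch also assumes the index is unique and sorted, which `method="nearest"` requires.

```python
import pandas as pd

# Hypothetical frame with an inexact 0.96 label; the index is sorted and unique.
df = pd.DataFrame(
    {"a": [1.0, 0.072472, 0.066553]},
    index=[0.0, 0.95999999999, 0.97],
)

def loc_nearest(frame, label, tolerance=1e-6):
    """Return the row whose index label is nearest to label, within tolerance."""
    pos = frame.index.get_indexer([label], method="nearest", tolerance=tolerance)[0]
    if pos == -1:
        # get_indexer reports -1 when nothing lies within the tolerance.
        raise KeyError(label)
    return frame.iloc[pos]

row = loc_nearest(df, 0.96)
print(row)
```

Unlike rounding, this leaves the stored index untouched, which may matter if other code depends on the original labels.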

Tags: python, pandas, dataframe, numpy

Munichong
