
Cannot cast array data from dtype('O') to dtype('float64') according to the rule 'safe'

When running my Tukey test, it gives me this error:

Cannot cast array data from dtype('O') to dtype('float64') according to the rule 'safe'

My Dataframe Head Output:

    Group    Score
3   A        1.91
4   B        1.7
5   C        1.69
6   D        1.68
7   E        1.49

My Tukey Test Code:

from statsmodels.stats.multicomp import pairwise_tukeyhsd
from statsmodels.stats.multicomp import MultiComparison

mc = MultiComparison(df['Score'], df['Group'])
result = mc.tukeyhsd()

print(result)
print(mc.groupsunique)


> TypeError                                 Traceback (most recent call last)
> <ipython-input-10-705a07612b72> in <module>()
>       1 mc = MultiComparison(df['Score'], df['Group'])
> ----> 2 result = mc.tukeyhsd()
>       3 
>       4 print(result)
>       5 print(mc.groupsunique)
> 
> /usr/local/lib/python3.6/dist-packages/statsmodels/sandbox/stats/multicomp.py
> in tukeyhsd(self, alpha)
>     964         self.groupstats = GroupsStats(
>     965                             np.column_stack([self.data, self.groupintlab]),
> --> 966                             useranks=False)
>     967 
>     968         gmeans = self.groupstats.groupmean
> 
> /usr/local/lib/python3.6/dist-packages/statsmodels/sandbox/stats/multicomp.py
> in __init__(self, x, useranks, uni, intlab)
>     535 
>     536         #temporary until separated and made all lazy
> --> 537         self.runbasic(useranks=useranks)
>     538 
>     539 
> 
> /usr/local/lib/python3.6/dist-packages/statsmodels/sandbox/stats/multicomp.py
> in runbasic(self, useranks)
>     569         else:
>     570             self.xx = x[:,0]
> --> 571         self.groupsum = groupranksum = np.bincount(self.intlab, weights=self.xx)
>     572         #print('groupranksum', groupranksum, groupranksum.shape, self.groupnobs.shape
>     573         # start at 1 for stats.rankdata :
> 
> TypeError: Cannot cast array data from dtype('O') to dtype('float64') according to the rule 'safe'

Does anyone know what this means?

ee8291 asked Sep 03 '25 15:09


1 Answer

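dtype('O') is NumPy's object dtype, so the Score column is most likely stored as generic Python objects (often strings) rather than floats, and np.bincount refuses to cast those to float64. Try replacing the line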

mc = MultiComparison(df['Score'], df['Group'])

with

mc = MultiComparison(df['Score'].astype('float'), df['Group'])

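If the astype call itself raises an error, at least one row in Score probably holds a value that cannot be parsed as a number. You can work around this with the following instead, which converts any unparseable value to NaN: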

mc = MultiComparison(pd.to_numeric(df['Score'], errors='coerce'), df['Group'])
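To see which rows are causing the trouble before passing the data to MultiComparison, something along these lines may help. This is only a sketch: the DataFrame below is made-up sample data (the "n/a" entry is an assumption about what a bad value might look like), and the names bad_rows and clean are hypothetical.

```python
import pandas as pd

# Hypothetical sample data: one Score entry is a string, so the column
# ends up with object dtype instead of float64.
df = pd.DataFrame({
    "Group": ["A", "B", "C", "D", "E"],
    "Score": [1.91, 1.7, "n/a", 1.68, 1.49],
})

# Convert to numbers; anything that cannot be parsed becomes NaN.
scores = pd.to_numeric(df["Score"], errors="coerce")

# Rows that were not missing originally but failed the conversion.
bad_rows = df[scores.isna() & df["Score"].notna()]
print(bad_rows)

# Keep only rows with a usable numeric score.
clean = df.assign(Score=scores).dropna(subset=["Score"])
print(clean["Score"].dtype)
```

Whether dropping the NaN rows is the right choice depends on your data. If you do drop them, pass clean['Score'] and clean['Group'] to MultiComparison so both columns stay aligned.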
PMende answered Sep 05 '25 14:09