Getting 'pred_table' information for predicted values of a model in 'statsmodels'

Question

In the Python package statsmodels, LogitResults.pred_table can be conveniently used to get a "confusion matrix", for an arbitrary threshold t, for a Logit model of the form

mod_fit = sm.Logit.from_formula('Y ~ a + b + c', train).fit() 
...
mod_fit.pred_table(t) 
#Conceptually: pred_table(t, predicted=mod_fit.predict(train), observed=train.Y)

Is there a way to get the equivalent information for test data? For example, if I compute

pred = mod_fit.predict(test)

how do I get the equivalent of

mod_fit.pred_table(t, predicted=pred, observed=test.Y)

Is there a way to get statsmodels to do this (e.g. a way to construct a LogitResults instance from pred and train.Y), or does it need to be done "by hand", and if so, how?

jseabold · Accepted Answer

That's a good idea and easy to add. Can you post a GitHub issue about it? In the meantime, you can do this with the following code:

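import numpy as np
threshold = 0.5  # the cutoff t from the question; 0.5 is only a placeholder value
# Classify each test observation as 0.0 or 1.0 by thresholding its predicted probability
pred = np.array(mod_fit.predict(test) > threshold, dtype=float)
# table[i, j] counts the cases where class i was observed and class j was predicted
table = np.histogram2d(test.Y, pred, bins=2)[0]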
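The snippet above can be wrapped in a small function. The following is only a sketch, not part of statsmodels: the name confusion_table and the sample values are made up for illustration. It assumes the observed outcomes are coded 0/1 and passes explicit bin edges to np.histogram2d, since with bins=2 the edges are derived from the data and may not line up with classes 0 and 1 when only one class occurs in the observations or the predictions.

```python
import numpy as np

def confusion_table(observed, predicted_prob, threshold=0.5):
    """Hypothetical helper (not a statsmodels function).

    Returns a 2x2 array where table[i, j] counts the cases in which
    class i was observed and class j was predicted.
    """
    observed = np.asarray(observed, dtype=float)
    # Turn predicted probabilities into 0.0/1.0 class labels
    predicted = (np.asarray(predicted_prob, dtype=float) > threshold).astype(float)
    # Fixed edges: values of 0 fall in the first bin, values of 1 in the second
    edges = [0.0, 0.5, 1.0]
    table, _, _ = np.histogram2d(observed, predicted, bins=[edges, edges])
    return table

# Made-up observations and predicted probabilities, for illustration only
y_test = [0, 0, 1, 1, 1]
p_test = [0.2, 0.7, 0.4, 0.9, 0.6]
table = confusion_table(y_test, p_test, threshold=0.5)
print(table)
```

With the fitted model from the question, the call would presumably look like confusion_table(test.Y, mod_fit.predict(test), t).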
