Logistic regression with statsmodels vs scikit-learn: large difference in predictions

I used the Python libraries statsmodels and scikit-learn to fit a logistic regression and make predictions. The predicted class probabilities differ quite substantially. I am aware that the solution is computed numerically; however, I would have expected the results to differ only slightly. I assumed that both libraries use the logistic function by default. Is that correct, or do I need to set any options?

This is my scikit-learn code:

import numpy as np
from sklearn.linear_model import LogisticRegression
x = np.array([1,2,3,4,5]).reshape((-1, 1))
y = np.array([0,0,1,1,1])
model = LogisticRegression()
model.fit(x, y)
model.predict_proba(np.array([2.5, 7]).reshape(-1,1))
Out:  array([[0.47910045, 0.52089955],
       [0.00820326, 0.99179674]])

I.e. the predictions for class 1 are 0.521 and 0.992.

If I use statsmodels instead, I get 0.730 and 0.942:

import statsmodels.api as sm
x = [1, 2, 3, 4, 5]
y = [0,0,1,1,1]
model = sm.Logit(y, x)
results = model.fit()
results.summary()
results.predict([2.5, 7])
Out: array([0.73000205, 0.94185834])

(As a sidenote: if I use R instead of Python, the predictions are 0.480 and 1.000, i.e. they are, again, quite different.)

I suspect these differences are not numerical but have an analytical, mathematical cause, e.g. the libraries use different functions. Can someone help?

Thanks!

MrOne2 asked Dec 30 '25 03:12



1 Answer

I have now found the solution. There were two reasons:

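(1) scikit-learn applies regularisation by default (an L2 penalty with C=1.0), which has to be turned off. This is done by changing line 5 of the scikit-learn code (model = LogisticRegression()) to:

model = LogisticRegression(penalty='none')

(From scikit-learn 1.2 on, the string 'none' is deprecated; pass penalty=None instead.)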

(2) The one that Yati Raj mentioned - thanks for the hint! Statsmodels does not fit an intercept automatically. This can be changed by adding the line

x = sm.add_constant(x)

in the statsmodels code.
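To see why both points matter, here is a minimal NumPy-only sketch that fits the same model three ways. It is not how either library's solver works internally; fit_logit and predict are hypothetical helper names, and the six-point dataset is made up. The dataset is deliberately not perfectly separable: the five points in the question are (every x up to 2 is class 0, every x from 3 on is class 1), and for perfectly separable data the unregularised fit with an intercept has no finite solution. The value l2=1.0 is meant to loosely mirror scikit-learn's default C=1.0, on the assumption that the penalty strength is 1/C.

```python
import numpy as np

def design(x, intercept):
    """Build the design matrix, optionally with a leading column of ones."""
    x = np.asarray(x, dtype=float)
    return np.column_stack([np.ones_like(x), x]) if intercept else x[:, None]

def fit_logit(x, y, intercept, l2=0.0, n_iter=50):
    """Fit a one-feature logistic regression by Newton's method.

    Minimises  NLL(beta) + (l2 / 2) * slope**2.
    The intercept, if present, is not penalised.
    """
    X = design(x, intercept)
    y = np.asarray(y, dtype=float)
    pen = np.ones(X.shape[1])      # 1 = penalised coefficient
    if intercept:
        pen[0] = 0.0               # leave the intercept unpenalised
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        grad = X.T @ (p - y) + l2 * pen * beta
        hess = (X * (p * (1.0 - p))[:, None]).T @ X + l2 * np.diag(pen)
        beta = beta - np.linalg.solve(hess, grad)
    return beta

def predict(beta, x_new, intercept):
    """Predicted probability of class 1 under the logistic function."""
    return 1.0 / (1.0 + np.exp(-design(x_new, intercept) @ beta))

# Hypothetical, non-separable data (x=3 is class 1 but x=4 is class 0).
x = [1, 2, 3, 4, 5, 6]
y = [0, 0, 1, 0, 1, 1]
x_new = [2.5, 7]

b_noint = fit_logit(x, y, intercept=False)         # no intercept, no penalty
b_reg = fit_logit(x, y, intercept=True, l2=1.0)    # intercept + L2 penalty
b_mle = fit_logit(x, y, intercept=True)            # intercept, no penalty

p_noint = predict(b_noint, x_new, intercept=False)
p_reg = predict(b_reg, x_new, intercept=True)
p_mle = predict(b_mle, x_new, intercept=True)

print("no intercept      :", p_noint)
print("intercept + L2    :", p_reg)
print("intercept, no pen.:", p_mle)
```

All three fits use the logistic function, yet they give different probabilities because they estimate different models. Note that the intercept column has to be added to the prediction inputs as well as to the training data; the same applies in statsmodels, where the values passed to results.predict need the constant column too.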

MrOne2 answered Jan 01 '26 19:01
