It seems that the LogisticRegression implemented in scikit-learn cannot learn the simple boolean functions AND or OR. I would understand XOR giving bad results, but AND and OR should be fine. Am I doing something wrong?
from sklearn.linear_model import LogisticRegression, LinearRegression
import numpy as np

# Truth-table targets for the four input rows below
bool_and = np.array([0., 0., 0., 1.])
bool_or = np.array([0., 1., 1., 1.])
bool_xor = np.array([0., 1., 1., 0.])

# All four combinations of two boolean inputs
x = np.array([[0., 0.],
              [0., 1.],
              [1., 0.],
              [1., 1.]])
y = bool_and

# Default settings, i.e. C=1.0
logit = LogisticRegression()
logit.fit(x, y)
#linear = LinearRegression()
#linear.fit(x, y)
print("expected: ", y)
print("predicted:", logit.predict(x))
#print(linear.predict(x))
This gives the following output:
expected: [0 0 0 1]
predicted: [0 0 0 0]
The problem seems to be related to regularization. Weakening it by setting a larger C makes the classifier work:
logit = LogisticRegression(C=100)
Unfortunately, the documentation is a bit sparse, so I am not sure what the range of the C parameter is.
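For what it's worth, the scikit-learn documentation describes C as the inverse of regularization strength: it must be a positive float, and smaller values mean stronger regularization. Below is a minimal sketch of how one might probe its effect on the AND data. The helper name predict_with_c and the particular C values are only illustrative choices, and the exact predictions may depend on the scikit-learn version and solver.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Same inputs and AND targets as in the question
x = np.array([[0., 0.],
              [0., 1.],
              [1., 0.],
              [1., 1.]])
bool_and = np.array([0., 0., 0., 1.])


def predict_with_c(c):
    """Fit on the AND truth table with the given C and return the predictions.

    Hypothetical helper for illustration only; not part of scikit-learn.
    """
    model = LogisticRegression(C=c)
    model.fit(x, bool_and)
    return model.predict(x)


# Small C = strong regularization, large C = weak regularization
results = {c: predict_with_c(c) for c in (0.01, 1.0, 100.0)}
for c, pred in results.items():
    print("C=%g predicted: %s" % (c, pred))
```

With only four training points, the penalty term can outweigh the data term at small C, which would explain why the model falls back to predicting the majority class until C is increased.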