I am trying to learn how to use scikit-learn's MLPClassifier. For a very simple example, I thought I'd try just to get it to learn how to compute the XOR function, since I have done that one by hand as an exercise before.
However, it just spits out zeros after I try to fit the model.
import numpy as np
import sklearn.neural_network

xs = np.array([
    0, 0,
    0, 1,
    1, 0,
    1, 1
]).reshape(4, 2)
ys = np.array([0, 1, 1, 0])
model = sklearn.neural_network.MLPClassifier(
    activation='logistic', max_iter=10000, hidden_layer_sizes=(4, 2))
model.fit(xs, ys)
print('score:', model.score(xs, ys))        # outputs 0.5
print('predictions:', model.predict(xs))    # outputs [0 0 0 0]
print('expected:', np.array([0, 1, 1, 0]))
I also put my code in a Jupyter notebook on GitHub: https://gist.github.com/zrbecker/6173ac01ed30be4eea9cc96e21f4896f
Why can't scikit-learn find a solution when I can show explicitly that one exists? Is the cost function getting stuck in a local minimum? Is there some kind of regularization forcing the parameters to stay close to 0? The weights in my hand-worked solution were reasonably large (roughly -30 to 30).
It appears the logistic activation is the root cause here. Change your activation to either tanh or relu (my favourite). Demo:
model = sklearn.neural_network.MLPClassifier(
    activation='relu', max_iter=10000, hidden_layer_sizes=(4, 2))
model.fit(xs, ys)
Outputs for this model:
score: 1.0
predictions: [0 1 1 0]
expected: [0 1 1 0]
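As for the local-minimum versus regularization question: a quick check is to refit the logistic network under several different random initializations. Below is a minimal sketch of that check, assuming the same xs/ys as above; the seed range is an arbitrary illustrative choice. If some seeds reach a score of 1.0 while others stay at 0.5, the failure is an initialization/local-minimum effect, not regularization pinning the weights near 0.

import numpy as np
import sklearn.neural_network

xs = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])  # XOR inputs
ys = np.array([0, 1, 1, 0])                      # XOR labels

for seed in range(10):  # seed range chosen arbitrarily for illustration
    model = sklearn.neural_network.MLPClassifier(
        activation='logistic', max_iter=10000,
        hidden_layer_sizes=(4, 2), random_state=seed)
    model.fit(xs, ys)
    print('seed', seed, 'score', model.score(xs, ys))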
It's always a good idea to experiment with different network configurations before you settle on the best one or give up altogether.
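A small sketch of that kind of experiment, looping over a few activations and hidden-layer shapes and comparing training scores; the particular shapes tried here are just illustrative choices, and random_state=0 is fixed only so the runs are reproducible:

import numpy as np
from sklearn.neural_network import MLPClassifier

xs = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])  # XOR inputs
ys = np.array([0, 1, 1, 0])                      # XOR labels

for activation in ('logistic', 'tanh', 'relu'):
    for hidden in ((2,), (4, 2), (8, 4)):        # illustrative layer shapes
        model = MLPClassifier(activation=activation, max_iter=10000,
                              hidden_layer_sizes=hidden, random_state=0)
        model.fit(xs, ys)
        print(activation, hidden, model.score(xs, ys))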