
Training a neural network to compute 'XOR' in scikit-learn

I am trying to learn how to use scikit-learn's MLPClassifier. For a very simple example, I thought I'd try just to get it to learn how to compute the XOR function, since I have done that one by hand as an exercise before.

However, it just spits out zeros after I try to fit the model.

import numpy as np
import sklearn.neural_network

# XOR inputs: one row per (a, b) pair
xs = np.array([
    0, 0,
    0, 1,
    1, 0,
    1, 1
]).reshape(4, 2)

ys = np.array([0, 1, 1, 0]).reshape(4,)

model = sklearn.neural_network.MLPClassifier(
    activation='logistic', max_iter=10000, hidden_layer_sizes=(4,2))
model.fit(xs, ys)

print('score:', model.score(xs, ys)) # outputs 0.5
print('predictions:', model.predict(xs)) # outputs [0, 0, 0, 0]
print('expected:', np.array([0, 1, 1, 0]))

I also put my code in a Jupyter notebook on GitHub: https://gist.github.com/zrbecker/6173ac01ed30be4eea9cc96e21f4896f

Why can't scikit-learn come to a solution, when I can show explicitly that one exists? Is the cost function getting stuck in a local minimum? Is there some kind of regularization on the parameters that forces them to stay close to 0? The parameters I used were reasonably large (i.e. -30 to 30).

zrbecker asked Sep 06 '25 17:09



1 Answer

It appears a logistic activation is the root cause here.

Change your activation to either tanh or relu (my favourite). Demo:

model = sklearn.neural_network.MLPClassifier(
    activation='relu', max_iter=10000, hidden_layer_sizes=(4,2))
model.fit(xs, ys)

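Re-running the print statements from the question with this model gives the output below (results can vary between runs, since the weights are initialized randomly unless random_state is set):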

score: 1.0
predictions: [0 1 1 0]
expected: [0 1 1 0]

It's always a good idea to experiment with different network configurations before you settle on the best one or give up altogether.
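As a rough sketch of what that experimentation could look like, the snippet below fits the same (4, 2) network with several activations and reports the mean training accuracy over a few random seeds. The helper name xor_scores, the choice of seeds, and the idea of also varying the solver parameter are assumptions added for illustration; they are not part of the original question or answer.

```python
import warnings

import numpy as np
from sklearn.exceptions import ConvergenceWarning
from sklearn.neural_network import MLPClassifier

# XOR truth table, same data as in the question.
xs = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
ys = np.array([0, 1, 1, 0])


def xor_scores(activations=("logistic", "tanh", "relu"),
               solvers=("adam", "lbfgs"),
               seeds=range(3)):
    """Return {(activation, solver): mean training accuracy over seeds}.

    Hypothetical helper: the name and default arguments are illustrative.
    """
    results = {}
    for activation in activations:
        for solver in solvers:
            scores = []
            for seed in seeds:
                model = MLPClassifier(
                    activation=activation,
                    solver=solver,
                    hidden_layer_sizes=(4, 2),
                    max_iter=10000,
                    random_state=seed,  # fixed seed so each run is repeatable
                )
                # Some configurations may stop without converging;
                # silence that warning so the summary stays readable.
                with warnings.catch_warnings():
                    warnings.simplefilter("ignore", category=ConvergenceWarning)
                    model.fit(xs, ys)
                scores.append(model.score(xs, ys))
            results[(activation, solver)] = float(np.mean(scores))
    return results


if __name__ == "__main__":
    for (activation, solver), mean_score in xor_scores().items():
        print(f"{activation:>8} / {solver:<5} mean accuracy: {mean_score:.2f}")
```

Averaging over several seeds helps separate a configuration that works reliably from one that only happened to succeed on a single random initialization.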

cs95 answered Sep 08 '25 08:09
