
simpleRNN input/output shape

I have defined a SimpleRNN in Keras with the following code:

# define RNN architecture
from keras.layers import Input
from keras.models import Model
from keras.layers import SimpleRNN
from keras.models import Sequential

model = Sequential()
model.add(SimpleRNN(units = 10,
                    return_sequences=False, 
                    unroll=True,
                    input_shape=(6, 2)))

model.compile(loss='mse',
              optimizer='rmsprop',
              metrics=['accuracy'])
model.summary()

I then feed it input data of shape (batch_size, 6, 2), i.e. 6 timesteps with two features each. I therefore expect 6 SimpleRNN cells.

When I launch training, I get the following error message:

Error when checking target: expected simple_rnn_2 to have shape (10,) but got array with shape (1,)

and I don't understand why.

My understanding is that each RNN cell takes two inputs: the output of the previous cell (unless it is the first cell) and the input for the new timestep.

So in this case, I expect the first RNN cell to feed the second a vector of shape (10,), since units = 10. Why does it get a vector of shape (1,)?

Strangely, adding a Dense layer to the model solves the issue. The following architecture:

# define RNN architecture
from keras.layers import Input
from keras.models import Model
from keras.layers import SimpleRNN, Dense
from keras.models import Sequential

model = Sequential()
model.add(SimpleRNN(units = 10,
                    return_sequences=False, 
                    unroll=False,
                    input_shape=(6, 2)))
model.add(Dense(1, activation='relu'))
model.compile(loss='mse',
              optimizer='rmsprop',
              metrics=['accuracy'])
model.summary()

does not throw an error. Any idea why?

FenryrMKIII asked Sep 05 '25 02:09



1 Answer

Assuming you are actually training the model (you did not include that code), the problem is that you are feeding it target outputs of shape (1,) while the SimpleRNN layer, which is the last layer of the model, produces output of shape (10,). The error concerns the training targets, not what one timestep passes to the next. You can look up the docs here: https://keras.io/layers/recurrent/

The docs state that the output dimensionality of SimpleRNN equals units, which is 10 here. Each unit produces one output value, and with return_sequences=False only the output of the last timestep is returned.
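
The shape rule can be sketched in plain Python. The helper below is hypothetical (it is not a Keras function); it only encodes the documented rule for illustration, with None standing for an unspecified batch size:

```python
def simple_rnn_output_shape(input_shape, units, return_sequences=False):
    """Output shape of a SimpleRNN layer under the documented Keras rule.

    input_shape is (batch_size, timesteps, features); batch_size may be None.
    This is an illustrative helper, not part of Keras.
    """
    batch_size, timesteps, _features = input_shape
    if return_sequences:
        # One hidden-state vector of size `units` per timestep.
        return (batch_size, timesteps, units)
    # Only the hidden state after the last timestep is returned.
    return (batch_size, units)


# The layer from the question: 6 timesteps, 2 features, units=10.
last_only = simple_rnn_output_shape((None, 6, 2), units=10)
full_sequence = simple_rnn_output_shape((None, 6, 2), units=10, return_sequences=True)
print(last_only)      # (None, 10)
print(full_sequence)  # (None, 6, 10)
```

With return_sequences=False, each sample therefore yields a vector of shape (10,), and that is what the loss compares against each target.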

The second sample works because the added Dense layer reduces the output size to (1,). The model's output now matches the shape of your training targets, so the loss can be computed and backpropagated through the network.
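
To make the shapes concrete, here is a minimal plain-Python sketch of the recurrence, not Keras's actual implementation. It assumes the default tanh activation, random weights, and a linear projection standing in for the Dense layer (the question uses relu); all function and variable names are illustrative. Note that a single set of weights is reused at all 6 timesteps, rather than there being 6 separate cells.

```python
import math
import random


def simple_rnn_last_state(sequence, W, U, b):
    """Run h_t = tanh(W x_t + U h_{t-1} + b) over a sequence; return the last h_t."""
    units = len(b)
    h = [0.0] * units  # initial hidden state
    for x in sequence:  # the same weights are reused at every timestep
        h = [
            math.tanh(
                sum(W[i][j] * x[j] for j in range(len(x)))
                + sum(U[i][k] * h[k] for k in range(units))
                + b[i]
            )
            for i in range(units)
        ]
    return h


def dense(h, V, c):
    """Linear projection of the hidden state to len(c) outputs."""
    return [sum(V[i][k] * h[k] for k in range(len(h))) + c[i] for i in range(len(c))]


random.seed(0)
timesteps, features, units = 6, 2, 10
W = [[random.uniform(-1, 1) for _ in range(features)] for _ in range(units)]
U = [[random.uniform(-1, 1) for _ in range(units)] for _ in range(units)]
b = [0.0] * units
V = [[random.uniform(-1, 1) for _ in range(units)]]  # a single output row
c = [0.0]

# One sample: 6 timesteps with 2 features each.
sample = [[random.uniform(-1, 1) for _ in range(features)] for _ in range(timesteps)]
h_last = simple_rnn_last_state(sample, W, U, b)
y = dense(h_last, V, c)
print(len(h_last), len(y))  # prints: 10 1
```

Without the projection, the model's output per sample has 10 values, which cannot be compared with a target of shape (1,); with it, the output has 1 value and the shapes agree.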

gerwin answered Sep 07 '25 23:09
