I have defined a SimpleRNN in Keras with the following code:
# define RNN architecture
from keras.layers import SimpleRNN
from keras.models import Sequential

model = Sequential()
model.add(SimpleRNN(units=10,
                    return_sequences=False,
                    unroll=True,
                    input_shape=(6, 2)))
model.compile(loss='mse',
              optimizer='rmsprop',
              metrics=['accuracy'])
model.summary()
I then feed it input data of shape (batch_size, 6, 2), i.e. 6 timesteps, each with two features. I therefore expect 6 SimpleRNN cells.
When launching the training, I get the following error message:
Error when checking target: expected simple_rnn_2 to have shape (10,) but got array with shape (1,)
and I don't understand why.
The point of an RNN (as I understand it) is that each cell receives the new timestep's input together with the output of the previous cell, unless it is the first cell.
So in this case I expect the second RNN cell to receive a vector of shape (10,) from the first cell, since units=10. How come it gets a (1,)-sized vector?
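For reference, a quick sanity check (a minimal sketch, assuming the same layer configuration as above, with unroll left at its default) does confirm that each timestep emits a 10-dimensional vector:

```python
# Sanity check (sketch): with return_sequences=True, the layer emits
# one 10-dimensional vector per timestep, i.e. the state passed
# between cells is of shape (10,).
from keras.models import Sequential
from keras.layers import Input, SimpleRNN

probe = Sequential([Input(shape=(6, 2)),
                    SimpleRNN(units=10, return_sequences=True)])
print(probe.output_shape)  # (None, 6, 10): 6 timesteps x 10 units
```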
What is strange is that as soon as I add a Dense layer to the model, the issue goes away. So the following architecture:
# define RNN architecture
from keras.layers import SimpleRNN, Dense
from keras.models import Sequential

model = Sequential()
model.add(SimpleRNN(units=10,
                    return_sequences=False,
                    unroll=False,
                    input_shape=(6, 2)))
model.add(Dense(1, activation='relu'))
model.compile(loss='mse',
              optimizer='rmsprop',
              metrics=['accuracy'])
model.summary()
does not throw an error. Any idea why?
Assuming you are actually training the model (you did not include that code), the problem is that you are feeding it target outputs of shape (1,), while the SimpleRNN produces output of shape (10,). You can look up the docs here: https://keras.io/layers/recurrent/
The docs state that the output dimensionality of the SimpleRNN is equal to units, which is 10 here. Each unit produces one output.
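To make this concrete (a minimal sketch, assuming the same layer configuration as in the question), the model's output shape is (batch, 10), so the targets you train against must have 10 values per sample:

```python
# The model ends at the SimpleRNN, so its output (and therefore the
# expected target shape) is (batch, 10): one value per unit.
from keras.models import Sequential
from keras.layers import Input, SimpleRNN

model = Sequential([Input(shape=(6, 2)),
                    SimpleRNN(units=10)])
print(model.output_shape)  # (None, 10): targets must match this
```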
The second sample works because the added Dense layer reduces the output size to (1,). Now the model can accept your training targets, and the loss is backpropagated through the whole network.
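Putting it together (a sketch with made-up random data, assuming your targets have one value per sample), the Dense(1) head makes the model output match a (batch_size, 1) target and training runs without the shape error:

```python
import numpy as np
from keras.models import Sequential
from keras.layers import Input, SimpleRNN, Dense

model = Sequential([Input(shape=(6, 2)),
                    SimpleRNN(units=10),
                    Dense(1, activation='relu')])
model.compile(loss='mse', optimizer='rmsprop')

x = np.random.rand(32, 6, 2)  # 32 samples, 6 timesteps, 2 features
y = np.random.rand(32, 1)     # one target value per sample
model.fit(x, y, epochs=1, verbose=0)  # trains without a shape error
```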