I know there are a lot of questions on this topic, but I don't understand why, in my case, both options are possible. My input shape to the LSTM is (10, 24, 2) and my hidden_size is 8.
from keras.models import Sequential
from keras.layers import LSTM, Dropout, Dense, TimeDistributed

hidden_size = 8
model = Sequential()
model.add(LSTM(hidden_size, return_sequences=True, stateful=True,
               batch_input_shape=(10, 24, 2)))
model.add(Dropout(0.1))
Why is it possible to add either this line:
model.add(TimeDistributed(Dense(2))) # Option 1
or this one:
model.add(Dense(2)) # Option 2
Shouldn't Option 2 lead to an error, since Dense expects a two-dimensional input?
From the Keras documentation for TimeDistributed(layer, **kwargs): "This wrapper allows to apply a layer to every temporal slice of an input. Every input should be at least 3D, and the dimension of index one of the first input will be considered to be the temporal dimension."
The TimeDistributed layer is very useful for working with time-series data or video frames. It applies the same layer to every temporal slice of the input: instead of defining several per-timestep "models", one "model" is reused on each slice. A GRU or LSTM can then handle the "time" dimension of the data.
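For example, here is a minimal sketch (assuming the standalone keras package; with TensorFlow 2 the same code works via tensorflow.keras) in which one shared Dense(2) is applied independently to each of the 24 timesteps:

import numpy as np
from keras.models import Sequential
from keras.layers import TimeDistributed, Dense

model = Sequential()
# The same Dense(2) is applied to every temporal slice of the 3D input.
model.add(TimeDistributed(Dense(2), input_shape=(24, 8)))

x = np.random.random((10, 24, 8))  # (batch, timesteps, features)
print(model.predict(x).shape)      # (10, 24, 2): one 2-vector per timestep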
In your case, the two models you define are identical.
This is because you use return_sequences=True, so the LSTM outputs a 3D tensor and the Dense layer is applied to every timestep, just like TimeDistributed(Dense). If you switch to return_sequences=False, however, the two models are no longer interchangeable: the TimeDistributed(Dense) version raises an error, while the plain Dense version does not.
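Here is a short sketch of that difference (again assuming the standalone keras package), using the same shapes as in the question:

from keras.models import Sequential
from keras.layers import LSTM, Dense, TimeDistributed

# return_sequences=True: the LSTM outputs 3D, (10, 24, 8), and Dense(2)
# acts on the last axis of every timestep, so both variants give (10, 24, 2).
for head in (Dense(2), TimeDistributed(Dense(2))):
    m = Sequential()
    m.add(LSTM(8, return_sequences=True, stateful=True,
               batch_input_shape=(10, 24, 2)))
    m.add(head)
    print(m.output_shape)  # (10, 24, 2) for both

# return_sequences=False: the LSTM outputs 2D, (10, 8). Dense(2) still
# works and gives (10, 2), but adding TimeDistributed(Dense(2)) here would
# raise an error, because the wrapper requires an input of at least 3D.
m = Sequential()
m.add(LSTM(8, return_sequences=False, stateful=True,
           batch_input_shape=(10, 24, 2)))
m.add(Dense(2))
print(m.output_shape)  # (10, 2)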
A more thorough explanation of a similar situation is also provided here.