I am new to RNNs/LSTMs in Keras and need advice on whether/how to use them for my problem, which is many-to-many classification.
I have a number of time series: approximately 1500 "runs", each lasting for about 100-300 time steps and having multiple channels. I understand that I need to zero-pad my data to the maximum number of time steps, so my data looks like this:
[nb_samples, timesteps, input_dim]: [1500, 300, 10]
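For reference, a sketch of how I build that padded array (the random data is just a stand-in for my runs):

from keras.preprocessing.sequence import pad_sequences
import numpy as np

# runs: a list of ~1500 arrays, each of shape (n_timesteps_i, 10)
runs = [np.random.rand(np.random.randint(100, 301), 10) for _ in range(1500)]

# Zero-pad every run at the end, up to the maximum length of 300
X = pad_sequences(runs, maxlen=300, dtype='float32', padding='post', value=0.0)
print(X.shape)  # (1500, 300, 10)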
Since getting the label for a single time step is impossible without knowing the past, even for a human, I could do feature engineering and train a classical classification algorithm. However, I think LSTMs would be a good fit here. This answer tells me that for many-to-many classification in Keras, I need to set return_sequences to True. However, I do not quite understand how to proceed from here: do I use the returned sequence as input for another, normal layer? How do I connect this to my output layer?
Any help, hints or links to tutorials are greatly appreciated - I found a lot of stuff for many-to-one classification, but nothing good on many-to-many.
There can be many approaches to this; I am describing one that can be a good fit for your problem. If you want to stack two LSTM layers, then return_sequences=True on the first layer lets the next LSTM layer learn from the full output sequence, as shown in the following example.
from keras.layers import Dense, LSTM
from keras import Input, Model

seq_length = 15
input_dims = 10
output_dims = 8  # number of classes
n_hidden = 10

model1_inputs = Input(shape=(seq_length, input_dims))

# First LSTM returns the full sequence so the second LSTM can consume it
net1 = LSTM(n_hidden, return_sequences=True)(model1_inputs)
# Second LSTM returns only the last hidden state
net1 = LSTM(n_hidden, return_sequences=False)(net1)
# Softmax output to produce class probabilities (relu is not suitable here)
net1 = Dense(output_dims, activation='softmax')(net1)

model1 = Model(inputs=model1_inputs, outputs=net1, name='model1')
model1.compile(loss='categorical_crossentropy', optimizer='adam')
model1.summary()
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
input_1 (InputLayer)         (None, 15, 10)            0
_________________________________________________________________
lstm_1 (LSTM)                (None, 15, 10)            840
_________________________________________________________________
lstm_2 (LSTM)                (None, 10)                840
_________________________________________________________________
dense_3 (Dense)              (None, 8)                 88
_________________________________________________________________
Alternatively, with return_sequences=True you can Flatten the complete output sequence and feed it to a Dense layer whose input will be [batch, seq_len*lstm_output_dims]; see the sketch below. Note: these flattened features can be useful for a classification task, but mostly we use stacked LSTM layers and take only the last output (without the complete sequence) as features for the classification layer.
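For illustration, a minimal sketch of that Flatten variant (the layer sizes reuse the toy dimensions from the example above, and model2 is just a hypothetical name):

from keras.layers import Dense, Flatten, LSTM
from keras import Input, Model

seq_length = 15
input_dims = 10
output_dims = 8  # number of classes
n_hidden = 10

inputs = Input(shape=(seq_length, input_dims))
# Keep the output for every time step: shape (batch, seq_length, n_hidden)
x = LSTM(n_hidden, return_sequences=True)(inputs)
# Flatten to (batch, seq_length * n_hidden) so a Dense layer can consume it
x = Flatten()(x)
outputs = Dense(output_dims, activation='softmax')(x)

model2 = Model(inputs=inputs, outputs=outputs, name='model2')
model2.compile(loss='categorical_crossentropy', optimizer='adam')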
This answer may be helpful for understanding other approaches to LSTM architectures for different purposes.
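Since your question asks for many-to-many classification (one label per time step), here is a minimal sketch of that case; it assumes one-hot labels of shape [batch, timesteps, n_classes], and the Masking layer and all sizes are my assumptions for your zero-padded data:

from keras.layers import Dense, LSTM, Masking, TimeDistributed
from keras import Input, Model

seq_length = 300   # padded length of your runs
input_dims = 10
output_dims = 8    # number of classes (assumption)
n_hidden = 10

inputs = Input(shape=(seq_length, input_dims))
# Masking skips zero-padded time steps (assumes padding value 0.0)
x = Masking(mask_value=0.0)(inputs)
# return_sequences=True keeps one output per time step
x = LSTM(n_hidden, return_sequences=True)(x)
# TimeDistributed applies the same Dense classifier at every time step,
# giving outputs of shape (batch, seq_length, output_dims)
outputs = TimeDistributed(Dense(output_dims, activation='softmax'))(x)

model3 = Model(inputs=inputs, outputs=outputs, name='model3')
model3.compile(loss='categorical_crossentropy', optimizer='adam')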