I am trying to build a very large model in Keras with 3 LSTM layers of 4096 hidden units each. Previously each layer had 1024 hidden units and the build time was reasonable: each layer was added in about 1 to 2 seconds. Now that the model has 4096 hidden units per layer, adding each layer takes about 5 minutes. What I find strange is that the slowdown happens during the three calls to model.add(LSTM(...))
and not during model.compile(...)
. I need to use a larger network, but this wait time is a bit unbearable. It is not so bad for training, since that will take much longer anyway, but I don't want to sit through it every time I want to generate test output. Why does add take so much time? Isn't add just supposed to define the layer, with all the time spent in the compile call? And is there anything I can do about it?
from keras.models import Sequential
from keras.layers import LSTM, Dropout, Dense
from keras.optimizers import SGD

print('Building Model')
model = Sequential()
# stateful LSTMs need a fixed batch_input_shape; each step is one one-hot encoded byte
model.add(LSTM(lstm_size, batch_input_shape=(batch_size, 1, len(bytes_set)), stateful=True, return_sequences=True, consume_less=consume_less))
model.add(Dropout(0.5))
print('Added LSTM_1')
model.add(LSTM(lstm_size, stateful=True, return_sequences=True, consume_less=consume_less))
model.add(Dropout(0.5))
print('Added LSTM_2')
# last LSTM returns only its final output so it can feed the softmax layer
model.add(LSTM(lstm_size, stateful=True, return_sequences=False, consume_less=consume_less))
model.add(Dropout(0.5))
print('Added LSTM_3')
model.add(Dense(len(bytes_set), activation='softmax'))
print('Compiling Model')
model.compile(optimizer=SGD(lr=0.3, momentum=0.9, decay=1e-5, nesterov=True),
              loss='categorical_crossentropy',
              metrics=['accuracy'])
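To show where the time actually goes, here is a minimal timing sketch; lstm_size, batch_size and the vocabulary size below are placeholder values standing in for the real ones, and consume_less is left at its default:
import time

from keras.models import Sequential
from keras.layers import LSTM, Dense
from keras.optimizers import SGD

lstm_size = 4096   # placeholder for the real hidden size
batch_size = 64    # placeholder batch size
vocab = 128        # placeholder for len(bytes_set)

model = Sequential()

t0 = time.time()
model.add(LSTM(lstm_size, batch_input_shape=(batch_size, 1, vocab), stateful=True, return_sequences=True))
print('first add: %.1f s' % (time.time() - t0))

t0 = time.time()
model.add(LSTM(lstm_size, stateful=True, return_sequences=False))
print('second add: %.1f s' % (time.time() - t0))

t0 = time.time()
model.add(Dense(vocab, activation='softmax'))
model.compile(optimizer=SGD(lr=0.3, momentum=0.9, decay=1e-5, nesterov=True), loss='categorical_crossentropy')
print('Dense + compile: %.1f s' % (time.time() - t0))
With the 4096-unit layers, almost all of the time shows up in the add lines rather than in compile.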
Here is my .theanorc
[global]
floatX = float32
mode = FAST_RUN
device = gpu
exception_verbosity = high
[nvcc]
fastmath = 1
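To rule out the config simply not being picked up, a quick sanity check from Python (assuming a standard Theano install) is:
import theano

# should print float32, FAST_RUN and gpu to match the .theanorc above
print(theano.config.floatX)
print(theano.config.mode)
print(theano.config.device)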
Here is my model summary as requested. Unfortunately I have been running this new version for the past few hours, so I don't want to make any new changes. This model has 4 LSTM layers of size 1500 each.
Layer (type) Output Shape Param # Connected to
====================================================================================================
lstm_1 (LSTM) (64, 1, 1500) 9774000 lstm_input_1[0][0]
____________________________________________________________________________________________________
dropout_1 (Dropout) (64, 1, 1500) 0 lstm_1[0][0]
____________________________________________________________________________________________________
lstm_2 (LSTM) (64, 1, 1500) 18006000 dropout_1[0][0]
____________________________________________________________________________________________________
dropout_2 (Dropout) (64, 1, 1500) 0 lstm_2[0][0]
____________________________________________________________________________________________________
lstm_3 (LSTM) (64, 1, 1500) 18006000 dropout_2[0][0]
____________________________________________________________________________________________________
dropout_3 (Dropout) (64, 1, 1500) 0 lstm_3[0][0]
____________________________________________________________________________________________________
lstm_4 (LSTM) (64, 1500) 18006000 dropout_3[0][0]
____________________________________________________________________________________________________
dropout_4 (Dropout) (64, 1500) 0 lstm_4[0][0]
____________________________________________________________________________________________________
dense_1 (Dense) (64, 128) 192128 dropout_4[0][0]
====================================================================================================
Total params: 63984128
____________________________________________________________________________________________________
It's slow because you are trying to allocate matrices that need at least 0.5 GB of memory each. With 4096 units, a single weight matrix is already 4096 * 4097 values (weights plus bias), and an LSTM has additional inner weights associated with the input, output and forget gates, so every weight matrix effectively exists in four copies. As you can see, this sums up to a huge number, and Keras allocates and initializes all of these weights at the moment the layer is added, not at compile time, which is why the add calls are the slow part.
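As a rough back-of-the-envelope check, here is a sketch that estimates the weight count and float32 memory for your 3-layer stack; the input dimension of 128 is an assumption taken from your Dense output layer:
def lstm_params(input_dim, units):
    # Keras keeps 4 copies of (input weights + recurrent weights + bias):
    # one each for the input, forget and output gates plus the cell candidate.
    return 4 * (input_dim * units + units * units + units)

for units in (1024, 4096):
    first = lstm_params(128, units)    # first layer sees the one-hot input
    inner = lstm_params(units, units)  # later layers see the previous layer's output
    total = first + 2 * inner
    print('%d units: %d params, ~%.2f GB in float32' % (units, total, total * 4 / 1024.0 ** 3))
At 4096 units each inner layer alone is roughly half a gigabyte of weights, while the whole 1024-unit stack fits in well under 100 MB, which matches the difference in add times you are seeing.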
UPDATE: I wrote my answer from my mobile and typed TB instead of GB. You can easily check the size of your model by calling:
model.summary()
in both cases (1024 and 4096). Please share your results in a comment because I'm interested :)