
Fine tuning InceptionV3 in Keras

I am implementing a convolutional neural network using transfer learning in Keras, with the pre-trained InceptionV3 model from keras.applications, as shown below.

#Transfer learning with Inception V3
from keras import applications
from keras.layers import Dense, GlobalAveragePooling2D
from keras.models import Model

base_model = applications.InceptionV3(weights='imagenet', include_top=False, input_shape=(299, 299, 3))

## set model architecture
x = base_model.output
x = GlobalAveragePooling2D()(x)
x = Dense(1024, activation='relu')(x)
predictions = Dense(y_train.shape[1], activation='softmax')(x)
model = Model(inputs=base_model.input, outputs=predictions)

# freeze all layers of the pre-trained base model
for layer in base_model.layers:
    layer.trainable = False

model.compile(loss='categorical_crossentropy', optimizer='rmsprop', metrics=['accuracy'])

model.summary()

I was following a blog post that said the model must be trained for a few epochs after freezing the base model. I trained the model for 5 epochs, which gave me an accuracy of 0.47, and after that the accuracy did not improve much. So I stopped the training and unfroze some of the layers like this, keeping the first 172 layers (the early convolutional layers) frozen.

for layer in model.layers[:172]:
    layer.trainable = False
for layer in model.layers[172:]:
    layer.trainable = True

Then I recompiled the model with SGD at a lower learning rate.
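The recompile step looked roughly like this (the learning rate and momentum below are just example values, not the exact ones from my run):

from keras.optimizers import SGD

# recompile after changing the trainable flags so the changes take effect
model.compile(loss='categorical_crossentropy',
              optimizer=SGD(lr=1e-4, momentum=0.9),
              metrics=['accuracy'])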

Was my approach correct, i.e. stopping the training when the accuracy stopped improving while the layers were frozen? Should I have trained for longer?

How do I know the right time to stop training with the layers frozen?

1 Answer

IMHO, you don't have to train your randomly initialized layers until loss/accuracy stops improving.

When I used InceptionV3 for fine-tuning, I trained my additional Dense layer for just 2 epochs, even though training it for a few more epochs would most likely have led to better loss/accuracy. The number of epochs for the initial training depends on your problem and data. (For me, 2 epochs reached ~40% accuracy.)

I think it's a waste of time to train only the Dense layer for too long. Train it until it is considerably better than random initialization, then unfreeze more layers and train them for longer together with your Dense layer. As soon as your Dense layer gives reasonable predictions, it's fine to train the other layers, especially since InceptionV3 uses batch normalization, which stabilizes the variance of the gradients for the earlier layers.
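As a minimal sketch of that schedule, reusing the model and y_train from the question (x_train is not shown there, so its name is an assumption, and the epoch counts, learning rate and EarlyStopping patience are just illustrative choices, not exact values from my runs):

from keras.callbacks import EarlyStopping
from keras.optimizers import SGD

# Phase 1: only the new Dense layers are trainable (base model frozen);
# a couple of epochs is enough to get well past random initialization.
model.fit(x_train, y_train, batch_size=32, epochs=2, validation_split=0.1)

# Phase 2: unfreeze the upper part of the base model, recompile with a
# low-learning-rate SGD, and train the unfrozen layers together for longer.
for layer in model.layers[:172]:
    layer.trainable = False
for layer in model.layers[172:]:
    layer.trainable = True

model.compile(loss='categorical_crossentropy',
              optimizer=SGD(lr=1e-4, momentum=0.9),
              metrics=['accuracy'])

# Stop automatically once validation accuracy has not improved for a few epochs.
early_stop = EarlyStopping(monitor='val_acc', patience=3)
model.fit(x_train, y_train, batch_size=32, epochs=30,
          validation_split=0.1, callbacks=[early_stop])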
