I implemented a data generator to split my training data into mini-batches of 256 to avoid memory errors. It runs on the training data, but it does not show validation loss and validation accuracy at the end of each epoch, even though I also applied the data generator to the validation data and defined validation_steps. What exactly is wrong with the code such that it's not showing validation loss and accuracy? Here is the code:
import math
import numpy as np
import tensorflow as tf

early_stopping_cb = tf.keras.callbacks.EarlyStopping(patience=3, restore_best_weights=True)
batch_size = 256
epoch_steps = math.ceil(len(utt) / batch_size)
val_steps = math.ceil(len(val_prev) / batch_size)

hist = model.fit_generator(generate_data(utt_minus_one, utt, y_train, batch_size),
                           steps_per_epoch=epoch_steps, epochs=3,
                           callbacks=[early_stopping_cb],
                           validation_data=generate_data(val_prev, val_curr, y_val, batch_size),
                           validation_steps=val_steps, class_weight=custom_weight_dict,
                           verbose=1)
Here is the code for the generator:
# method to use a generator to split the data into mini-batches of 256, each loaded at run time
def generate_data(X1, X2, Y, batch_size):
    p_input = []
    c_input = []
    target = []
    batch_count = 0
    for i in range(len(X1)):
        p_input.append(X1[i])
        c_input.append(X2[i])
        target.append(Y[i])
        batch_count += 1
        if batch_count > batch_size:
            prev_X = np.array(p_input, dtype=np.int64)
            cur_X = np.array(c_input, dtype=np.int64)
            cur_y = np.array(target, dtype=np.int32)
            yield ([prev_X, cur_X], cur_y)
            p_input = []
            c_input = []
            target = []
            batch_count = 0
    return
Here is the trace for the first epoch, which also shows an error:
Epoch 1/3
346/348 [============================>.] - ETA: 4s - batch: 172.5000 - size: 257.0000 - loss: 0.8972 - accuracy: 0.8424WARNING:tensorflow:Your dataset iterator ran out of data; interrupting training. Make sure that your iterator can generate at least `steps_per_epoch * epochs` batches (in this case, 1044 batches). You may need to use the repeat() function when building your dataset.
WARNING:tensorflow:Early stopping conditioned on metric `val_loss` which is not available. Available metrics are: loss,accuracy
346/348 [============================>.] - 858s 2s/step - batch: 172.5000 - size: 257.0000 - loss: 0.8972 - accuracy: 0.8424
Can anyone help me sort out these issues?
The fix is to wrap the for loop that builds the mini-batches in an outer while True loop, so the generator keeps producing batches across epochs instead of being exhausted after a single pass. With 348 batches per epoch and 3 epochs, Keras needs 3 * 348 = 1044 batches overall:
# method to use a generator to split the data into mini-batches of 256, each loaded at run time
def generate_data(X1, X2, Y, batch_size):
    count = 0
    p_input = []
    c_input = []
    target = []
    batch_count = 0
    while True:
        for i in range(len(X1)):
            p_input.append(X1[i])
            c_input.append(X2[i])
            target.append(Y[i])
            batch_count += 1
            if batch_count > batch_size:
                count = count + 1
                prev_X = np.array(p_input, dtype=np.int64)
                cur_X = np.array(c_input, dtype=np.int64)
                cur_y = np.array(target, dtype=np.int32)
                yield ([prev_X, cur_X], cur_y)
                p_input = []
                c_input = []
                target = []
                batch_count = 0
        print(count)
    return
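As an aside, the condition if batch_count > batch_size only fires once batch_size + 1 items have accumulated, which is why both traces report size: 257.0000 rather than 256. If exact 256-sample batches are wanted, the comparison would presumably be:

if batch_count >= batch_size:  # yield as soon as exactly batch_size samples have been collected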
And here is the trace for the first epoch:
Epoch 1/3
335/347 [===========================>..] - ETA: 30s - batch: 167.0000 - size: 257.0000 - loss: 1.2734 - accuracy: 0.8105346
347/347 [==============================] - ETA: 0s - batch: 173.0000 - size: 257.0000 - loss: 1.2635 - accuracy: 0.8113WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/training_v1.py:2048: Model.state_updates (from tensorflow.python.keras.engine.training) is deprecated and will be removed in a future version.
Instructions for updating:
This property should not be used in TensorFlow 2.0, as updates are applied automatically.
86
347/347 [==============================] - 964s 3s/step - batch: 173.0000 - size: 257.0000 - loss: 1.2635 - accuracy: 0.8113 - val_loss: 0.5700 - val_accuracy: 0.8367
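For completeness, the first warning's suggestion to use repeat() points at tf.data, which handles batching and repetition natively and avoids the hand-rolled bookkeeping above (model.fit accepts datasets directly; fit_generator is deprecated in TF 2). A minimal sketch, assuming the arrays from the question (utt_minus_one, utt, y_train, val_prev, val_curr, y_val) are NumPy arrays and reusing the step counts defined earlier:

import tensorflow as tf

def make_dataset(X1, X2, Y, batch_size):
    # Pair the two model inputs with the labels, batch them, and repeat
    # indefinitely so Keras can draw steps_per_epoch * epochs batches.
    ds = tf.data.Dataset.from_tensor_slices(((X1, X2), Y))
    return ds.batch(batch_size).repeat()

train_ds = make_dataset(utt_minus_one, utt, y_train, batch_size)
val_ds = make_dataset(val_prev, val_curr, y_val, batch_size)

hist = model.fit(train_ds,
                 steps_per_epoch=epoch_steps, epochs=3,
                 callbacks=[early_stopping_cb],
                 validation_data=val_ds,
                 validation_steps=val_steps,
                 class_weight=custom_weight_dict,
                 verbose=1)

Because the datasets repeat forever, they can serve the full 1044 training batches Keras asks for, and the validation metrics appear each epoch without any generator running dry.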