Printing training progress with Keras using QSUB and a bash file

I'm able to run a Python script that trains a model using Keras/TensorFlow with the following bash script, submitted through qsub:

#!/bin/bash
#PBS -N Tarea_UNET
#PBS -l nodes=1:ppn=4:gpus=1
cd $PBS_O_WORKDIR
source $ANACONDA3/activate inictel_uni
python U-NET.py

Inside "U-NET.py" the training call looks like this:

history = model.fit(train_B, train_A, epochs=200, batch_size=20, validation_split=0.052631578, shuffle=True)

The problem is that I can't see the training progress while the job runs, so I can't monitor the metrics or the estimated training time; I have to wait until the whole process finishes. "qstat" only reports how long the job has been running, which doesn't help. Do you have any ideas?

Giorgio Luigi Morales Luna asked Jan 31 '26 23:01



1 Answer

One simple approach is to give Keras a callback to invoke at the right times (for example, at the end of each epoch or batch). Inside the callback you can do whatever logging or progress reporting you want.

Here is the high-level documentation and some pre-made callbacks: https://keras.io/callbacks/

Usage is simple: pass a list of callbacks to fit:

model.fit(x_train, y_train, ..., callbacks=[<your_callbacks>])

See examples at the end of the doc.

You can see all the methods that you can override here: https://github.com/keras-team/keras/blob/adc321b4d7a4e22f6bdb00b404dfe5e23d4887aa/keras/callbacks.py#L146
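As a rough illustration of the callback idea for a batch job, here is a minimal sketch of a custom callback that appends one line per epoch to a log file. The class name ProgressLogger, the file name training.log, and the time estimate (average epoch time multiplied by the epochs left) are my own assumptions, not part of Keras. The import assumes the tf.keras namespace; with standalone Keras you would import Callback from keras.callbacks instead. It also assumes the log file sits on a filesystem you can read from the login node while the job runs.

```python
import time

try:
    from tensorflow.keras.callbacks import Callback
except ImportError:
    # Fallback so the class can be read and exercised without TensorFlow installed.
    Callback = object


class ProgressLogger(Callback):
    """Hypothetical callback: appends one progress line per epoch to a file."""

    def __init__(self, path, total_epochs):
        super().__init__()
        self.path = path
        self.total_epochs = total_epochs
        self.start = None

    def on_train_begin(self, logs=None):
        # Remember when training started so elapsed time can be computed.
        self.start = time.time()

    def on_epoch_end(self, epoch, logs=None):
        logs = logs or {}
        if self.start is None:
            self.start = time.time()
        done = epoch + 1  # Keras passes a zero-based epoch index
        elapsed = time.time() - self.start
        # Crude estimate: average time per finished epoch times epochs left.
        eta = elapsed / done * (self.total_epochs - done)
        metrics = " ".join(f"{k}={float(v):.4f}" for k, v in logs.items())
        # Opening in append mode and closing right away pushes each line
        # to the file at the end of every epoch.
        with open(self.path, "a") as f:
            f.write(f"epoch {done}/{self.total_epochs} {metrics} eta={eta:.0f}s\n")
```

With the fit call from the question, this would be passed as callbacks=[ProgressLogger("training.log", 200)], and the file could then be followed from another shell with "tail -f training.log" while the job is running.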

iga answered Feb 03 '26 12:02
