I have trained a neural network using the TensorFlow backend in Keras (2.1.5) and I have also used the keras-contrib (2.0.8) library in order to add a CRF layer as an output for the network.
I would like to know how I can get the precision, recall and F1 score for each class after making predictions on a test set with the network.
Assume that you have a function get_model() that builds exactly the same model you trained, and a path weights_path pointing to the HDF5 file containing your model weights:
model = get_model()
model.load_weights(weights_path)
This should load your model properly. Then you just have to define an ImageDataGenerator for your test data and run the model on it to obtain predictions:
from keras.preprocessing.image import ImageDataGenerator
# Path to the folder containing your test data
testing_folder = ""
# Image size (set up the image size used for training)
img_size = 256
# Batch size (you should tune it based on your memory)
batch_size = 16
val_datagen = ImageDataGenerator(
    rescale=1. / 255)
validation_generator = val_datagen.flow_from_directory(
    testing_folder,
    target_size=(img_size, img_size),
    batch_size=batch_size,
    shuffle=False,
    class_mode='categorical')
Then you can make the model generate all predictions over your entire dataset using the model.predict_generator() method:
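# Number of steps needed to go through the whole test set exactly once
# (with an arbitrary value the predictions would not line up with validation_generator.classes)
steps = (validation_generator.samples + batch_size - 1) // batch_size
predictions = model.predict_generator(validation_generator, steps=steps)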
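The number of steps matters here: predictions are produced for at most steps * batch_size images, so the generator has to cover every test image exactly once for the predictions to line up with the true labels. A minimal sketch of that calculation in plain Python follows; the helper name and the sample counts are made up for illustration.

```python
import math

def steps_for_one_pass(num_samples, batch_size):
    """Number of batches needed to see every sample exactly once."""
    # The last batch may be smaller than batch_size, hence the ceiling.
    return int(math.ceil(num_samples / float(batch_size)))

# Hypothetical dataset sizes, for illustration only
print(steps_for_one_pass(1000, 16))  # 63 (62 full batches plus one batch of 8)
print(steps_for_one_pass(1024, 16))  # 64
```

With too few steps some images are never predicted; with too many the generator wraps around and some images are predicted twice.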
And finally create a confusion matrix using the metrics.confusion_matrix() method from the sklearn package:
import numpy as np
from sklearn import metrics
# Convert the per-class scores into predicted class indices
val_preds = np.argmax(predictions, axis=-1)
val_trues = validation_generator.classes
cm = metrics.confusion_matrix(val_trues, val_preds)
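To make clear what the confusion matrix contains, here is a small sketch that builds one by hand in plain Python. The helper name confusion_matrix_py and the toy labels are made up for illustration. It uses the same convention as sklearn: rows are true classes and columns are predicted classes.

```python
def confusion_matrix_py(y_true, y_pred, num_classes):
    """cm[i][j] = number of samples with true class i predicted as class j."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm

# Toy labels, for illustration only
y_true = [0, 1, 2, 2, 2]
y_pred = [0, 0, 2, 2, 1]
cm = confusion_matrix_py(y_true, y_pred, num_classes=3)
for row in cm:
    print(row)
# [1, 0, 0]
# [1, 0, 0]
# [0, 1, 2]
```

The diagonal holds the correctly classified samples; everything off the diagonal is a misclassification.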
Or get the precision, recall and F1 score of every class using the metrics.precision_recall_fscore_support() method from sklearn (with average=None, which is the default, it outputs the metrics for each class separately):
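# Class names ordered by class index (val_trues and val_preds contain indices, not names)
class_indices = validation_generator.class_indices
class_names = sorted(class_indices, key=class_indices.get)
precisions, recalls, f1_scores, supports = metrics.precision_recall_fscore_support(
    val_trues, val_preds, labels=list(range(len(class_names))), average=None)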
I haven't tested it, but I guess this will help you.
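For reference, the per-class values can also be derived directly from a confusion matrix: precision is the diagonal entry divided by its column sum, recall is the diagonal entry divided by its row sum, and F1 is the harmonic mean of the two. The sketch below does this in plain Python, assuming rows are true classes and columns are predicted classes. The function name and the toy matrix are made up for illustration, and setting undefined ratios to 0.0 is a choice made here, not something taken from the answer above.

```python
def per_class_metrics(cm):
    """Return one (precision, recall, f1, support) tuple per class."""
    n = len(cm)
    results = []
    for k in range(n):
        tp = cm[k][k]
        predicted_k = sum(cm[i][k] for i in range(n))  # column sum
        actual_k = sum(cm[k])                          # row sum (support)
        # Undefined ratios (class never predicted or never present) become 0.0
        precision = tp / float(predicted_k) if predicted_k else 0.0
        recall = tp / float(actual_k) if actual_k else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        results.append((precision, recall, f1, actual_k))
    return results

# Toy confusion matrix, for illustration only
toy_cm = [[1, 0, 0],
          [1, 0, 0],
          [0, 1, 2]]
for k, (p, r, f, s) in enumerate(per_class_metrics(toy_cm)):
    print("class %d: precision=%.2f recall=%.2f f1=%.2f support=%d" % (k, p, r, f, s))
# class 0: precision=0.50 recall=1.00 f1=0.67 support=1
# class 1: precision=0.00 recall=0.00 f1=0.00 support=1
# class 2: precision=1.00 recall=0.67 f1=0.80 support=3
```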
Have a look at sklearn.metrics.classification_report:
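import numpy as np
from sklearn.metrics import classification_report
# model.predict returns per-class scores; convert them to class indices
y_pred = np.argmax(model.predict(x_test), axis=-1)
# y_true must also hold class indices (apply np.argmax to one-hot labels)
print(classification_report(y_true, y_pred))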
gives you something like
             precision    recall  f1-score   support
    class 0       0.50      1.00      0.67         1
    class 1       0.00      0.00      0.00         1
    class 2       1.00      0.67      0.80         3
avg / total       0.70      0.60      0.61         5
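The last row of the report is not a plain mean of the rows above it. The numbers are consistent with a support-weighted average of the per-class values, which can be checked by hand. The sketch below uses the values from the table, with 0.67 assumed to be a rounded 2/3; the helper name is made up for illustration.

```python
def weighted_average(values, supports):
    """Average of values, each weighted by the number of true samples of its class."""
    total = sum(supports)
    return sum(v * s for v, s in zip(values, supports)) / float(total)

# Per-class values read off the report (0.67 assumed to be 2/3)
precision = [0.5, 0.0, 1.0]
recall = [1.0, 0.0, 2.0 / 3.0]
f1 = [2.0 / 3.0, 0.0, 0.8]
support = [1, 1, 3]

print("%.2f" % weighted_average(precision, support))  # 0.70
print("%.2f" % weighted_average(recall, support))     # 0.60
print("%.2f" % weighted_average(f1, support))         # 0.61
```

Because of the weighting, classes with many samples dominate this row, so the per-class rows are the ones to look at when the classes are imbalanced.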