So I made a model with tensorflow keras and it seems to work ok. However, my supervisor said it would be useful to calculate the Matthews correlation coefficient, as well as the accuracy and loss it already calculates.
my model is very similar to the code in the tutorial here (https://www.tensorflow.org/tutorials/keras/basic_classification) except with a much smaller dataset.
is there a prebuilt function or would I have to get the prediction for each test and calculate it by hand?
To calculate the MCC of the model, we can use the following formula: MCC = (TP*TN – FP*FN) / √(TP+FP)(TP+FN)(TN+FP)(TN+FN) MCC = (15*375-5*5) / √(15+5)(15+5)(375+5)(375+5) MCC = 0.7368.
A Matthews correlation coefficient close to +1, in fact, means having high values for all the other confusion matrix metrics. The same cannot be said for balanced accuracy, markedness, bookmaker informedness, accuracy and F1 score.
There is nothing out of the box but we can calculate it from the formula in a custom metric.
The basic classification link you supplied is for a multi-class categorisation problem whereas the Matthews Correlation Coefficient is specifically for binary classification problems.
Assuming your model is structured in the "normal" way for such problems (i.e. y_pred is a number between 0 and 1 for each record representing predicted probability of a "True" and labels are each exactly a 0 or 1 representing ground truth "False" and "True" respectively) then we can add in an MCC metric as follows:
# if y_pred > threshold we predict true. 
# Sometimes we set this to something different to 0.5 if we have unbalanced categories
threshold = 0.5  
def mcc_metric(y_true, y_pred):
  predicted = tf.cast(tf.greater(y_pred, threshold), tf.float32)
  true_pos = tf.math.count_nonzero(predicted * y_true)
  true_neg = tf.math.count_nonzero((predicted - 1) * (y_true - 1))
  false_pos = tf.math.count_nonzero(predicted * (y_true - 1))
  false_neg = tf.math.count_nonzero((predicted - 1) * y_true)
  x = tf.cast((true_pos + false_pos) * (true_pos + false_neg) 
      * (true_neg + false_pos) * (true_neg + false_neg), tf.float32)
  return tf.cast((true_pos * true_neg) - (false_pos * false_neg), tf.float32) / tf.sqrt(x)
which we can include in our model.compile call:
model.compile(optimizer='adam',
              loss=tf.keras.losses.binary_crossentropy,
              metrics=['accuracy', mcc_metric])
Here is a complete worked example where we categorise mnist digits depending on whether they are greater than 4:
mnist = tf.keras.datasets.mnist
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0
y_train, y_test = 0 + (y_train > 4), 0 + (y_test > 4)
def mcc_metric(y_true, y_pred):
  predicted = tf.cast(tf.greater(y_pred, 0.5), tf.float32)
  true_pos = tf.math.count_nonzero(predicted * y_true)
  true_neg = tf.math.count_nonzero((predicted - 1) * (y_true - 1))
  false_pos = tf.math.count_nonzero(predicted * (y_true - 1))
  false_neg = tf.math.count_nonzero((predicted - 1) * y_true)
  x = tf.cast((true_pos + false_pos) * (true_pos + false_neg) 
      * (true_neg + false_pos) * (true_neg + false_neg), tf.float32)
  return tf.cast((true_pos * true_neg) - (false_pos * false_neg), tf.float32) / tf.sqrt(x)
model = tf.keras.models.Sequential([
  tf.keras.layers.Flatten(input_shape=(28, 28)),
  tf.keras.layers.Dense(128, activation='relu'),
  tf.keras.layers.Dropout(0.2),
  tf.keras.layers.Dense(10, activation='relu'),
  tf.keras.layers.Dense(1, activation='sigmoid')
])
model.compile(optimizer='adam',
              loss=tf.keras.losses.binary_crossentropy,
              metrics=['accuracy', mcc_metric])
model.fit(x_train, y_train, epochs=5)
model.evaluate(x_test, y_test)
output:
Epoch 1/5
60000/60000 [==============================] - 7s 113us/sample - loss: 0.1391 - acc: 0.9483 - mcc_metric: 0.8972
Epoch 2/5
60000/60000 [==============================] - 6s 96us/sample - loss: 0.0722 - acc: 0.9747 - mcc_metric: 0.9495
Epoch 3/5
60000/60000 [==============================] - 6s 97us/sample - loss: 0.0576 - acc: 0.9797 - mcc_metric: 0.9594
Epoch 4/5
60000/60000 [==============================] - 6s 96us/sample - loss: 0.0479 - acc: 0.9837 - mcc_metric: 0.9674
Epoch 5/5
60000/60000 [==============================] - 6s 95us/sample - loss: 0.0423 - acc: 0.9852 - mcc_metric: 0.9704
10000/10000 [==============================] - 1s 58us/sample - loss: 0.0582 - acc: 0.9818 - mcc_metric: 0.9639
[0.05817381642502733, 0.9818, 0.9638971]
Since the asker accepted a Python version from sklearn, here is Stewart_Rs answer in pure Python:
from math import sqrt
def mcc(tp, fp, tn, fn):
    # https://stackoverflow.com/a/56875660/992687
    x = (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    return ((tp * tn) - (fp * fn)) / sqrt(x)
It has the advanatage of being general, not just for evaluating binary classifications.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With