I am attempting to calculate the true positive rate, etc., from a binary confusion matrix and write the results to a CSV file.
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import csv
from sklearn.metrics import confusion_matrix
AllBinary = pd.read_csv('BinaryData.csv')
y_test = AllBinary['Binary_ac']
y_pred = AllBinary['Binary_pred']
cm = confusion_matrix(y_test, y_pred)
stats = pd.DataFrame()
TP = cm[0][0]
FP = cm[0][1]
FN = cm[1][0]
TN = cm[1][1]
stats['TruePositive'] = TP
stats['TrueNegative'] = TN
stats['FalsePositive'] = FP
stats['FalseNegative'] = FN
print(TP)
print(TN)
print(FP)
print(FN)
stats.to_csv('C:/out/' + 'BinaryStats' + '.csv', header = True)
The print results show that the basic confusion matrix stats are calculated as follows:
210483
153902
32845
10788
The csv output creates the headings, but the results are blank. What am I doing incorrectly?
Update:
print(stats)
Empty DataFrame
Columns: [TruePositive, TrueNegative, FalsePositive, FalseNegative]
The problem is that assigning a scalar value to a new column does not add a row to an empty DataFrame. The column is created, but there are no rows to hold the value:
In [55]:
stats = pd.DataFrame()
stats['TruePositive'] = 210483
stats
Out[55]:
Empty DataFrame
Columns: [TruePositive]
Index: []
Instead, pass the desired values to the DataFrame constructor, wrapping each scalar in a list so that it becomes a one-row column:
In [62]:
TP = 210483
FP = 153902
FN = 32845
TN = 10788
stats = pd.DataFrame({'TruePositive':[TP], 'TrueNegative':[TN], 'FalsePositive':[FP], 'FalseNegative':[FN]})
stats
Out[62]:
   FalseNegative  FalsePositive  TrueNegative  TruePositive
0          32845         153902         10788        210483
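As a minimal sketch of how the one-row frame ends up in the CSV, the snippet below writes to an in-memory buffer rather than to the `C:/out/` path from the question, so the text can be inspected directly. The names `buffer` and `csv_text` are illustrative, and `index=False` is an optional choice that drops the row label column.

```python
import io
import pandas as pd

# Counts as used in the example above (illustrative values).
TP, FP, FN, TN = 210483, 153902, 32845, 10788

# Wrapping each scalar in a list gives the frame exactly one row.
stats = pd.DataFrame({
    'TruePositive': [TP],
    'TrueNegative': [TN],
    'FalsePositive': [FP],
    'FalseNegative': [FN],
})

# Write to an in-memory buffer instead of a file so the result is easy to inspect.
# index=False leaves out the row label column.
buffer = io.StringIO()
stats.to_csv(buffer, index=False)
csv_text = buffer.getvalue()
print(csv_text)
```

Passing a file path to `to_csv` instead of the buffer should produce the same content on disk.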
Or add a dummy row first, and then your column assignments will work:
In [71]:
stats = pd.DataFrame()
# DataFrame.append was removed in pandas 2.0; pd.concat does the same job
stats = pd.concat([stats, pd.DataFrame([['dummy']])], ignore_index=True)
stats['TruePositive'] = TP
stats['TrueNegative'] = TN
stats['FalsePositive'] = FP
stats['FalseNegative'] = FN
stats
Out[71]:
       0  TruePositive  TrueNegative  FalsePositive  FalseNegative
0  dummy        210483         10788         153902          32845
You can then remove the dummy column by calling drop:
In [72]:
stats.drop(0, axis=1)
Out[72]:
   TruePositive  TrueNegative  FalsePositive  FalseNegative
0        210483         10788         153902          32845
Your attempt failed because the initial df was empty. Assigning a scalar to a new column sets that value for every existing row, and since your df has no rows, there is nothing to set. You end up with the column headings and no data.
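The broadcasting behavior described above can be checked with a small sketch. The names `empty` and `one_row` are illustrative. The only difference between the two frames is that the second is created with one index label, which gives the scalar a row to land in.

```python
import pandas as pd

# No index labels: a scalar has no rows to broadcast onto.
empty = pd.DataFrame()
empty['TruePositive'] = 210483

# One index label: the same assignment fills that single row.
one_row = pd.DataFrame(index=[0])
one_row['TruePositive'] = 210483

print(len(empty), len(one_row))
```

Starting from `pd.DataFrame(index=[0])` also avoids the need for a dummy column that has to be dropped afterwards.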
Another way would be to create the df with a single row (here I put NaN):
In [77]:
stats = pd.DataFrame([np.nan])  # np.NaN was removed in NumPy 2.0
stats['TruePositive'] = TP
stats['TrueNegative'] = TN
stats['FalsePositive'] = FP
stats['FalseNegative'] = FN
stats.dropna(axis=1)
Out[77]:
   TruePositive  TrueNegative  FalsePositive  FalseNegative
0        210483         10788         153902          32845
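One thing worth double-checking, separately from the empty DataFrame issue, is which cell of the matrix is which. scikit-learn puts true labels on the rows and predicted labels on the columns, in sorted label order, so for labels 0 and 1 the layout is [[TN, FP], [FN, TP]] rather than TP at cm[0][0]. The sketch below assumes the labels are 0 and 1 with 1 as the positive class, which the question does not state, and uses a hard-coded array built from the question's printed counts as a stand-in for the `confusion_matrix` output. The rate column names are illustrative.

```python
import numpy as np
import pandas as pd

# Stand-in for confusion_matrix(y_test, y_pred), using the question's printed counts.
cm = np.array([[210483, 32845],
               [10788, 153902]])

# Assumption: labels are 0 and 1, so the layout is [[TN, FP], [FN, TP]].
tn, fp, fn, tp = cm.ravel()

# One-row frame of rates; each value is wrapped in a list so a row is created.
rates = pd.DataFrame({
    'TruePositiveRate': [tp / (tp + fn)],
    'TrueNegativeRate': [tn / (tn + fp)],
    'FalsePositiveRate': [fp / (fp + tn)],
    'FalseNegativeRate': [fn / (fn + tp)],
})
print(rates)
```

If the positive class is actually the first label, the unpacking order would need to be reversed.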