Impact of type of scoring in cross_validation

mab66 · 8 July 2021 13:29

Hello,
I don’t fully understand the difference between scoring="accuracy" and scoring="balanced_accuracy". The two give very different results in question 6 of the wrap-up quiz.

echidne · 8 July 2021 15:42

Hi mab66,
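balanced_accuracy is the one to use when you have an imbalanced dataset.
accuracy is (TP + TN) / (TP + TN + FP + FN), where TP = True Positives, TN = True Negatives, FP = False Positives, and FN = False Negatives.
balanced_accuracy is (TP / (TP + FN) + TN / (FP + TN)) / 2, i.e. the average of the recall on the positive class and the recall on the negative class, so the majority class cannot dominate the score.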
To get a better idea, take a look at this tutorial. It was written for R, but I think it’s clear enough.
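
To make the two formulas concrete, here is a small sketch in Python. The labels are made up for illustration (90 negatives, 10 positives, and a "classifier" that always predicts the majority class); they are not the quiz data. It computes both scores by hand from the confusion matrix and compares them with scikit-learn's accuracy_score and balanced_accuracy_score.

```python
import numpy as np
from sklearn.metrics import (
    accuracy_score,
    balanced_accuracy_score,
    confusion_matrix,
)

# Toy imbalanced labels (assumed for illustration): 90 negatives, 10 positives.
y_true = np.array([0] * 90 + [1] * 10)
# A dummy "classifier" that always predicts the majority class 0.
y_pred = np.zeros_like(y_true)

# For binary labels, ravel() returns the counts in the order TN, FP, FN, TP.
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()

# The two formulas from the reply above, computed by hand.
accuracy = (tp + tn) / (tp + tn + fp + fn)
balanced_accuracy = (tp / (tp + fn) + tn / (fp + tn)) / 2

print("accuracy:", accuracy, accuracy_score(y_true, y_pred))
print(
    "balanced accuracy:",
    balanced_accuracy,
    balanced_accuracy_score(y_true, y_pred),
)
```

On this toy data the plain accuracy looks good even though no positive sample is ever detected, while the balanced accuracy drops to chance level. The same pair of metrics is what scoring="accuracy" and scoring="balanced_accuracy" select in cross-validation, which is probably why the results differ so much on an imbalanced target.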