Technology Sharing

F1-score

2024-07-12



The F1-score is a metric for measuring the performance of classification models, and it is particularly useful on highly imbalanced data sets. It ranges from 0 to 1; the higher the value, the better the performance.
Calculation formula:
The F1-score is the harmonic mean of precision and recall.
$$\text{F1-score} = 2 \times \frac{\text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}}$$
So what are precision and recall?
Precision: among the samples the model predicts as positive, the proportion that are actually positive.
Recall: among the samples that are actually positive, the proportion the model correctly predicts as positive.
Calculated as follows:
$$\text{Precision} = \frac{TP}{TP+FP}$$
$$\text{Recall} = \frac{TP}{TP+FN}$$
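To see why the harmonic mean matters, here is a minimal Python sketch (the function name `f1` is just an illustration): because the harmonic mean is dominated by the smaller of the two values, a model cannot get a high F1-score by excelling at only one of precision or recall.

```python
def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Balanced precision and recall give a matching F1-score...
print(f1(0.9, 0.9))  # 0.9
# ...but a very low recall drags the F1-score down, even with high precision.
print(f1(0.9, 0.1))  # 0.18
```

Compare this with the arithmetic mean, which would report a misleading 0.5 in the second case.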
For example:

| Actual class | Predicted class |
| ------------ | --------------- |
| 1            | 1               |
| 1            | 0               |
| 0            | 1               |
| 1            | 1               |
| 0            | 0               |
What are TP, FP, and FN?
TP (True Positive): the model predicts positive and the actual class is positive. There are 2 such rows in the example.
FP (False Positive): the model predicts positive but the actual class is negative. There is 1 such row.
FN (False Negative): the model predicts negative but the actual class is positive. There is 1 such row.
$$\text{Precision} = \frac{TP}{TP+FP} = \frac{2}{2+1} \approx 0.67$$
$$\text{Recall} = \frac{TP}{TP+FN} = \frac{2}{2+1} \approx 0.67$$
$$\text{F1-score} = 2 \times \frac{\text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}} = 2 \times \frac{0.67 \times 0.67}{0.67 + 0.67} \approx 0.67$$
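The worked example above can be reproduced in a few lines of Python. This is a minimal sketch that counts TP, FP, and FN directly from the five (actual, predicted) pairs in the table; the variable names are illustrative.

```python
# The five rows of the example table: actual labels and model predictions.
y_true = [1, 1, 0, 1, 0]
y_pred = [1, 0, 1, 1, 0]

tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)  # 2
fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)  # 1
fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)  # 1

precision = tp / (tp + fp)                          # 2/3 ≈ 0.67
recall = tp / (tp + fn)                             # 2/3 ≈ 0.67
f1 = 2 * precision * recall / (precision + recall)  # ≈ 0.67

print(f"Precision={precision:.2f}, Recall={recall:.2f}, F1={f1:.2f}")
```

Running it prints `Precision=0.67, Recall=0.67, F1=0.67`, matching the hand calculation.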

What questions do recall and precision answer?
Recall looks at the samples that are **actually positive** and asks how many of them were **predicted as positive**.
So it answers: among all the truly positive samples, how many did the model correctly identify?
Precision looks at the samples **predicted as positive** and asks how many of them are **actually positive**.
So it answers: among all the samples the model predicted as positive, how many actually are?