Multi-Class Performance Metrics
Most of the literature evaluates activity recognition algorithms with standard multi-class performance metrics. The module pyActLearn.performance provides functions that compute the confusion matrix and derive per-class performance as well as overall micro and macro performance.
- pyActLearn.performance.get_confusion_matrix(num_classes, label, predicted)
Calculate the confusion matrix based on ground truth and predicted results.
Parameters:
- num_classes (int) – Number of classes
- label (numpy.array) – Ground truth labels
- predicted (numpy.array) – Predicted labels
Returns: Confusion matrix (num_classes by num_classes)
Return type: numpy.array
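A minimal usage sketch. The sample arrays are made up, and the assumptions here are that labels are integer-encoded as 0 .. num_classes-1 and that rows index ground truth while columns index predictions; check the source for the exact conventions.

    import numpy as np
    from pyActLearn.performance import get_confusion_matrix

    # Hypothetical ground truth and predictions for a 3-class problem,
    # integer-encoded 0 .. num_classes-1 (an assumption about input format)
    label = np.array([0, 1, 2, 2, 1, 0, 2])
    predicted = np.array([0, 2, 2, 2, 1, 0, 1])

    matrix = get_confusion_matrix(3, label, predicted)
    # Assumed layout: matrix[i][j] counts samples of true class i
    # that were predicted as class j
    print(matrix)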
- pyActLearn.performance.get_performance_array(confusion_matrix)
Calculate per-class and overall performance arrays based on the given confusion matrix.
[Sokolova2009] provides a detailed analysis of multi-class performance metrics.
Per-class performance metrics:
- True_Positive: number of samples that belong to the class and are classified correctly
- True_Negative: number of samples that do not belong to the class and are correctly classified as not belonging to it
- False_Positive: number of samples that do not belong to the class but are classified as the class
- False_Negative: number of samples that belong to the class but are not classified as the class
Measure (each is illustrated by the numpy sketch after this list):
- Accuracy: Overall, how often is the classifier correct? (TP + TN) / (TP + TN + FP + FN)
- Misclassification: Overall, how often is it wrong? (FP + FN) / (TP + TN + FP + FN)
- Recall: When it’s actually yes, how often does it predict yes? TP / (TP + FN)
- False Positive Rate: When it’s actually no, how often does it predict yes? FP / (FP + TN)
- Specificity: When it’s actually no, how often does it predict no? TN / (FP + TN)
- Precision: When it predicts yes, how often is it correct? TP / (TP + FP)
- Prevalence: How often does the yes condition actually occur in our sample? Total(class) / Total(samples)
- F(1) Measure: 2 * (precision * recall) / (precision + recall)
- G Measure: sqrt(precision * recall)
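All of these measures follow from the four per-class counts. The sketch below shows the arithmetic in plain numpy; it mirrors the definitions above but is not the library implementation, and zero denominators are not guarded.

    import numpy as np

    def per_class_metrics(cm):
        """Derive per-class measures from a square confusion matrix cm,
        assuming cm[i][j] counts samples of true class i predicted as j.
        A sketch of the arithmetic, not the library code."""
        cm = np.asarray(cm, dtype=float)
        tp = np.diag(cm)              # samples of the class classified correctly
        fp = cm.sum(axis=0) - tp      # other classes predicted as this class
        fn = cm.sum(axis=1) - tp      # this class predicted as something else
        tn = cm.sum() - tp - fp - fn  # everything else
        return {
            'accuracy': (tp + tn) / (tp + tn + fp + fn),
            'misclassification': (fp + fn) / (tp + tn + fp + fn),
            'recall': tp / (tp + fn),
            'false_positive_rate': fp / (fp + tn),
            'specificity': tn / (fp + tn),
            'precision': tp / (tp + fp),
            'prevalence': cm.sum(axis=1) / cm.sum(),
            'f1': 2 * tp / (2 * tp + fp + fn),  # equals 2PR / (P + R)
            'g': np.sqrt((tp / (tp + fp)) * (tp / (tp + fn))),
        }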
Overall performance metrics for the classifier (see the micro/macro sketch after the note below):
- Average Accuracy: The average per-class effectiveness of a classifier
- Weighted Accuracy: The average effectiveness of a classifier weighted by the prevalence of each class
- Precision (micro): Agreement of the data class labels with those of a classifier, calculated from sums of per-text decisions
- Recall (micro): Effectiveness of a classifier to identify class labels, calculated from sums of per-text decisions
- F-Score (micro): Relationship between data’s positive labels and those given by a classifier, based on sums of per-text decisions
- Precision (macro): An average per-class agreement of the data class labels with those of a classifier
- Recall (macro): An average per-class effectiveness of a classifier to identify class labels
- F-Score (macro): Relationship between data’s positive labels and those given by a classifier, based on a per-class average
- Exact Matching Ratio: The average per-text exact classification
Note
In multi-class single-label classification, Micro-Precision == Micro-Recall == Micro-FScore == Exact Matching Ratio, because each input is classified into one and only one class.
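A sketch of the micro/macro distinction and of why the equality in the note holds; the confusion matrix is made up, and rows are assumed to index ground truth.

    import numpy as np

    # Made-up 3-class confusion matrix: cm[i][j] = true class i predicted as j
    cm = np.array([[5., 1., 0.],
                   [2., 3., 1.],
                   [0., 1., 4.]])
    tp = np.diag(cm)
    fp = cm.sum(axis=0) - tp
    fn = cm.sum(axis=1) - tp

    # Micro: pool the counts over all classes, then divide once.
    micro_precision = tp.sum() / (tp.sum() + fp.sum())
    micro_recall = tp.sum() / (tp.sum() + fn.sum())
    # Single-label multi-class: every false positive for one class is a
    # false negative for another, so fp.sum() == fn.sum() and both micro
    # values equal the exact matching ratio tp.sum() / cm.sum().

    # Macro: compute the metric per class first, then average the results.
    macro_precision = (tp / (tp + fp)).mean()
    macro_recall = (tp / (tp + fn)).mean()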
Parameters:
- confusion_matrix (numpy.array) – Confusion matrix (num_classes by num_classes)
Returns: Tuple of overall performance and per-class performance
Return type: tuple of numpy.array
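Putting the two functions together. The unpacking order follows the documented return value (overall first, then per-class); the exact layout of each returned array is best checked against the source.

    import numpy as np
    from pyActLearn.performance import get_confusion_matrix, get_performance_array

    # Made-up labels for a 3-class problem, integer-encoded as above
    label = np.array([0, 1, 2, 2, 1, 0, 2])
    predicted = np.array([0, 2, 2, 2, 1, 0, 1])

    cm = get_confusion_matrix(3, label, predicted)
    overall, per_class = get_performance_array(cm)
    print(overall)    # overall (micro/macro) metrics for the classifier
    print(per_class)  # one set of metrics per class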