F-score

In statistical analysis of binary classification, the F-score or F-measure is a measure of a test's accuracy. It is calculated from the precision and recall of the test. Precision is the number of true positive results divided by the number of all results reported as positive, including those not identified correctly. Recall is the number of true positive results divided by the number of all samples that should have been identified as positive. Precision is also known as positive predictive value, and recall is also known as sensitivity in diagnostic binary classification.

The $F_1$ score is the harmonic mean of the precision and recall, so it represents both symmetrically in one metric. The more generic $F_\beta$ score applies additional weights, valuing one of precision or recall more than the other. The highest possible value of an F-score is 1.0, indicating perfect precision and recall, and the lowest possible value is 0, reached if either precision or recall is zero.

The name F-measure is believed to derive from a different F function in Van Rijsbergen's book, adopted when the measure was introduced at the Fourth Message Understanding Conference (MUC-4, 1992).

The traditional F-measure or balanced F-score ($F_1$ score) is the harmonic mean of precision and recall:

$$F_1 = \frac{2}{\mathrm{recall}^{-1} + \mathrm{precision}^{-1}} = 2 \cdot \frac{\mathrm{precision} \cdot \mathrm{recall}}{\mathrm{precision} + \mathrm{recall}}$$

A more general F-score, $F_\beta$, uses a positive real factor $\beta$, chosen such that recall is considered $\beta$ times as important as precision:

$$F_\beta = (1 + \beta^2) \cdot \frac{\mathrm{precision} \cdot \mathrm{recall}}{\beta^2 \cdot \mathrm{precision} + \mathrm{recall}}$$

In terms of type I and type II errors this becomes:

$$F_\beta = \frac{(1 + \beta^2) \cdot \mathrm{TP}}{(1 + \beta^2) \cdot \mathrm{TP} + \beta^2 \cdot \mathrm{FN} + \mathrm{FP}}$$

where TP, FP, and FN are the numbers of true positives, false positives (type I errors), and false negatives (type II errors). Two commonly used values for $\beta$ are 2, which weighs recall higher than precision, and 0.5, which weighs recall lower than precision.

The F-measure was derived so that $F_\beta$ "measures the effectiveness of retrieval with respect to a user who attaches $\beta$ times as much importance to recall as precision". It is based on Van Rijsbergen's effectiveness measure

$$E = 1 - \left( \frac{\alpha}{P} + \frac{1 - \alpha}{R} \right)^{-1}$$

where $P$ is precision and $R$ is recall. Their relationship is $F_\beta = 1 - E$, where $\alpha = \frac{1}{1 + \beta^2}$. This connects the measure to the field of binary classification, where recall is often termed "sensitivity".
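As an illustration of the weighted score described above, the following is a minimal sketch that computes it either from confusion-matrix counts or from precision and recall. The function names `f_beta` and `f_beta_from_precision_recall` and the example counts are hypothetical, and returning 0.0 when the denominator is zero is an assumed convention rather than part of the definition.

```python
def f_beta(tp: int, fp: int, fn: int, beta: float = 1.0) -> float:
    """F-beta score from true positives, false positives, and false negatives.

    Assumption: returns 0.0 when the score is undefined (zero denominator).
    """
    if beta <= 0:
        raise ValueError("beta must be a positive real number")
    b2 = beta * beta
    denom = (1 + b2) * tp + b2 * fn + fp
    return (1 + b2) * tp / denom if denom else 0.0


def f_beta_from_precision_recall(precision: float, recall: float,
                                 beta: float = 1.0) -> float:
    """Same score, computed from precision and recall directly."""
    if beta <= 0:
        raise ValueError("beta must be a positive real number")
    b2 = beta * beta
    denom = b2 * precision + recall
    return (1 + b2) * precision * recall / denom if denom else 0.0


# Hypothetical counts: 8 true positives, 2 false positives, 4 false negatives,
# so precision = 8/10 and recall = 8/12.
for beta in (0.5, 1.0, 2.0):
    print(beta, round(f_beta(8, 2, 4, beta), 4))
```

With these made-up counts, recall is lower than precision, so the score that favors recall (factor 2, about 0.690) comes out below the balanced score (about 0.727), which in turn is below the score that favors precision (factor 0.5, about 0.769).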
The precision-recall curve, and thus the F-score, explicitly depends on the ratio of positive to negative test cases.
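The dependence on class ratio can be made concrete with a small sketch. It assumes a hypothetical classifier whose true positive rate and false positive rate stay fixed while the ratio of positive to negative test cases changes; the function name `f1_at_ratio` and the rates used are illustrative only.

```python
def f1_at_ratio(tpr: float, fpr: float, ratio: float) -> float:
    """Balanced F-score for fixed TPR and FPR at a given positive:negative ratio.

    Counts are expressed per negative test case: with N negatives there are
    ratio * N positives, and N cancels out of the final quotient.
    """
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    tp = tpr * ratio          # positives correctly identified
    fn = (1 - tpr) * ratio    # positives missed
    fp = fpr                  # negatives wrongly flagged
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Same hypothetical classifier (TPR 0.9, FPR 0.1) on increasingly
# imbalanced test sets.
for ratio in (1.0, 0.1, 0.01):
    print(ratio, round(f1_at_ratio(0.9, 0.1, ratio), 4))
```

Under these assumptions the score falls from 0.9 on a balanced test set to roughly 0.62 at one positive per ten negatives and roughly 0.15 at one per hundred, even though the classifier itself is unchanged.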
