# Precision and recall

Summary

In pattern recognition, information retrieval, object detection and classification (machine learning), precision and recall are performance metrics that apply to data retrieved from a collection, corpus or sample space.
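Precision (also called positive predictive value) is the fraction of relevant instances among the retrieved instances. Written as a formula: $\text{Precision} = \frac{\text{relevant retrieved instances}}{\text{all retrieved instances}}$.
Recall (also known as sensitivity) is the fraction of relevant instances that were retrieved. Written as a formula: $\text{Recall} = \frac{\text{relevant retrieved instances}}{\text{all relevant instances}}$. Both precision and recall are therefore based on relevance.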
Consider a computer program for recognizing dogs (the relevant element) in a digital photograph. Upon processing a picture which contains ten cats and twelve dogs, the program identifies eight dogs. Of the eight elements identified as dogs, only five actually are dogs (true positives), while the other three are cats (false positives). Seven dogs were missed (false negatives), and seven cats were correctly excluded (true negatives). The program's precision is then 5/8 (true positives / selected elements) while its recall is 5/12 (true positives / relevant elements).
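The arithmetic of the dog example can be checked with a short sketch. The function name `precision_recall` and its argument names are illustrative choices, not part of the source, and the sketch assumes at least one element was selected and at least one is relevant (otherwise the denominators are zero).

```python
from fractions import Fraction

def precision_recall(tp, fp, fn):
    """Return (precision, recall) as exact fractions from confusion counts."""
    precision = Fraction(tp, tp + fp)  # true positives / selected elements
    recall = Fraction(tp, tp + fn)     # true positives / relevant elements
    return precision, recall

# Counts from the dog example: 5 dogs found, 3 cats mislabelled, 7 dogs missed.
precision, recall = precision_recall(tp=5, fp=3, fn=7)
print(precision, recall)  # 5/8 5/12
```

Exact fractions are used here only so that the output matches the 5/8 and 5/12 quoted in the text; floating-point division would serve equally well in practice.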
In the hypothesis-testing framing from statistics, where the null hypothesis is that a given item is irrelevant (i.e., not a dog), the absence of type I errors (perfect specificity of 100%) corresponds to perfect precision (no false positives), and the absence of type II errors (perfect sensitivity of 100%) corresponds to perfect recall (no false negatives).
More generally, recall is simply the complement of the type II error rate (i.e., one minus the type II error rate). Precision is related to the type I error rate, but in a slightly more complicated way, as it also depends upon the prior distribution of seeing a relevant vs. an irrelevant item.
The cat and dog example above contained 8 − 5 = 3 type I errors (false positives) out of 10 cats in total (the actual negatives), for a type I error rate of 3/10, and 12 − 5 = 7 type II errors (false negatives) out of 12 dogs in total, for a type II error rate of 7/12. Precision can be seen as a measure of quality, and recall as a measure of quantity.
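The link between the error rates, the prior, and the two metrics can be made concrete with a small sketch. It applies Bayes' rule to the counts of the cat and dog example; the variable names are illustrative, and the prior is simply estimated as the share of dogs in the picture, which is an assumption of this sketch.

```python
from fractions import Fraction

# Counts from the cat and dog example (22 animals in total).
tp, fp, fn, tn = 5, 3, 7, 7

type_i_rate = Fraction(fp, fp + tn)    # false positives / all cats
type_ii_rate = Fraction(fn, fn + tp)   # missed dogs / all dogs
recall = 1 - type_ii_rate              # complement of the type II error rate

# Prior probability that an item is relevant (a dog), estimated from the picture.
prior = Fraction(tp + fn, tp + fp + fn + tn)

# Bayes' rule: P(relevant | retrieved).
precision = (recall * prior) / (recall * prior + type_i_rate * (1 - prior))

print(type_i_rate, type_ii_rate, recall, precision)  # 3/10 7/12 5/12 5/8
```

With the same error rates but a different prior, `recall` is unchanged while `precision` moves, which is the dependence on the prior distribution described above.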

EE-612: Fundamentals in statistical pattern recognition

This course provides in-depth understanding of the most fundamental algorithms in statistical pattern recognition or machine learning (including Deep Learning) as well as concrete tools (as Python sou

CS-423: Distributed information systems

This course introduces the foundations of information retrieval, data mining and knowledge bases, which constitute the foundations of today's Web-based distributed information systems.

DH-406: Machine learning for DH

This course aims to introduce the basic principles of machine learning in the context of the digital humanities. We will cover both supervised and unsupervised learning techniques, and study and imple

Uncertainty coefficient

In statistics, the uncertainty coefficient, also called proficiency, entropy coefficient or Theil's U, is a measure of nominal association. It was first introduced by Henri Theil and is based on the concept of information entropy. Suppose we have samples of two discrete random variables, X and Y. By constructing the joint distribution, $P_{X,Y}(x, y)$, from which we can calculate the conditional distributions, $P_{X|Y}(x|y) = P_{X,Y}(x, y)/P_Y(y)$ and $P_{Y|X}(y|x) = P_{X,Y}(x, y)/P_X(x)$, and calculating the various entropies, we can determine the degree of association between the two variables.
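As a rough illustration, the coefficient can be computed from paired samples using the usual definition U(X|Y) = (H(X) − H(X|Y)) / H(X), which this summary does not spell out. The function names and the sample data below are hypothetical, and the sketch assumes X is not constant (otherwise H(X) is zero and the ratio is undefined).

```python
from collections import Counter
from math import log2

def entropy(counts):
    """Shannon entropy (in bits) of a distribution given as raw counts."""
    total = sum(counts)
    return -sum(c / total * log2(c / total) for c in counts if c > 0)

def uncertainty_coefficient(pairs):
    """U(X|Y) = (H(X) - H(X|Y)) / H(X) from a list of (x, y) samples."""
    joint = Counter(pairs)
    x_counts = Counter(x for x, _ in pairs)
    y_counts = Counter(y for _, y in pairs)
    h_x = entropy(x_counts.values())
    # Chain rule: H(X|Y) = H(X, Y) - H(Y).
    h_x_given_y = entropy(joint.values()) - entropy(y_counts.values())
    return (h_x - h_x_given_y) / h_x

# Hypothetical samples: X fully determined by Y, then X independent of Y.
dependent = [("a", 0), ("a", 0), ("b", 1), ("b", 1)]
independent = [("a", 0), ("a", 1), ("b", 0), ("b", 1)]
print(uncertainty_coefficient(dependent))    # 1.0
print(uncertainty_coefficient(independent))  # 0.0
```

The two extremes show the range of the measure: 1 when knowing Y removes all uncertainty about X, and 0 when Y carries no information about X.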

F-score

In statistical analysis of binary classification, the F-score or F-measure is a measure of a test's accuracy. It is calculated from the precision and recall of the test, where the precision is the number of true positive results divided by the number of all positive results, including those not identified correctly, and the recall is the number of true positive results divided by the number of all samples that should have been identified as positive.
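A minimal sketch of the calculation, assuming the standard F-beta form in which beta = 1 gives the harmonic mean of precision and recall (the F1 score). The function name `f_beta` is an illustrative choice, and the inputs reuse the precision 5/8 and recall 5/12 from the dog example in the summary.

```python
from fractions import Fraction

def f_beta(precision, recall, beta=1):
    """F-beta score; beta=1 gives the harmonic mean of precision and recall."""
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)

f1 = f_beta(Fraction(5, 8), Fraction(5, 12))
print(f1)  # 1/2
```

The same value follows directly from the counts as 2·TP / (2·TP + FP + FN) = 10 / 20. The sketch does not guard against precision and recall both being zero.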

Sensitivity and specificity

Sensitivity and specificity mathematically describe the accuracy of a test that reports the presence or absence of a condition. If individuals who have the condition are considered "positive" and those who do not are considered "negative", then sensitivity is a measure of how well a test can identify true positives and specificity is a measure of how well a test can identify true negatives: Sensitivity (true positive rate) is the probability of a positive test result, conditioned on the individual truly being positive.

Introduction to optimization on smooth manifolds: first order methods

Learn to optimize on smooth, nonlinear spaces: Join us to build your foundations (starting at "what is a manifold?") and confidently implement your first algorithm (Riemannian gradient descent).

Hydraulic Transients of Turbines: Hydroacoustic Modeling

Explores hydraulic turbine modeling, stability, and historical development, emphasizing the selection criteria for Francis turbines.

Probabilistic Retrieval Models

Covers probabilistic retrieval models, evaluation metrics, query likelihood, user relevance feedback, and query expansion.

Characterizing Fibrations in Chain Complexes

Explores the characterization of fibrations and acyclic fibrations in chain complexes.

Jean-Paul Richard Kneib, Emma Elizabeth Tolley, Tianyue Chen, Michele Bianco

The upcoming Square Kilometre Array Observatory will produce images of neutral hydrogen distribution during the epoch of reionization by observing the corresponding 21-cm signal. However, the 21-cm signal will be subject to instrumental limitations such as ...

Pedro Miguel Nunes Pereira de Almeida Reis, Paul Johanns

We investigate the failure mechanism of stopper knots, with a particular focus on the figure-8 knot as a representative example. Stopper knots are widely used in climbing, sailing, racket stringing, and sewing to maintain tension in ropes, strings, or thr ...

Within the scope of the implementation of a nuclear data pipeline aiming at producing the best possible evaluated nuclear data files, a major point is the production of relevant sensitivity coefficients when including integral benchmark information. Thanks ...