Concept

Inter-rater reliability

In statistics, inter-rater reliability (also called inter-rater agreement, inter-rater concordance, inter-observer reliability, or inter-coder reliability, among similar names) is the degree of agreement among independent observers who rate, code, or assess the same phenomenon. Assessment tools that rely on ratings must exhibit good inter-rater reliability; otherwise they are not valid tests.

A number of statistics can be used to determine inter-rater reliability, and different statistics are appropriate for different types of measurement. Options include the joint probability of agreement; chance-corrected agreement statistics such as Cohen's kappa, Scott's pi, and Fleiss' kappa; and inter-rater correlation, the concordance correlation coefficient, intra-class correlation, and Krippendorff's alpha.

There are several operational definitions of "inter-rater reliability," reflecting different viewpoints about what counts as reliable agreement between raters. Three operational definitions of agreement are: (1) reliable raters agree with the "official" rating of a performance; (2) reliable raters agree with each other about the exact ratings to be awarded; (3) reliable raters agree about which performance is better and which is worse. These combine with two operational definitions of behavior.

The joint probability of agreement is the simplest and least robust measure. It is estimated as the percentage of the time the raters agree in a nominal or categorical rating system. It does not take into account that agreement may happen solely by chance. Whether there is a need to "correct" for chance agreement is debated; some suggest that any such adjustment should be based on an explicit model of how chance and error affect raters' decisions. When the number of categories is small (e.g. 2 or 3), the likelihood that two raters agree by pure chance increases dramatically.
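To make the difference between raw agreement and chance-corrected agreement concrete, the following is a minimal Python sketch, not a reference implementation. It computes the joint probability of agreement for two raters, the agreement expected if the raters labelled items independently with their observed category frequencies, and the usual two-rater form of Cohen's kappa built from those two quantities. The function names, the example ratings, and the `uniform_chance_agreement` helper (which assumes both raters guess uniformly at random over k categories) are illustrative assumptions, not part of the source.

```python
from collections import Counter


def joint_probability_of_agreement(ratings_a, ratings_b):
    """Fraction of items on which the two raters give the same label."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("need two non-empty rating lists of equal length")
    matches = sum(a == b for a, b in zip(ratings_a, ratings_b))
    return matches / len(ratings_a)


def expected_chance_agreement(ratings_a, ratings_b):
    """Agreement expected if the raters were independent, each keeping
    their own observed category frequencies (the Cohen's kappa model)."""
    n = len(ratings_a)
    counts_a, counts_b = Counter(ratings_a), Counter(ratings_b)
    return sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)


def cohen_kappa(ratings_a, ratings_b):
    """Chance-corrected agreement: (p_o - p_e) / (1 - p_e)."""
    p_o = joint_probability_of_agreement(ratings_a, ratings_b)
    p_e = expected_chance_agreement(ratings_a, ratings_b)
    if p_e == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (p_o - p_e) / (1.0 - p_e)


def uniform_chance_agreement(num_categories):
    """Assumption for illustration: both raters guess uniformly at random,
    so the chance of a match is 1 / k."""
    return 1.0 / num_categories


# Hypothetical ratings of ten items by two raters.
rater_a = ["yes", "yes", "no", "yes", "no", "no", "yes", "yes", "no", "yes"]
rater_b = ["yes", "no", "no", "yes", "no", "yes", "yes", "yes", "no", "yes"]

print(joint_probability_of_agreement(rater_a, rater_b))  # 0.8
print(round(cohen_kappa(rater_a, rater_b), 2))           # 0.58
print(uniform_chance_agreement(2), uniform_chance_agreement(10))  # 0.5 0.1
```

In this made-up example the raters agree on 80% of items, yet roughly half of that agreement would be expected by chance with only two categories, so the chance-corrected value is noticeably lower. Under the uniform-guessing assumption, chance agreement falls from 0.5 with two categories to 0.1 with ten, which illustrates why raw percent agreement is hardest to interpret when few categories are used.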

Official source

https://en.wikipedia.org/wiki/Inter-rater_reliability

About this result

This page is automatically generated and may contain information that is not correct, complete, up-to-date, or relevant to your search query. The same applies to every other page on this website. Please make sure to verify the information with EPFL's official sources.
