In statistics and psychometrics, reliability is the overall consistency of a measure. A measure is said to have high reliability if it produces similar results under consistent conditions:

"It is the characteristic of a set of test scores that relates to the amount of random error from the measurement process that might be embedded in the scores. Scores that are highly reliable are precise, reproducible, and consistent from one testing occasion to another. That is, if the testing process were repeated with a group of test takers, essentially the same results would be obtained. Various kinds of reliability coefficients, with values ranging between 0.00 (much error) and 1.00 (no error), are usually used to indicate the amount of error in the scores."

For example, measurements of people's height and weight are often extremely reliable.

There are several general classes of reliability estimates:
  1. Inter-rater reliability assesses the degree of agreement between two or more raters in their appraisals. For example, a person gets a stomach ache and different doctors all give the same diagnosis.
  2. Test-retest reliability assesses the degree to which test scores are consistent from one test administration to the next. Measurements are gathered from a single rater who uses the same methods or instruments and the same testing conditions. This includes intra-rater reliability.
  3. Inter-method reliability assesses the degree to which test scores are consistent when there is a variation in the methods or instruments used. This allows inter-rater reliability to be ruled out. When dealing with forms, it may be termed parallel-forms reliability.
  4. Internal consistency reliability assesses the consistency of results across items within a test.

Reliability does not imply validity. That is, a reliable measure that is measuring something consistently is not necessarily measuring what you want to measure. For example, while there are many reliable tests of specific abilities, not all of them would be valid for predicting, say, job performance.
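
Two of the reliability classes described above can be illustrated with a minimal Python sketch. It assumes that test-retest reliability is estimated with the Pearson correlation between two administrations and that inter-rater agreement between two raters is estimated with Cohen's kappa; these are common choices, not the only ones. The function names and all scores and diagnoses below are hypothetical and serve only as an illustration.

```python
from collections import Counter
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equally long lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two score lists of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def cohen_kappa(rater_a, rater_b):
    """Chance-corrected agreement between two raters on categorical labels."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(rater_a)
    # Proportion of cases on which both raters gave the same label.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's label frequencies.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1 - expected)


# Hypothetical data: the same five people tested on two occasions.
first_administration = [12, 15, 11, 18, 14]
second_administration = [13, 15, 10, 19, 14]
test_retest = pearson_r(first_administration, second_administration)

# Hypothetical data: two doctors diagnosing the same six patients.
doctor_a = ["flu", "flu", "cold", "cold", "flu", "cold"]
doctor_b = ["flu", "flu", "cold", "flu", "flu", "cold"]
inter_rater = cohen_kappa(doctor_a, doctor_b)

print(f"test-retest r = {test_retest:.3f}")
print(f"Cohen's kappa = {inter_rater:.3f}")
```

Neither coefficient says anything about validity: both only describe how consistently the scores or labels are produced, not whether they measure the intended quantity.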

Related courses (2)
CS-487: Industrial automation
This course consists of two parts:
  1. architecture of automation systems, hands-on lab
  2. dependable systems and handling of faults and failures in real-time systems, including fault-tolerant computin
MICRO-448: Manufacturing systems and supply chain dynamics
This course discusses quantitatively some important and generic performance and reliability issues that affect the behavior of manufacturing systems and supply chains.
Related lectures (19)
Economy of China: Made in China
Explores the Chinese economy, trade war impact, global value chains, and trade networks.
Scientific Citations: Importance and Responsibility
By Sandrine Hinrichs. Emphasizes the importance of citing scientific sources accurately and responsibly.
Volume Calculation in Decoding Regions
Covers volume calculation in decoding regions for signal transmission trade-offs.
Related publications (68)

Encapsulation strategies for mechanical impact and damp heat reliability improvement of lightweight photovoltaic modules towards vehicle-integrated applications

Fabiana Lisco

Lightweight modules are essential for next-generation vehicle-integrated photovoltaic (VIPV) applications, such as solar-powered cars, allowing integration of solar cells beyond the roof, and on the hood, boot and body panels, and thereby extending the dri ...
Elsevier, 2024

Comparison of questionnaire items for discomfort glare studies in daylit spaces

Marilyne Andersen, Jan Wienold, Caroline Karmann, Sneha Jain, Geraldine Cai Ting Quek, Clotilde Marie A Pierson

When studying discomfort glare, researchers tend to rely on a single questionnaire item to obtain user evaluations. It is unclear whether the choice of questionnaire item affects the distribution of user responses and leads to inconsistencies between studi ...
2023

A Discussion on the Reliability of prEN1992-1-1:2021 Shear Strength Provisions for Fibre Reinforced Concrete Members Without Shear Reinforcement

Miguel Fernández Ruiz

The Eurocode 2 for the design of concrete structures (EN1992-1-1:2004) is undergoing a revision that will lead to the publication of the second generation of this code to be used across all CEN member countries. Therefore, the impact of the code will reach ...
Springer International Publishing AG, 2023
Related concepts (11)
Cronbach's alpha
Cronbach's alpha (Cronbach's α), also known as tau-equivalent reliability (ρ_T) or coefficient alpha (coefficient α), is a reliability coefficient and a measure of the internal consistency of tests and measures. Numerous studies warn against using it unconditionally. Reliability coefficients based on structural equation modeling (SEM) or generalizability theory are superior alternatives in many situations. Lee Cronbach first named the coefficient alpha in a 1951 publication.
Inter-rater reliability
In statistics, inter-rater reliability (also called by various similar names, such as inter-rater agreement, inter-rater concordance, inter-observer reliability, inter-coder reliability, and so on) is the degree of agreement among independent observers who rate, code, or assess the same phenomenon. Assessment tools that rely on ratings must exhibit good inter-rater reliability; otherwise, they are not valid tests. There are a number of statistics that can be used to determine inter-rater reliability.
Level of measurement
Level of measurement or scale of measure is a classification that describes the nature of information within the values assigned to variables. Psychologist Stanley Smith Stevens developed the best-known classification with four levels, or scales, of measurement: nominal, ordinal, interval, and ratio. This framework of distinguishing levels of measurement originated in psychology and has since had a complex history, being adopted and extended in some disciplines and by some scholars, and criticized or rejected by others.
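
As a rough illustration of the Cronbach's alpha entry above, the coefficient for a test with k items is commonly written as α = k/(k − 1) · (1 − Σ item variances / variance of total scores). The Python sketch below assumes this standard formula with sample variances; the function name and the item scores are hypothetical.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha from one list of scores per item.

    item_scores[i][j] is the score of respondent j on item i.
    """
    k = len(item_scores)
    if k < 2:
        raise ValueError("need at least two items")
    n = len(item_scores[0])
    if n < 2 or any(len(item) != n for item in item_scores):
        raise ValueError("each item needs the same number (>= 2) of scores")
    # Total test score of each respondent across all items.
    totals = [sum(respondent) for respondent in zip(*item_scores)]
    item_variance_sum = sum(variance(item) for item in item_scores)
    # Raises ZeroDivisionError if every respondent has the same total score.
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


# Hypothetical data: three items answered by four respondents.
items = [
    [1, 2, 3, 4],
    [2, 3, 4, 5],
    [1, 3, 3, 5],
]
alpha = cronbach_alpha(items)
print(f"Cronbach's alpha = {alpha:.3f}")
```

A high value here only indicates that the items vary together; as the entry notes, the coefficient should not be used unconditionally.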