Criterion-referenced test
A criterion-referenced test is a style of test that uses test scores to generate a statement about the behavior that can be expected of a person with that score. Most tests and quizzes written by school teachers can be considered criterion-referenced tests. In this case, the objective is simply to see whether the student has learned the material. Criterion-referenced assessment can be contrasted with norm-referenced assessment and ipsative assessment. Criterion-referenced testing was a major focus of psychometric research in the 1970s.
Level of measurement
Level of measurement, or scale of measure, is a classification that describes the nature of information within the values assigned to variables. Psychologist Stanley Smith Stevens developed the best-known classification with four levels, or scales, of measurement: nominal, ordinal, interval, and ratio. This framework for distinguishing levels of measurement originated in psychology and has since had a complex history, being adopted and extended in some disciplines and by some scholars, and criticized or rejected by others.
Criterion validity
In psychometrics, criterion validity, or criterion-related validity, is the extent to which an operationalization of a construct, such as a test, relates to, or predicts, a theoretical representation of the construct (the criterion). Criterion validity is often divided into concurrent and predictive validity based on the timing of measurement for the "predictor" and the outcome. Concurrent validity refers to a comparison between the measure in question and an outcome assessed at the same time.
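Criterion validity is commonly summarized as a correlation between scores on the measure and scores on the criterion. The block below is a minimal sketch of that idea using a hand-written Pearson correlation; the function name `pearson_r` and the sample data are illustrative assumptions, not part of the source.

```python
import math

def pearson_r(test_scores, criterion_scores):
    """Pearson correlation between test scores and a criterion measure.

    Hypothetical helper for illustration: a higher absolute value suggests
    the test relates more strongly to the criterion.
    """
    n = len(test_scores)
    if n != len(criterion_scores) or n < 2:
        raise ValueError("need two equally long lists with at least 2 values")
    mean_x = sum(test_scores) / n
    mean_y = sum(criterion_scores) / n
    # Sums of cross-products and squared deviations from the means
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(test_scores, criterion_scores))
    sxx = sum((x - mean_x) ** 2 for x in test_scores)
    syy = sum((y - mean_y) ** 2 for y in criterion_scores)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant scores")
    return sxy / math.sqrt(sxx * syy)

# Made-up data: test scores and a criterion measured at the same time
# (the concurrent case described above).
r = pearson_r([10, 12, 15, 18], [2.0, 2.5, 3.1, 3.9])
```

Whether the result counts as concurrent or predictive evidence depends only on when the criterion was measured, not on the computation.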
Rating scale
A rating scale is a set of categories designed to elicit information about a quantitative or a qualitative attribute.
Scale (social sciences)
In the social sciences, scaling is the process of measuring or ordering entities with respect to quantitative attributes or traits. For example, a scaling technique might involve estimating individuals' levels of extraversion, or the perceived quality of products. Certain methods of scaling permit estimation of magnitudes on a continuum, while other methods provide only for relative ordering of the entities. The level of measurement is the type of data that is measured.
Summative evaluation
The concepts of summative and formative evaluation were introduced by Michael Scriven in 1967. According to Scriven, a formative evaluation was meant to let a school estimate how well its curricula achieve their objectives, so as to guide the school's choices in improving them progressively. A summative evaluation, by contrast, seeks to deliver a final judgment on the programs: do they "work" or not? And consequently, should they be maintained, extended, or abandoned? For Scriven, all evaluation techniques can be summative, but only some are formative.
Formative evaluation
The concepts of formative and summative evaluation were introduced by Michael Scriven in 1967, in the context of curriculum evaluation. For Scriven, a formative evaluation was meant to let a school estimate how well its curricula achieve their objectives, so as to guide the school's choices in improving them progressively. A summative evaluation, by contrast, seeks to deliver a final judgment on the programs: do they "work" or not? And consequently, should they be maintained, extended, or abandoned? In the following years, Benjamin Bloom took up this distinction and applied it to the learning process, notably in his book Handbook on formative and summative evaluation of student learning.
Test score
A test score is a piece of information, usually a number, that conveys the performance of an examinee on a test. One formal definition is that it is "a summary of the evidence contained in an examinee's responses to the items of a test that are related to the construct or constructs being measured." Test scores are interpreted with a norm-referenced or criterion-referenced interpretation, or occasionally both. A norm-referenced interpretation means that the score conveys meaning about the examinee with regard to their standing among other examinees.
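The two interpretations can be contrasted on the same raw score. The following is a minimal sketch, assuming a simple pass/fail cut score for the criterion-referenced reading and a "percentage of the group scoring strictly lower" definition of percentile rank for the norm-referenced reading; the function names, the cut score, and the sample scores are all hypothetical.

```python
def meets_criterion(score, cut_score):
    """Criterion-referenced reading: compare the score to a fixed standard."""
    return score >= cut_score

def percentile_rank(score, group_scores):
    """Norm-referenced reading: percentage of the group scoring strictly lower."""
    if not group_scores:
        raise ValueError("group_scores must not be empty")
    below = sum(1 for s in group_scores if s < score)
    return 100.0 * below / len(group_scores)

group = [50, 60, 70, 80]              # made-up scores of other examinees
passed = meets_criterion(70, 75)      # depends only on the cut score
standing = percentile_rank(70, group) # depends only on the other examinees
```

The same score of 70 can fail the fixed standard while still sitting in the middle of the group, which is why the two interpretations are reported separately.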
Rasch model
The Rasch model, named after Georg Rasch, is a psychometric model for analyzing categorical data, such as answers to questions on a reading assessment or questionnaire responses, as a function of the trade-off between the respondent's abilities, attitudes, or personality traits, and the item difficulty. For example, it may be used to estimate a student's reading ability or the extremity of a person's attitude to capital punishment from responses on a questionnaire.
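In its standard dichotomous form, the Rasch model expresses the probability of a correct (or endorsing) response as a logistic function of the difference between person ability and item difficulty. The sketch below illustrates only that response function, not the estimation of abilities from data; the name `rasch_probability` and the example values are assumptions made for illustration.

```python
import math

def rasch_probability(ability, difficulty):
    """Probability of a correct response under the dichotomous Rasch model.

    Both arguments are on the same logit scale; only their difference matters.
    """
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))

# When ability equals difficulty, the model gives an even chance of success.
p_matched = rasch_probability(0.0, 0.0)
# A more able person has a higher chance on the same item.
p_stronger = rasch_probability(1.0, 0.0)
```

Because only the difference enters the formula, shifting every ability and every difficulty by the same constant leaves all probabilities unchanged.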
Cronbach's alpha
Cronbach's alpha, sometimes simply called the α coefficient, is a statistic used notably in psychometrics to measure the internal consistency (or reliability) of the questions asked in a test (responses to questions on the same subject should be correlated). Its value is less than or equal to 1 and is generally considered "acceptable" from 0.7 upward. Cronbach's alpha should in all cases be computed after the internal validity of a test has been established; internal validity is therefore a prerequisite for computing reliability.
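The usual formula for the coefficient is α = k/(k − 1) · (1 − Σ item variances / variance of total scores), where k is the number of items. The block below is a minimal sketch of that computation using sample variances from the standard library; the function name `cronbach_alpha` and the response data are illustrative assumptions.

```python
from statistics import variance

def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents' item scores.

    `responses` holds one row per respondent, each row a list of k item scores.
    """
    if len(responses) < 2:
        raise ValueError("need at least 2 respondents")
    k = len(responses[0])
    if k < 2 or any(len(row) != k for row in responses):
        raise ValueError("each respondent needs the same k >= 2 item scores")
    # Variance of each item across respondents (columns of the table)
    item_variance_sum = sum(variance(column) for column in zip(*responses))
    # Variance of each respondent's total score
    total_variance = variance([sum(row) for row in responses])
    if total_variance == 0:
        raise ValueError("alpha is undefined when total scores do not vary")
    return (k / (k - 1)) * (1 - item_variance_sum / total_variance)

# Made-up data: three items that move together perfectly across respondents.
alpha = cronbach_alpha([[1, 1, 1], [2, 2, 2], [3, 3, 3]])
```

With perfectly consistent items, as in this toy data, the coefficient reaches its upper bound of 1; less correlated items pull it down, and it can become negative.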