Minimum Description Length (MDL) is a model selection principle where the shortest description of the data is the best model. MDL methods learn through a data compression perspective and are sometimes described as mathematical applications of Occam's razor. The MDL principle can be extended to other forms of inductive inference and learning, for example to estimation and sequential prediction, without explicitly identifying a single model of the data.
MDL has its origins mostly in information theory and has been further developed within the general fields of statistics, theoretical computer science and machine learning, and more narrowly computational learning theory.
Historically, there are different, yet interrelated, usages of the definite noun phrase "the minimum description length principle" that vary in what is meant by description:
Within Jorma Rissanen's theory of learning, a central concept of information theory, models are statistical hypotheses and descriptions are defined as universal codes.
Rissanen's 1978 pragmatic first attempt to automatically derive short descriptions, relates to the Bayesian Information Criterion (BIC).
Within Algorithmic Information Theory, where the description length of a data sequence is the length of the smallest program that outputs that data set. In this context, it is also known as 'idealized' MDL principle and it is closely related to Solomonoff's theory of inductive inference, which is that the best model of a data set is represented by its shortest self-extracting archive.
Selecting the minimum length description of the available data as the best model observes the principle identified as Occam's razor. Prior to the advent of computer programming, generating such descriptions was the intellectual labor of scientific theorists. It was far less formal than it has become in the computer age. If two scientists had a theoretic disagreement, they rarely could formally apply Occam's razor to choose between their theories. They would have different data sets and possibly different descriptive languages.
Cette page est générée automatiquement et peut contenir des informations qui ne sont pas correctes, complètes, à jour ou pertinentes par rapport à votre recherche. Il en va de même pour toutes les autres pages de ce site. Veillez à vérifier les informations auprès des sources officielles de l'EPFL.
Minimum message length (MML) is a Bayesian information-theoretic method for statistical model comparison and selection. It provides a formal information theory restatement of Occam's Razor: even when models are equal in their measure of fit-accuracy to the observed data, the one generating the most concise explanation of data is more likely to be correct (where the explanation consists of the statement of the model, followed by the lossless encoding of the data using the stated model).
Inductive probability attempts to give the probability of future events based on past events. It is the basis for inductive reasoning, and gives the mathematical basis for learning and the perception of patterns. It is a source of knowledge about the world. There are three sources of knowledge: inference, communication, and deduction. Communication relays information found using other methods. Deduction establishes new facts based on existing facts. Inference establishes new facts from data. Its basis is Bayes' theorem.
La théorie algorithmique de l'information, initiée par Kolmogorov, Solomonov et Chaitin dans les années 1960, vise à quantifier et qualifier le contenu en information d'un ensemble de données, en utilisant la théorie de la calculabilité et la notion de machine universelle de Turing. Cette théorie permet également de formaliser la notion de complexité d'un objet, dans la mesure où l'on considère qu'un objet (au sens large) est d'autant plus complexe qu'il faut beaucoup d'informations pour le décrire, ou — à l'inverse — qu'un objet contient d'autant plus d'informations que sa description est longue.
Explore les arbres de décision, de l'induction à l'élagage, en mettant l'accent sur l'interprétabilité et les forces de sélection automatique des fonctionnalités, tout en abordant des défis tels que l'ajustement excessif.
Since the birth of Information Theory, researchers have defined and exploited various information measures, as well as endowed them with operational meanings. Some were born as a "solution to a problem", like Shannon's Entropy and Mutual Information. Other ...
With the increasing complexity of engineered systems, digital twins (DTs) have been widely used to support integrated modeling, simulation, and decision-making of the system of systems (SoS). However, when integrating DTs of each constituent system, it is ...
IOS PRESS2022
, , ,
Byzantine Reliable Broadcast (BRB) is a fundamental distributed computing primitive, with applications ranging from notifications to asynchronous payment systems. Motivated by practical consideration, we study Client-Server Byzantine Reliable Broadcast (CS ...
Schloss Dagstuhl - Leibniz-Zentrum für Informatik2022