Machine learning
Machine learning (ML) is an umbrella term for solving problems for which developing algorithms by human programmers would be cost-prohibitive; instead, such problems are solved by helping machines 'discover' their 'own' algorithms, without being explicitly told what to do by any human-developed algorithm. Recently, generative artificial neural networks have surpassed the results of many previous approaches.
Isoelastic utility
In economics, the isoelastic function for utility, also known as the isoelastic utility function or power utility function, is used to express utility in terms of consumption or some other economic variable that a decision-maker is concerned with. The isoelastic utility function is a special case of hyperbolic absolute risk aversion and at the same time is the only class of utility functions with constant relative risk aversion, which is why it is also called the CRRA utility function.
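For reference, one common parameterization of the isoelastic (CRRA) utility function, with consumption $c > 0$ and risk-aversion parameter $\eta \ge 0$ (a standard form, not unique to any one source), is:

\[
u(c) = \begin{cases} \dfrac{c^{1-\eta} - 1}{1 - \eta} & \eta \neq 1 \\ \ln c & \eta = 1 \end{cases}
\]

A short check of the CRRA property: $u'(c) = c^{-\eta}$ and $u''(c) = -\eta c^{-\eta - 1}$, so the relative risk aversion $-c\,u''(c)/u'(c) = \eta$ is constant in $c$.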
Hyperbolic absolute risk aversion
In finance, economics, and decision theory, hyperbolic absolute risk aversion (HARA) refers to a type of risk aversion that is particularly convenient to model mathematically and to obtain empirical predictions from. It refers specifically to a property of von Neumann–Morgenstern utility functions, which are typically functions of final wealth (or some related variable) and which describe a decision-maker's degree of satisfaction with the outcome for wealth. The final outcome for wealth is affected both by random variables and by decisions.
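As a brief sketch of the defining property (standard notation, not tied to any one text): a utility function $u(W)$ of wealth $W$ is in the HARA class when its Arrow–Pratt absolute risk aversion is hyperbolic in wealth, equivalently when its risk tolerance is linear in wealth:

\[
A(W) = -\frac{u''(W)}{u'(W)} = \frac{1}{aW + b}, \qquad T(W) = \frac{1}{A(W)} = aW + b,
\]

for constants $a$ and $b$. Setting $a = 0$ recovers constant absolute risk aversion (exponential utility), and setting $b = 0$ recovers constant relative risk aversion (isoelastic utility), consistent with the neighboring entries.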
Exponential utility
In economics and finance, exponential utility is a specific form of the utility function, used in some contexts because of its convenience when risk (sometimes referred to as uncertainty) is present, in which case expected utility is maximized. Formally, exponential utility is given by:

\[
u(c) = \begin{cases} \dfrac{1 - e^{-ac}}{a} & a \neq 0 \\ c & a = 0 \end{cases}
\]

where $c$ is a variable that the economic decision-maker prefers more of, such as consumption, and $a$ is a constant that represents the degree of risk preference ($a > 0$ for risk aversion, $a = 0$ for risk-neutrality, or $a < 0$ for risk-seeking).
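A one-line verification of why this family exhibits constant absolute risk aversion (CARA), using the $a \neq 0$ branch above:

\[
u'(c) = e^{-ac}, \qquad u''(c) = -a\,e^{-ac}, \qquad -\frac{u''(c)}{u'(c)} = a,
\]

so the Arrow–Pratt measure of absolute risk aversion equals the constant $a$ at every level of $c$.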
Influence diagram
An influence diagram (ID), also called a relevance diagram, decision diagram, or decision network, is a compact graphical and mathematical representation of a decision situation. It is a generalization of a Bayesian network, in which not only probabilistic inference problems but also decision-making problems (following the maximum expected utility criterion) can be modeled and solved. The ID was first developed in the mid-1970s by decision analysts, with intuitive semantics that are easy to understand.
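To make the maximum expected utility criterion concrete, here is a minimal hand-rolled sketch (not an influence-diagram solver; all names and numbers are hypothetical) evaluating a single decision node against a single chance node:

```python
# Hypothetical one-decision, one-chance-node problem, solved by the
# maximum expected utility (MEU) criterion used in influence diagrams.

p_weather = {"rain": 0.3, "sun": 0.7}    # chance node: P(weather)
decisions = ["umbrella", "no_umbrella"]  # decision node alternatives

# Utility node: utility as a function of the decision and the chance outcome.
utility = {
    ("umbrella", "rain"): 70, ("umbrella", "sun"): 80,
    ("no_umbrella", "rain"): 0, ("no_umbrella", "sun"): 100,
}

def expected_utility(d):
    """Expected utility of decision d, averaging over the chance node."""
    return sum(p * utility[(d, w)] for w, p in p_weather.items())

best = max(decisions, key=expected_utility)  # the MEU decision
print({d: expected_utility(d) for d in decisions}, "->", best)
```

A full influence-diagram solver generalizes this idea to networks of chance, decision, and utility nodes, with information arcs constraining what each decision may condition on.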
Early stopping
In machine learning, early stopping is a form of regularization used to avoid overfitting when training a learner with an iterative method, such as gradient descent. Such methods update the learner so as to make it better fit the training data with each iteration. Up to a point, this improves the learner's performance on data outside of the training set. Past that point, however, improving the learner's fit to the training data comes at the expense of increased generalization error.
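A minimal sketch of the usual patience-based recipe, assuming hypothetical train_one_epoch and val_loss callables and a model object that can be snapshotted:

```python
import copy

def train_with_early_stopping(model, train_one_epoch, val_loss,
                              max_epochs=100, patience=5):
    """Stop training once held-out loss has not improved for `patience` epochs."""
    best_loss = float("inf")
    best_state = None
    epochs_without_improvement = 0
    for _ in range(max_epochs):
        train_one_epoch(model)      # one step of the iterative method
        loss = val_loss(model)      # proxy for generalization error
        if loss < best_loss:
            best_loss = loss
            best_state = copy.deepcopy(model)  # keep the best model so far
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                break               # fit now improves at generalization's expense
    return best_state
```

Returning the snapshot with the lowest validation loss, rather than the final model, is what implements the idea of stopping where generalization error turns upward.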
Loss functions for classification
In machine learning and mathematical optimization, loss functions for classification are computationally feasible loss functions representing the price paid for inaccuracy of predictions in classification problems (problems of identifying which category a particular observation belongs to). Given $\mathcal{X}$ as the space of all possible inputs (usually $\mathcal{X} \subset \mathbb{R}^d$), and $\mathcal{Y} = \{-1, 1\}$ as the set of labels (possible outputs), a typical goal of classification algorithms is to find a function $f : \mathcal{X} \to \mathcal{Y}$ which best predicts a label $y$ for a given input $\vec{x}$.
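As an illustration, here is a small sketch of the zero-one loss alongside two common convex surrogates, each written as a function of the margin $v = y f(\vec{x})$ with $y \in \{-1, 1\}$ (the natural-log form of the logistic loss is one convention among several):

```python
import math

def zero_one_loss(v):
    """The true misclassification cost: 1 if the sign is wrong, else 0."""
    return 0.0 if v > 0 else 1.0

def hinge_loss(v):
    """Convex surrogate used by support vector machines."""
    return max(0.0, 1.0 - v)

def logistic_loss(v):
    """Convex surrogate used by logistic regression (natural log here)."""
    return math.log(1.0 + math.exp(-v))

for v in (-1.0, 0.5, 2.0):
    print(f"margin={v:+.1f}  0-1={zero_one_loss(v):.2f}  "
          f"hinge={hinge_loss(v):.2f}  logistic={logistic_loss(v):.3f}")
```

The surrogates are computationally feasible (convex and subdifferentiable) where the zero-one loss is not, which is the point of the entry above.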
Out-of-bag error
Out-of-bag (OOB) error, also called out-of-bag estimate, is a method of measuring the prediction error of random forests, boosted decision trees, and other machine learning models utilizing bootstrap aggregating (bagging). Bagging uses subsampling with replacement to create training samples for the model to learn from. OOB error is the mean prediction error on each training sample $x_i$, using only the trees that did not have $x_i$ in their bootstrap sample.
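A short sketch using scikit-learn, whose random forest reports an out-of-bag accuracy when oob_score=True (the synthetic dataset is a stand-in):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=500, random_state=0)

# oob_score=True asks the forest to score each sample with only the
# trees whose bootstrap sample excluded it.
clf = RandomForestClassifier(n_estimators=100, oob_score=True, random_state=0)
clf.fit(X, y)

# oob_score_ is out-of-bag accuracy, so the OOB error is its complement.
print("OOB error estimate:", 1.0 - clf.oob_score_)
```

Because every prediction on $x_i$ comes only from trees that never saw $x_i$, the OOB error behaves like a built-in cross-validation estimate.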
XGBoost
XGBoost (eXtreme Gradient Boosting) is an open-source software library which provides a regularizing gradient boosting framework for C++, Java, Python, R, Julia, Perl, and Scala. It works on Linux, Windows, and macOS. From the project description, it aims to provide a "Scalable, Portable and Distributed Gradient Boosting (GBM, GBRT, GBDT) Library". It runs on a single machine, as well as on the distributed processing frameworks Apache Hadoop, Apache Spark, Apache Flink, and Dask.
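A minimal usage sketch of the library's scikit-learn-style Python wrapper (the dataset and hyperparameter values are placeholders):

```python
import xgboost as xgb
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Gradient-boosted trees; reg_lambda is the L2 regularization term on
# leaf weights, part of what "regularizing gradient boosting" refers to.
model = xgb.XGBClassifier(n_estimators=200, max_depth=3,
                          learning_rate=0.1, reg_lambda=1.0)
model.fit(X_train, y_train)
print("test accuracy:", model.score(X_test, y_test))
```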
Decision stump
A decision stump is a machine learning model consisting of a one-level decision tree. That is, it is a decision tree with one internal node (the root) which is immediately connected to the terminal nodes (its leaves). A decision stump makes a prediction based on the value of just a single input feature. They are sometimes also called 1-rules. Depending on the type of the input feature, several variations are possible.
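A hand-rolled sketch for one numeric feature, finding the threshold and orientation that minimize training error (toy data; a library equivalent would be a depth-1 decision tree):

```python
def fit_stump(xs, ys):
    """xs: numeric feature values; ys: labels in {-1, +1}.
    Returns (threshold, sign): predict sign if x > threshold, else -sign."""
    best = None
    for t in sorted(set(xs)):               # candidate split points
        for sign in (+1, -1):               # which side predicts +1
            errors = sum(1 for x, y in zip(xs, ys)
                         if (sign if x > t else -sign) != y)
            if best is None or errors < best[0]:
                best = (errors, t, sign)
    return best[1], best[2]

def predict_stump(threshold, sign, x):
    return sign if x > threshold else -sign

xs = [1.0, 2.0, 3.0, 4.0]
ys = [-1, -1, +1, +1]
t, s = fit_stump(xs, ys)
print("threshold:", t, "sign:", s,
      "predictions:", [predict_stump(t, s, x) for x in xs])
```

Despite their simplicity, stumps are a popular choice of weak learner in boosting ensembles such as AdaBoost.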