Nonparametric regression is a category of regression analysis in which the predictor does not take a predetermined form but is constructed according to information derived from the data. That is, no parametric form is assumed for the relationship between predictors and dependent variable. Nonparametric regression requires larger sample sizes than regression based on parametric models because the data must supply the model structure as well as the model estimates.
In nonparametric regression, we have random variables and and assume the following relationship:
where is some deterministic function. Linear regression is a restricted case of nonparametric regression where is assumed to be affine.
Some authors use a slightly stronger assumption of additive noise:
where the random variable is the `noise term', with mean 0.
Without the assumption that belongs to a specific parametric family of functions it is impossible to get an unbiased estimate for , however most estimators are consistent under suitable conditions.
This is a non-exhaustive list of non-parametric models for regression.
nearest neighbors, see nearest-neighbor interpolation and k-nearest neighbors algorithm
regression trees
kernel regression
local regression
multivariate adaptive regression splines
smoothing splines
neural networks
Gaussian process regression
In Gaussian process regression, also known as Kriging, a Gaussian prior is assumed for the regression curve. The errors are assumed to have a multivariate normal distribution and the regression curve is estimated by its posterior mode. The Gaussian prior may depend on unknown hyperparameters, which are usually estimated via empirical Bayes.
The hyperparameters typically specify a prior covariance kernel. In case the kernel should also be inferred nonparametrically from the data, the critical filter can be used.
Smoothing splines have an interpretation as the posterior mode of a Gaussian process regression.
This page is automatically generated and may contain information that is not correct, complete, up-to-date, or relevant to your search query. The same applies to every other page on this website. Please make sure to verify the information with EPFL's official sources.
Local regression or local polynomial regression, also known as moving regression, is a generalization of the moving average and polynomial regression. Its most common methods, initially developed for scatterplot smoothing, are LOESS (locally estimated scatterplot smoothing) and LOWESS (locally weighted scatterplot smoothing), both pronounced ˈloʊɛs. They are two strongly related non-parametric regression methods that combine multiple regression models in a k-nearest-neighbor-based meta-model.
In statistics, linear regression is a linear approach for modelling the relationship between a scalar response and one or more explanatory variables (also known as dependent and independent variables). The case of one explanatory variable is called simple linear regression; for more than one, the process is called multiple linear regression. This term is distinct from multivariate linear regression, where multiple correlated dependent variables are predicted, rather than a single scalar variable.
In statistics, multivariate adaptive regression splines (MARS) is a form of regression analysis introduced by Jerome H. Friedman in 1991. It is a non-parametric regression technique and can be seen as an extension of linear models that automatically models nonlinearities and interactions between variables. The term "MARS" is trademarked and licensed to Salford Systems. In order to avoid trademark infringements, many open-source implementations of MARS are called "Earth". This section introduces MARS using a few examples.
Regression modelling is a fundamental tool of statistics, because it describes how the law of a random variable of interest may depend on other variables. This course aims to familiarize students with
This course aims to introduce the basic principles of machine learning in the context of the digital humanities. We will cover both supervised and unsupervised learning techniques, and study and imple
We consider the problem of learning a target function corresponding to a deep, extensive-width, non-linear neural network with random Gaussian weights. We consider the asymptotic limit where the number of samples, the input dimension and the network width ...
2023
We present a framework for performing regression when both covariate and response are probability distributions on a compact and convex subset of Rd. Our regression model is based on the theory of optimal transport and links the conditional Fr'echet m ...
We consider the problem of defining and fitting models of autoregressive time series of probability distributions on a compact interval of Double-struck capital R. An order-1 autoregressive model in this context is to be understood as a Markov chain, where ...