**Are you an EPFL student looking for a semester project?**

Work with us on data science and visualisation projects, and deploy your project as an app on top of Graph Search.

Publication# Testing Alternative Ground Water Models Using Cross-Validation and Other Methods

Abstract

Many methods can be used to test alternative ground water models. Of concern in this work are methods able to (1) rank alternative models (also called model discrimination) and (2) identify observations important to parameter estimates and predictions (equivalent to the purpose served by some types of sensitivity analysis). Some of the measures investigated are computationally efficient; others are computationally demanding. The latter are generally needed to account for model nonlinearity. The efficient model discrimination methods investigated include the information criteria: the corrected Akaike information criterion, Bayesian information criterion, and generalized cross-validation. The efficient sensitivity analysis measures used are dimensionless scaled sensitivity (DSS), composite scaled sensitivity, and parameter correlation coefficient (PCC); the other statistics are DFBETAS, Cook's D, and observation-prediction statistic. Acronyms are explained in the introduction. Cross-validation (CV) is a computationally intensive nonlinear method that is used for both model discrimination and sensitivity analysis. The methods are tested using up to five alternative parsimoniously constructed models of the ground water system of the Maggia Valley in southern Switzerland. The alternative models differ in their representation of hydraulic conductivity. A new method for graphically representing CV and sensitivity analysis results for complex models is presented and used to evaluate the utility of the efficient statistics. The results indicate that for model selection, the information criteria produce similar results at much smaller computational cost than CV. For identifying important observations, the only obviously inferior linear measure is DSS; the poor performance was expected because DSS does not include the effects of parameter correlation and PCC reveals large parameter correlations.

Official source

This page is automatically generated and may contain information that is not correct, complete, up-to-date, or relevant to your search query. The same applies to every other page on this website. Please make sure to verify the information with EPFL's official sources.

Related MOOCs (14)

Related publications (66)

Related concepts (35)

Ontological neighbourhood

Discrete choice models are used extensively in many disciplines where it is important to predict human behavior at a disaggregate level. This course is a follow up of the online course “Introduction t

Discrete choice models are used extensively in many disciplines where it is important to predict human behavior at a disaggregate level. This course is a follow up of the online course “Introduction t

The activity of neurons in the brain and the code used by these neurons is described by mathematical neuron models at different levels of detail.

Pearson correlation coefficient

In statistics, the Pearson correlation coefficient (PCC) is a correlation coefficient that measures linear correlation between two sets of data. It is the ratio between the covariance of two variables and the product of their standard deviations; thus, it is essentially a normalized measurement of the covariance, such that the result always has a value between −1 and 1. As with covariance itself, the measure can only reflect a linear correlation of variables, and ignores many other types of relationships or correlations.

Parameter

A parameter (), generally, is any characteristic that can help in defining or classifying a particular system (meaning an event, project, object, situation, etc.). That is, a parameter is an element of a system that is useful, or critical, when identifying the system, or when evaluating its performance, status, condition, etc. Parameter has more specific meanings within various disciplines, including mathematics, computer programming, engineering, statistics, logic, linguistics, and electronic musical composition.

Model selection

Model selection is the task of selecting a model from among various candidates on the basis of performance criterion to choose the best one. In the context of learning, this may be the selection of a statistical model from a set of candidate models, given data. In the simplest cases, a pre-existing set of data is considered. However, the task can also involve the design of experiments such that the data collected is well-suited to the problem of model selection.

This thesis focuses on two kinds of statistical inference problems in signal processing and data science. The first problem is the estimation of a structured informative tensor from the observation of a noisy tensor in which it is buried. The structure com ...

We study the problem of learning unknown parameters of stochastic dynamical models from data. Often, these models are high dimensional and contain several scales and complex structures. One is then interested in obtaining a reduced, coarse-grained descript ...

Jiancheng Yang, Yi Ding, Zhen Gao

ObjectiveTo construct a new pulmonary nodule diagnostic model with high diagnostic efficiency, non-invasive and simple to measure. MethodsThis study included 424 patients with radioactive pulmonary nodules who underwent preoperative 7-autoantibody (7-AAB) ...