Concept: Statistical inference

Summary

Statistical inference is the process of using data analysis to infer properties of an underlying probability distribution. Inferential statistical analysis infers properties of a population, for example by testing hypotheses and deriving estimates. It is assumed that the observed data set is sampled from a larger population.
Inferential statistics can be contrasted with descriptive statistics. Descriptive statistics is solely concerned with properties of the observed data, and it does not rest on the assumption that the data come from a larger population. In machine learning, the term inference is sometimes used instead to mean "make a prediction, by evaluating an already trained model"; in this context, inferring properties of the model is called training or learning rather than inference, and using a model for prediction is called inference rather than prediction; see also predictive inference.
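The contrast between descriptive and inferential statistics can be illustrated with a small sketch. It is not taken from the source: the population, the sample size, and the helper name `mean_confidence_interval` are hypothetical, and the interval relies on a normal approximation that is only reasonable for moderately large samples.

```python
import math
import random
import statistics


def mean_confidence_interval(sample, confidence=0.95):
    """Return (estimate, lower, upper) for the population mean.

    Assumption: normal approximation to the sampling distribution
    of the mean (hypothetical helper, not from the source page).
    """
    n = len(sample)
    estimate = statistics.fmean(sample)
    std_error = statistics.stdev(sample) / math.sqrt(n)
    # Two-sided critical value of the standard normal distribution.
    z = statistics.NormalDist().inv_cdf(0.5 + confidence / 2)
    return estimate, estimate - z * std_error, estimate + z * std_error


# Hypothetical population; in practice it is never fully observed.
rng = random.Random(0)
population = [rng.gauss(50, 10) for _ in range(10_000)]
sample = rng.sample(population, 200)

# Descriptive statistics: a statement about the observed sample only.
print("sample mean:", statistics.fmean(sample))

# Inferential statistics: a statement about the unobserved population.
estimate, lower, upper = mean_confidence_interval(sample)
print(f"estimated population mean: {estimate:.2f} "
      f"(95% interval {lower:.2f} to {upper:.2f})")
```

The sample mean appears in both statements; only the second attaches an uncertainty claim about the larger population.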
Introduction
Statistical inference makes propositions ab

Related people (37)

Related concepts (110)

Statistics

Statistics (from German: Statistik, "description of a state, a country") is the discipline that concerns the collection, organization, analysis, interpretation, and present

Bayesian inference

Bayesian inference (ˈbeɪziən or ˈbeɪʒən ) is a method of statistical inference in which Bayes' theorem is used to update the probability for a hypothesis as more evidence or information becomes avail
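As a minimal sketch of updating a probability as evidence arrives, the example below uses a Beta prior with binomial data, a standard conjugate pair chosen here for illustration. The function names and the batches of observations are hypothetical and do not come from the source page.

```python
def update_beta(alpha, beta, successes, failures):
    """Posterior Beta parameters after observing binomial data.

    With a Beta(alpha, beta) prior on a success probability, Bayes'
    theorem gives a Beta(alpha + successes, beta + failures) posterior.
    """
    return alpha + successes, beta + failures


def beta_mean(alpha, beta):
    """Mean of a Beta(alpha, beta) distribution."""
    return alpha / (alpha + beta)


# Uniform prior over the unknown success probability (an assumption).
posterior = (1.0, 1.0)

# Hypothetical batches of evidence: (successes, failures).
for successes, failures in [(3, 1), (6, 4), (20, 10)]:
    posterior = update_beta(*posterior, successes, failures)
    print(posterior, round(beta_mean(*posterior), 3))
```

Each batch turns the previous posterior into the next prior, which is the sense in which the estimate is updated as more evidence becomes available.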

Statistical hypothesis testing

A statistical hypothesis test is a method of statistical inference used to decide whether the data at hand sufficiently support a particular hypothesis.
Hypothesis testing allows us to make probabil
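One way to decide whether data support a hypothesis is a two-sample permutation test, sketched below under the null hypothesis that both groups come from the same distribution. This particular test, the function name `permutation_test`, and the data are illustrative assumptions, not content from the source page.

```python
import random
import statistics


def permutation_test(a, b, n_permutations=2_000, seed=0):
    """Approximate two-sided p-value for a difference in means.

    Group labels are shuffled repeatedly; the p-value is the share of
    shuffles whose mean difference is at least as large as the observed
    one (with the usual +1 correction so it is never exactly zero).
    """
    rng = random.Random(seed)
    observed = abs(statistics.fmean(a) - statistics.fmean(b))
    pooled = list(a) + list(b)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        left, right = pooled[:len(a)], pooled[len(a):]
        if abs(statistics.fmean(left) - statistics.fmean(right)) >= observed:
            extreme += 1
    return (extreme + 1) / (n_permutations + 1)


# Hypothetical measurements from two groups.
group_a = [5.1, 4.9, 5.6, 5.8, 6.0, 5.3, 5.7, 5.5]
group_b = [4.2, 4.8, 4.4, 4.9, 4.1, 4.6, 4.3, 4.7]
print("approximate p-value:", permutation_test(group_a, group_b))
```

A small p-value indicates that a difference this large would be unusual if the group labels were irrelevant; the threshold for "small" remains a choice made by the analyst.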

Related courses (96)

MATH-442: Statistical theory

The course aims at developing certain key aspects of the theory of statistics, providing a common general framework for statistical methodology. While the main emphasis will be on the mathematical aspects of statistics, an effort will be made to balance rigor and intuition.

PHYS-467: Machine learning for physicists

Machine learning and data analysis are becoming increasingly central in sciences including physics. In this course, fundamental principles and methods of machine learning will be introduced and practised.

MSE-421: Statistical mechanics

This course presents an introduction to statistical mechanics geared towards materials scientists. The concepts of macroscopic thermodynamics will be related to a microscopic picture and a statistical interpretation. Lectures and exercises will be complemented with hands-on simulation projects.

Related publications (100)

Related units (28)

Related lectures (191)

Claire Marianne Charlotte Capelo

The explosive growth of machine learning in the age of data has led to a new probabilistic and data-driven approach to solving very different types of problems. In this paper we study the feasibility of using such data-driven algorithms to solve classic physical and mathematical problems. In particular, we try to model the solution of an inverse continuum mechanics problem in the context of linear elasticity using deep neural networks. To better address the inverse function, we start by studying the simplest related task, consisting of a building block of the actual composite problem. By empirically demonstrating the learnability of simpler functions, we aim to draw conclusions with respect to the initial problem. The basic inverse problem that motivates this paper is that of a 2D plate with an inclusion under specific loading and boundary conditions. From measurements at static equilibrium, we wish to recover the position of the hole. Although some analytical solutions have been formulated for 3D-infinite solids (most notably Eshelby’s inclusion problems), finite problems with particular geometries, material inhomogeneities, loading and boundary conditions require the use of numerical methods, which are most often efficient solutions to the forward problem, the mapping from the parameter space to the measurement/signal space, i.e. in our case computing displacements and stresses knowing the size and position of the inclusion. Using numerical data generated from the well-defined forward problem, we train a neural network to approximate the inverse function relating displacements and stresses to the position of the inclusion.
The preliminary results on the 2D-finite problem are promising, but the black-box nature of neural networks is a major obstacle when it comes to understanding the solution. For this reason, we study a 3D-infinite continuous isotropic medium with a single concentrated load, for which the Green’s function gives an analytical expression relating the relative position of the point force to the displacements in the solid. After deriving the expression of the inverse, namely recovering the relative position of the force from the Green’s matrix computed at a given point in the medium, we are able to study the sensitivity of the inverse function. From both the expression of the Green’s function and its inverse, we highlight what issues might arise when training neural networks to solve the inverse problem. As the Green’s function is not bijective, bijectivity must be enforced when training for regression. Moreover, because displacements grow to infinity as we approach the singularity at zero, the training domain must be constrained to be some distance away from the singularity. As we train a neural network to fit the inverse of the Green’s function, we show that the input parameters should include as little redundant information as possible to ensure the most efficient training. We then extend our analysis to two point forces. As more loads are added, bijectivity is harder to enforce, since permutations of forces must be taken into account and more collisions may arise, i.e. multiple specific combinations of forces might yield the same measurements. One obvious solution is to increase the number of nodes where displacements are measured to limit the possibility of collision. Through new experiments, we show again that the best training is achieved with the smallest possible number of nodes, as long as the training data generated is indeed bijective. As the medium is elastic, we propose a neural network architecture that matches the composite nature of the inverse problem.
We also present another formulation of the problem which is invariant to permutations of the forces, namely multilabel classification, and which yields good performance in the two-load case. Finally, we study the composite inverse function for 2, 3, 4 and 5 forces. By comparing training and accuracy for different neural network architectures, we identify the model that is easiest to train. Moreover, the evolution of the final accuracy as more loads are added indicates that deep neural networks (DNNs) are not well suited to fit a non-linear mapping from and to a high-dimensional space. The results are more convincing for multilabel classification.

2020

Verifying real-time systems goes beyond the verification of functional properties: it also requires the checking of real-time properties. This makes traditional contract frameworks partially inept for checking real-time programs. This is a major problem because the failure of real-time and safety-critical systems can have serious consequences. This thesis presents a solution to this problem by incorporating Design by Contract (annotating programs with function pre- and postconditions) into such systems. The main contribution of this thesis is the development of a contract framework for cyclic real-time control applications written in C++. The contract framework allows users to specify both functional and temporal properties for the applications. A novel approach based on statistical inference from the empirical cumulative distribution function (cdf) is used for dynamically estimating temporal constraints and incorporating them in future contracts. The thesis also illustrates the use of Real-time Logic (RTL) for formal specification of the temporal properties. To evaluate our methodology, we have integrated it into a component-based framework called FASA (Future Automation System Architecture), developed at ABB Corporate Research for writing hard real-time control applications. Experiments show that this contract framework can be smoothly integrated into existing control applications, thereby increasing their reliability while having an acceptable overhead (less than 10%) on performance.
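The abstract above mentions estimating temporal constraints from an empirical cumulative distribution function. The sketch below shows only the general idea and is not the thesis's framework: it is written in Python rather than C++, and the helper names, the cycle times, and the 0.99 level are all hypothetical.

```python
import bisect
import math


def ecdf(samples):
    """Return the empirical CDF of the samples as a function."""
    ordered = sorted(samples)
    n = len(ordered)

    def cdf(x):
        # Fraction of observations less than or equal to x.
        return bisect.bisect_right(ordered, x) / n

    return cdf


def empirical_quantile(samples, q):
    """Smallest observed value whose empirical CDF is at least q."""
    ordered = sorted(samples)
    index = max(0, math.ceil(q * len(ordered)) - 1)
    return ordered[index]


# Hypothetical cycle execution times in microseconds.
cycle_times = [812, 798, 805, 830, 801, 815, 799, 822, 808, 1040]

# Candidate timing bound: a value that 99% of observed cycles stayed under.
bound = empirical_quantile(cycle_times, 0.99)
print("candidate bound:", bound)
print("share of cycles within 820 us:", ecdf(cycle_times)(820))
```

With so few observations the upper quantile is simply the largest sample, so a bound like this would need many more measurements before being used in a contract.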

2014

Modeling the immune system (IS) means putting together a set of assumptions about its components (cells and organs) and their interactions. Simulations of a model show the joint behavior of the components, which for complex realistic models is often impossible to find analytically. Simulations allow us to experiment on how initial concentrations and properties of the immune cells and viruses impact the IS behavior, and to gain better quantitative and qualitative insight into how the IS works and why different behavior patterns occur. A simulation, once it has been created, must be reviewed both statistically and analytically, as well as validated from the biological point of view. We analyzed Chao’s immune system simulation [1][2] from a statistical and analytical point of view. We made explicit both the Markov chain which was simulated and the underlying process on which Chao’s stage-structured approach was built. Furthermore, we established a test protocol for timestep validation, which Chao’s simulator passed. We evaluated the dependence of Chao’s simulator on the random number generator, which was shown to be negligible. Finally, we evaluated the simulator output, and our major result is the discovery of a secondary response to a primary infection, an occurrence not shown in Chao’s dissertation. A tertiary response to the infection is never possible due to the size of the secondary response caused by memory cells.

2005