Concept: Exploratory factor analysis

Summary

In multivariate statistics, exploratory factor analysis (EFA) is a statistical method used to uncover the underlying structure of a relatively large set of variables. EFA is a technique within factor analysis whose overarching goal is to identify the underlying relationships between measured variables. Researchers commonly use it when developing a scale (a collection of questions used to measure a particular research topic), where it serves to identify a set of latent constructs underlying a battery of measured variables. It should be used when the researcher has no a priori hypothesis about factors or patterns of measured variables. Measured variables are any of several attributes of people that may be observed and measured, for example the physical height, weight, and pulse rate of a human being. Researchers usually have a large number of measured variables, which are assumed to be related to a smaller number of "unobserved" factors.
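As a rough illustration of the idea in the summary, the sketch below simulates five measured variables, four of which depend on a single unobserved factor, and then recovers loadings from the correlation matrix. It uses the principal-component extraction method with power iteration, which is only one simple way of extracting a factor and not necessarily what a full EFA package would do. The true loadings, sample size, seed, and function names are all assumptions made for this example.

```python
import math
import random


def simulate_scores(n, true_loadings, seed=0):
    """Simulate n rows: one latent factor drives the first variables,
    and one extra variable is pure noise (hypothetical data)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        factor = rng.gauss(0.0, 1.0)  # the unobserved factor
        row = [
            lam * factor + math.sqrt(1.0 - lam * lam) * rng.gauss(0.0, 1.0)
            for lam in true_loadings
        ]
        row.append(rng.gauss(0.0, 1.0))  # variable unrelated to the factor
        rows.append(row)
    return rows


def correlation_matrix(rows):
    """Sample correlation matrix of the columns of rows."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    sds = [
        math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / (n - 1))
        for j in range(p)
    ]
    return [
        [
            sum((r[i] - means[i]) * (r[j] - means[j]) for r in rows)
            / ((n - 1) * sds[i] * sds[j])
            for j in range(p)
        ]
        for i in range(p)
    ]


def first_factor_loadings(corr, iterations=200):
    """Loadings of the first factor by principal-component extraction:
    dominant eigenvector of corr (power iteration) times sqrt(eigenvalue)."""
    p = len(corr)
    v = [1.0] * p
    for _ in range(iterations):
        w = [sum(corr[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigenvalue = sum(
        v[i] * sum(corr[i][j] * v[j] for j in range(p)) for i in range(p)
    )
    if sum(v) < 0:  # fix the arbitrary sign of the eigenvector
        v = [-x for x in v]
    return [x * math.sqrt(eigenvalue) for x in v], eigenvalue


rows = simulate_scores(2000, [0.9, 0.8, 0.7, 0.6], seed=0)
loadings, eigenvalue = first_factor_loadings(correlation_matrix(rows))
print([round(x, 2) for x in loadings], round(eigenvalue, 2))
```

In this setup the first four variables should load strongly on the extracted factor while the fifth should load near zero. Principal-component loadings tend to be somewhat larger than the loadings used to generate the data, because this method does not separate each variable's unique variance from the shared variance.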

Related publications (6)

Related people (1)

Related courses (3)

ENV-444: Exploratory data analysis in environmental health

This course teaches how to apply exploratory spatial data analysis to health data. Teaching focuses on the basics of spatial statistics and of epidemiology, and proposes a context to analyse geodatasets making it possible to study the relationship between health and the environment.

COM-500: Statistical signal and data processing through applications

Building up on the basic concepts of sampling, filtering and Fourier transforms, we address stochastic modeling, spectral analysis, estimation and prediction, classification, and adaptive filtering, with an application oriented approach and hands-on numerical exercises.

MATH-517: Statistical computation and visualisation

The course will provide the opportunity to tackle real world problems requiring advanced computational skills and visualisation techniques to complement statistical thinking. Students will practice proposing efficient solutions, and effectively communicating the results with stakeholders.

Related concepts (1)

Factor analysis

Factor analysis is a statistical method used to describe variability among observed, correlated variables in terms of a potentially lower number of unobserved variables called factors. For example, it

This thesis is a contribution to financial statistics. One of the principal concerns of investors is the evaluation of portfolio risk. The notion of risk is vague, but in finance it is always linked to possible losses. In this thesis, we present some measures allowing the valuation of risk with the help of Bayesian methods. An exploratory analysis of data is presented to describe the sampling properties of financial time series. This analysis allows us to understand the origins of the daily returns studied in this thesis. Moreover, a discussion of different models is presented. These models make strong assumptions on investor behaviour, which are not always satisfied. This exploratory analysis shows some differences between the behaviour anticipated under equilibrium models, and that of real data. The Bayesian approach has been chosen because it allows one to incorporate all the variability, in particular that associated with model choice. The models studied in this thesis allow one to take heteroskedasticity into account, as well as particular shapes of the tails of returns. ARCH type models and models based on extreme value theory are studied. One original aspect of this thesis is its use of Bayesian analysis to detect change points in financial time series. We suppose that a market has two phases, and that it switches from one state to the other at random. Another new contribution is a model integrating heteroskedasticity and time dependence of extreme values, by superposition of the model proposed by Bortot and Coles (2003) and a GARCH process. This thesis uses simulation intensively for the estimation of risk measures. The drawback of simulation is the amount of time needed to obtain accurate estimates. However, simulation allows one to produce results when direct calculation is not feasible. For example, simulation allows one to compute risk estimates for time horizons greater than one day.
The methods presented in this thesis are illustrated on simulated data, and on real data from European and American markets. This thesis involved the construction of a library containing C and S code to perform risk analysis using GARCH and extreme value theory models. The results show that model uncertainty can be incorporated, and that risk measures for time horizons greater than one day can be obtained by simulation. The methods presented in this thesis have a natural representation involving conditioning. Thus, they permit the computation of both conditional and unconditional risk estimates. Three methods are described: the GARCH method; the two-state GARCH method; and the HBC method. Unconditional risk estimation using the GARCH method is satisfactory on data which seem stationary, but not reliable on data which are non-stationary, such as data with change points. The two-state GARCH model does a little better, but gives very satisfactory results when the risk is estimated conditionally on time. The HBC method does not give satisfactory results.
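To make the point about simulation and multi-day horizons concrete, here is a minimal sketch that estimates a value-at-risk figure by simulating paths of a GARCH(1,1) process with Gaussian innovations. It is not the thesis's method: the thesis uses Bayesian estimation and its own C and S library, whereas this sketch uses plain Monte Carlo with fixed, hypothetical parameter values (`omega`, `alpha`, `beta`) and starts every path at the unconditional variance.

```python
import math
import random


def simulate_cumulative_return(omega, alpha, beta, horizon, sigma2_0, rng):
    """Sum of daily returns over `horizon` days of a GARCH(1,1) process:
    r_t = sigma_t * z_t, sigma2_{t+1} = omega + alpha * r_t**2 + beta * sigma2_t."""
    sigma2 = sigma2_0
    total = 0.0
    for _ in range(horizon):
        r = math.sqrt(sigma2) * rng.gauss(0.0, 1.0)
        total += r
        sigma2 = omega + alpha * r * r + beta * sigma2
    return total


def simulated_var(omega, alpha, beta, horizon, level=0.99, n_paths=20000, seed=0):
    """Monte Carlo value-at-risk: the loss exceeded with probability 1 - level,
    read off the empirical distribution of simulated cumulative returns."""
    rng = random.Random(seed)
    sigma2_0 = omega / (1.0 - alpha - beta)  # unconditional variance (needs alpha + beta < 1)
    totals = sorted(
        simulate_cumulative_return(omega, alpha, beta, horizon, sigma2_0, rng)
        for _ in range(n_paths)
    )
    index = int((1.0 - level) * n_paths)  # position of the lower-tail quantile
    return -totals[index]


# Hypothetical parameter values, chosen only for illustration.
params = dict(omega=1e-6, alpha=0.08, beta=0.90)
var_1_day = simulated_var(horizon=1, **params)
var_10_day = simulated_var(horizon=10, **params)
print(var_1_day, var_10_day)
```

For a one-day horizon the return is Gaussian given the starting variance, so the quantile could be computed directly. For longer horizons the distribution of the cumulative return has no simple closed form, which is the situation where simulation becomes useful.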

Related units (1)

Emeric Rolland Georges Thibaud

Statistical methods for inference on spatial extremes of large datasets are yet to be developed. Motivated by standard dimension reduction techniques used in spatial statistics, we propose an approach based on empirical basis functions to explore and model spatial extremal dependence. Based on a low-rank max-stable model, we propose a data-driven approach to estimate meaningful basis functions using empirical pairwise extremal coefficients. These spatial empirical basis functions can be used to visualize the main trends in extremal dependence. In addition to exploratory analysis, we describe how these functions can be used in a Bayesian hierarchical model to model spatial extremes of large datasets. We illustrate our methods on extreme precipitations in eastern USA. Supplementary materials accompanying this paper appear online

John Wilder Tukey, Donner Professor of Science Emeritus at Princeton University, was born in New Bedford, Massachusetts, on June 16, 1915. After earning bachelor's and master's degrees in chemistry at Brown University in 1936 and 1937, respectively, he started his career at Princeton University with a Ph.D. in mathematics in 1939 followed by an immediate appointment as Henry B. Fine Instructor in Mathematics. A decade later, at age 35, he was advanced to a full professorship. He directed the Statistical Research Group at Princeton University from its founding in 1956; when the Department of Statistics was formed in 1965, he was named its first chairman and held that post until 1970. He was appointed to the Donner Chair in 1976 and remained at Princeton until reaching emeritus status in 1985. At the same time, he was a Member of Technical Staff at AT&T Bell Laboratories since 1945, advancing to Assistant Director of Research, Communications Principles, in 1958 and, in 1961, to Associate Executive Director, Research Information Sciences, a position he held until retirement in 1985. Throughout World War II he participated in projects assigned to the Princeton Branch of the Frankford Arsenal Fire Control Design Division. This wartime service marked the beginning of his close and continuing association with governmental committees and agencies. Among other activities he was a member of the U.S. Delegation to the Conference on the Discontinuance of Nuclear Weapons Tests in Geneva in 1959, served on the President's Science Advisory Committee from 1960 to 1964 and was a member of President Johnson's Task Force on Environmental Pollution and President Nixon's Task Force on Air Pollution. The long list of awards and honors that Tukey has received includes the S. S. Wilks Medal from the American Statistical Association (ASA) (1965), the National Medal of Science (1973), the Medal of Honor from the IEEE (1982), the Deming Medal from the American Society of Quality Control (1983) and the Educational Testing Service Award (1990). He holds honorary degrees from Case Institute of Technology, the University of Chicago and Brown, Temple, Yale and Waterloo Universities; in June 1998, he was awarded an honorary degree from Princeton University. He has led the way to the fields of exploratory data analysis (EDA) and robust estimation. His contributions to the spectral analysis of time series and other aspects of digital signal processes have been widely used in engineering and science. His collaboration with a fellow mathematician resulted in the discovery of the fast Fourier transform (FFT) algorithm. Author of Exploratory Data Analysis and eight volumes of collected papers, he has contributed to a wide variety of areas and has coauthored several books. He has guided more than 50 graduate students to successful Ph.D.'s and inspired their careers. A detailed list of his students as well as a complete curriculum vitae can be found in The Practice of Data Analysis (1997), edited by D. Brillinger, L. Fernholz, and S. Morgenthaler, Princeton University Press. John W. Tukey married Elizabeth Louise Rapp in 1950. Before their marriage, she was Personnel Director of the Educational Testing Service in Princeton, New Jersey.

2000

Related lectures (16)