**Are you an EPFL student looking for a semester project?**

Work with us on data science and visualisation projects, and deploy your project as an app on top of GraphSearch.

Concept# Change detection

Summary

In statistical analysis, change detection or change point detection tries to identify times when the probability distribution of a stochastic process or time series changes. In general the problem concerns both detecting whether or not a change has occurred, or whether several changes might have occurred, and identifying the times of any such changes.
Specific applications, like step detection and edge detection, may be concerned with changes in the mean, variance, correlation, or spectral density of the process. More generally change detection also includes the detection of anomalous behavior: anomaly detection.
A time series measures the progression of one or more quantities over time. For instance, the figure above shows the level of water in the Nile river between 1870 and 1970. Change point detection is concerned with identifying whether, and if so when, the behavior of the series changes significantly. In the Nile river example, the volume of water changes significantly after a dam was built in the river. Importantly, anomalous observations that differ from the ongoing behavior of the time series are not generally considered change points as long as the series returns to its previous behavior afterwards.
Mathematically, we can describe a time series as an ordered sequence of observations . We can write the joint distribution of a subset of the time series as . If the goal is to determine whether a change point occurred at a time in a finite time series of length , then we really ask whether equals . This problem can be generalized to the case of more than one change point.
The problem of change point detection can be narrowed down further into more specific problems. In offline change point detection it is assumed that a sequence of length is available and the goal is to identify whether any change point(s) occurred in the series. This is an example of post hoc analysis and is often approached using hypothesis testing methods. By contrast, online change point detection is concerned with detecting change points in an incoming data stream.

Official source

This page is automatically generated and may contain information that is not correct, complete, up-to-date, or relevant to your search query. The same applies to every other page on this website. Please make sure to verify the information with EPFL's official sources.

Related concepts (3)

Change detection

In statistical analysis, change detection or change point detection tries to identify times when the probability distribution of a stochastic process or time series changes. In general the problem concerns both detecting whether or not a change has occurred, or whether several changes might have occurred, and identifying the times of any such changes. Specific applications, like step detection and edge detection, may be concerned with changes in the mean, variance, correlation, or spectral density of the process.

CUSUM

In statistical quality control, the CUsUM (or cumulative sum control chart) is a sequential analysis technique developed by E. S. Page of the University of Cambridge. It is typically used for monitoring change detection. CUSUM was announced in Biometrika, in 1954, a few years after the publication of Wald's sequential probability ratio test (SPRT). E. S. Page referred to a "quality number" , by which he meant a parameter of the probability distribution; for example, the mean.

Time series

In mathematics, a time series is a series of data points indexed (or listed or graphed) in time order. Most commonly, a time series is a sequence taken at successive equally spaced points in time. Thus it is a sequence of discrete-time data. Examples of time series are heights of ocean tides, counts of sunspots, and the daily closing value of the Dow Jones Industrial Average. A time series is very frequently plotted via a run chart (which is a temporal line chart).

Related publications (7)

Related people (6)

Related units (2)

Related lectures (2)

Topology in Complex Networks: Insights from Topological Data Analysis

Explores the role of higher-order topological properties in complex networks using topological data analysis for structural break and price anomaly detection.

Quantum Physics IPHYS-313: Quantum physics I

Covers the fundamentals of quantum physics, focusing on Hamiltonians, Schrödinger's equation, and magnetic resonance.

Frank Grégoire Jean de Morsier

The main challenge of new information technologies is to retrieve intelligible information from the large volume of digital data gathered every day. Among the variety of existing data sources, the satellites continuously observing the surface of the Earth are key to the monitoring of our environment. The new generation of satellite sensors are tremendously increasing the possibilities of applications but also increasing the need for efficient processing methodologies in order to extract information relevant to the users' needs in an automatic or semi-automatic way. This is where machine learning comes into play to transform complex data into simplified products such as maps of land-cover changes or classes by learning from data examples annotated by experts. These annotations, also called labels, may actually be difficult or costly to obtain since they are established on the basis of ground surveys. As an example, it is extremely difficult to access a region recently flooded or affected by wildfires. In these situations, the detection of changes has to be done with only annotations from unaffected regions. In a similar way, it is difficult to have information on all the land-cover classes present in an image while being interested in the detection of a single one of interest. These challenging situations are called novelty detection or one-class classification in machine learning. In these situations, the learning phase has to rely only on a very limited set of annotations, but can exploit the large set of unlabeled pixels available in the images. This setting, called semi-supervised learning, allows significantly improving the detection. In this Thesis we address the development of methods for novelty detection and one-class classification with few or no labeled information. The proposed methodologies build upon the kernel methods, which take place within a principled but flexible framework for learning with data showing potentially non-linear feature relations. The thesis is divided into two parts, each one having a different assumption on the data structure and both addressing unsupervised (automatic) and semi-supervised (semi-automatic) learning settings. The first part assumes the data to be formed by arbitrary-shaped and overlapping clusters and studies the use of kernel machines, such as Support Vector Machines or Gaussian Processes. An emphasis is put on the robustness to noise and outliers and on the automatic retrieval of parameters. Experiments on multi-temporal multispectral images for change detection are carried out using only information from unchanged regions or none at all. The second part assumes high-dimensional data to lie on multiple low dimensional structures, called manifolds. We propose a method seeking a sparse and low-rank representation of the data mapped in a non-linear feature space. This representation allows us to build a graph, which is cut into several groups using spectral clustering. For the semi-supervised case where few labels of one class of interest are available, we study several approaches incorporating the graph information. The class labels can either be propagated on the graph, constrain spectral clustering or used to train a one-class classifier regularized by the given graph. Experiments on the unsupervised and oneclass classification of hyperspectral images demonstrate the effectiveness of the proposed approaches.

This project presents a pre-State Estimation method for the detection of Bad Data (BD) in Phasor Measurement Units (PMUs) using correlation analysis and a Neural Network Classifier. Presented in this paper is the algorithm design, the steps for generating training and testing data, and the metrics used for evaluation. It is shown that the algorithm is able to detect noisy BD in measurements with load variations, low-level natural noise and transients from bus faults. The benefits of the proposed algorithm include that it can be applied to several measurement subsets in parallel, it is data driven and therefore independent of network model errors, and it is computationally fast, making it a promising technique for online detection.

David Atienza Alonso, Marina Zapater Sancho, Tomas Teijeiro Campo, Valentin Alexandre Guy Gabeff, Leila Cammoun

While Deep Learning (DL) is often considered the state-of-the art for Artificial Intelligence-based medical decision support, it remains sparsely implemented in clinical practice and poorly trusted by clinicians due to insufficient interpretability of neural network models. We have tackled this issue by developing interpretable DL models in the context of online detection of epileptic seizure, based on EEG signal. This has conditioned the preparation of the input signals, the network architecture, and the post-processing of the output in line with the domain knowledge. Specifically, we focused the discussion on three main aspects: 1) how to aggregate the classification results on signal segments provided by the DL model into a larger time scale, at the seizure-level; 2) what are the relevant frequency patterns learned in the first convolutional layer of different models, and their relation with the delta, theta, alpha, beta and gamma frequency bands on which the visual interpretation of EEG is based; and 3) the identification of the signal waveforms with larger contribution towards the ictal class, according to the activation differences highlighted using the DeepLIFT method. Results show that the kernel size in the first layer determines the interpretability of the extracted features and the sensitivity of the trained models, even though the final performance is very similar after post-processing. Also, we found that amplitude is the main feature leading to an ictal prediction, suggesting that a larger patient population would be required to learn more complex frequency patterns. Still, our methodology was successfully able to generalize patient inter-variability for the majority of the studied population with a classification F1-score of 0.873 and detecting 90% of the seizures.

2021