On Perfect Clustering of High Dimension, Low Sample Size Data
Related publications (79)
Testing mutual independence among several random vectors of arbitrary dimensions is a challenging problem in Statistics, and it has gained considerable interest in recent years. In this article, we propose some nonparametric tests based on different notion ...
TAYLOR & FRANCIS LTD, 2021
We study the problem of constructing epsilon-coresets for the (k, z)-clustering problem in a doubling metric M(X, d). An epsilon-coreset is a weighted subset S ⊆ X with weight function w : S → R≥0, such that for any k-subset C ∈ ...
IEEE COMPUTER SOC, 2018
Atomistic modeling of phase transitions, chemical reactions, or other rare events that involve overcoming high free energy barriers usually entails prohibitively long simulation times. Introducing a bias potential as a function of an appropriately chosen s ...
AMER CHEMICAL SOC, 2020
Motivation: Unbiased clustering methods are needed to analyze growing numbers of complex data sets. Currently available clustering methods often depend on parameters that are set by the user, they lack stability, and are not applicable to small data sets. ...
2019
Testing for equality of two high-dimensional distributions is a challenging problem, and this becomes even more challenging when the sample size is small. Over the last few decades, several graph-based two-sample tests have been proposed in the literature, ...
We study the problem of explainable clustering in the setting first formalized by Dasgupta, Frost, Moshkovitz, and Rashtchian (ICML 2020). A k-clustering is said to be explainable if it is given by a decision tree where each internal node splits data point ...
This paper traces the plunge and rebound of the taxi market in Shenzhen, China through the COVID-19 lockdown. A four-week taxi GPS trajectory data set is collected in the first quarter of 2020, which covers the period of lockdown and phased reopening in th ...
Graph learning methods have recently been receiving increasing interest as means to infer structure in datasets. Most of the recent approaches focus on different relationships between a graph and data sample distributions, mostly in settings where all avai ...
Background: Obesity and obesity-related diseases represent a major public health concern. Recently, studies have substantiated the role of sugar-sweetened beverages (SSBs) consumption in the development of these diseases. The fine identification of populati ...
Clustering is a method for discovering structure in data, widely used across many scientific disciplines. The two main clustering problems this dissertation considers are K-means and K-medoids. These are NP-hard problems in the number of samples and cluste ...