Publication

Subspace clustering in high-dimensions: Phase transitions & Statistical-to-Computational gap

Related publications (55)

Flickr Hypergroups

Daniel Gatica-Perez, Radu Andrei Negoescu

The amount of multimedia content available online constantly increases, and this leads to problems for users who search for content or similar communities. Users in Flickr often self-organize in user communities through Flickr Groups. These groups are pa ...

Idiap, 2009

SOM-based Clustering of Multilingual Documents Using an Ontology

Minh Hai Pham

Clustering similar documents is a difficult task for text data mining. Difficulties stem especially from the way documents are translated into numerical vectors. In this chapter, we will present a method that uses Self Organizing Map (SOM) to cluster medic ...

Information Science Reference, 2008

Learning Cluster Type and Dissimilarity Metric for Each Cluster Using a Set of Possible Cluster Types

Arash Arami

One of the shortcomings of the existing clustering methods is their problems dealing with different shape and size clusters. On the other hand, most of these methods are designed for especial cluster types or have good performance dealing with particular s ...

2007

Likelihood estimation of the extremal index

Mária Süveges

The extension of the likelihood method of Süveges (Extremes, 2007) is presented. The extension allows for finding independent clusters of extreme events and determining the range of dependence on extremal levels, and estimate clustering characteristic of t ...

2007

Short-Term Spatio-Temporal Clustering Applied to Multiple Moving Speakers

Jean-Marc Odobez, Guillaume Lathoud

Distant microphones permit to process spontaneous multi-party speech with very little constraints on speakers, as opposed to close-talking microphones. Minimizing the constraints on speakers permits a large diversity of applications, including meeting summ ...

2007

Discriminant Models for Text-independent Speaker Verification

This thesis addresses text-independent speaker verification from a machine learning point of view. We use the machine learning framework to better define the problem and to develop new unbiased performance measures and statistical tests to compare objectiv ...

IDIAP, 2006

Making Retrieval Faster Through Document Clustering

David Grangier, Alessandro Vinciarelli

This work addresses the problem of reducing the time between query submission and results output in a retrieval system. The goal is achieved by considering only a database fraction as small as possible during the retrieval process. Our approach is based on ...

IDIAP, 2004

Effect of Recognition Errors on Text Clustering

David Grangier, Alessandro Vinciarelli

This paper presents clustering experiments performed over noisy texts (i.e. texts that have been extracted through an automatic process like character or speech recognition). The effect of recognition errors is investigated by comparing clustering results ...

IDIAP, 2004

OnCall: defeating spikes with a free-market application cluster

George Candea

Even with reasonable overprovisioning, today's Internet application clusters are unable to handle major traffic spikes and flash crowds. As an alternative to fixed-size, dedicated clusters, we propose a dynamically-shared application cluster model based on ...

2004

On Optimal Update Policies and Cluster Sizes for 2-Tier Distributed Systems

Anwitaman Datta, Prasenjit Dey

We try to analyze a generic model for 2-tier distributed systems, exploring the possibility of optimal cluster sizes from an information management perspective, such that the overall cost for updating and searching information may be minimized by adopting ...

2003