Person: Lionel Jérémie Martin

This page is automatically generated and may contain information that is not correct, complete, up-to-date, or relevant to your search query. The same applies to every other page on this website. Please make sure to verify the information with EPFL's official sources.

Courses taught by this person

No results

Related units (12)

Related research domains (2)

Related publications (9)

People doing similar research (158)

Machine learning

Machine learning (ML) is an umbrella term for solving problems for which development of algorithms by human programmers would be cost-prohibitive, and instead the problems are solved by helping machin

Graph drawing

Graph drawing is an area of mathematics and computer science combining methods from geometric graph theory and information visualization to derive two-dimensional depictions of graphs arising from a

Data is pervasive in today's world and has been for quite some time. As the volume of data to process grows, there is a need for techniques that are faster than, and at least as accurate as, those we already have. In particular, the last decade saw the rapid rise of social networks and ubiquitous sensing (through smartphones and the Internet of Things). These phenomena, together with progress in bioinformatics and traffic monitoring, pushed forward research on graph analysis and called for more efficient techniques.
Clustering is an important field of machine learning because it is unsupervised (i.e., one does not need a ground truth about the data to start learning). With it, one can extract meaningful patterns from large data sources without requiring an expert to annotate a portion of the data, which can be very costly. However, the clustering techniques designed so far all tend to be computationally demanding and have trouble scaling to the size of today's problems.
The emergence of Graph Signal Processing (GSP), which attempts to apply traditional signal processing techniques to signals defined on graphs rather than over time, provided additional tools for efficient graph analysis. By considering the clustering assignment as a signal lying on the nodes of the graph, one may now apply the tools of GSP to improve graph clustering and, more generally, data clustering at large.
In this thesis, we present several techniques that use some of the latest developments in GSP to improve the scalability of clustering, while aiming for an accuracy close to that of Spectral Clustering, a famous graph clustering technique that possesses a solid mathematical intuition.
On the one hand, we explore the benefits of random signal filtering, both in practice and in theory, for determining the eigenvectors of the graph Laplacian. In practice, this approach requires designing polynomial approximations of the step function, for which we provide an accelerated heuristic. We used this line of work to reduce the complexity of dynamic graph clustering, the problem of partitioning, at each snapshot, a graph that evolves over time. We also used it to propose a fast method for determining the subspace spanned by the first eigenvectors of any symmetric matrix. This subspace is useful for clustering, since it is used in Spectral Clustering, but it goes beyond that: it also serves in graph visualization (with Laplacian Eigenmaps) and data mining (with Principal Components Projection).
On the other hand, we were inspired by the latest work on graph filter localization to propose an extremely fast clustering technique. We tried to perform clustering using only graph filtering, combining the results to obtain a partition of the nodes.
These contributions are complemented by experiments on both synthetic datasets and real-world problems. Since we think that research should be shared in order to progress, all the experiments made in this thesis are publicly available on my personal GitHub account.
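The random-signal filtering idea in the abstract above can be illustrated with a minimal sketch. It is not the thesis code: the function names, the toy graph (two cliques joined by one edge), the hand-picked cutoff, and the polynomial order are all assumptions made for illustration, and the accelerated heuristic mentioned in the abstract is not reproduced. The sketch approximates an ideal low-pass (step) filter with a truncated Chebyshev series and applies it to random signals using only matrix products with the Laplacian, so no eigendecomposition is needed.

```python
import numpy as np


def step_chebyshev_coeffs(cutoff, lmax, order):
    """Chebyshev coefficients of h(lam) = 1 if lam <= cutoff else 0 on [0, lmax]."""
    # Map the cutoff from [0, lmax] to [-1, 1], then to an angle in [0, pi].
    theta = np.arccos(2.0 * cutoff / lmax - 1.0)
    j = np.arange(1, order + 1)
    coeffs = np.empty(order + 1)
    coeffs[0] = 2.0 * (np.pi - theta) / np.pi
    coeffs[1:] = -2.0 * np.sin(j * theta) / (j * np.pi)
    return coeffs


def chebyshev_filter(L, X, coeffs, lmax):
    """Apply c_0/2 + sum_j c_j T_j(2L/lmax - I) to the columns of X."""
    def shifted(V):
        # Laplacian rescaled so that its spectrum lies in [-1, 1].
        return (2.0 / lmax) * (L @ V) - V

    t_prev, t_curr = X, shifted(X)
    out = 0.5 * coeffs[0] * t_prev + coeffs[1] * t_curr
    for c in coeffs[2:]:
        # Three-term recurrence: T_j = 2 * x * T_{j-1} - T_{j-2}.
        t_prev, t_curr = t_curr, 2.0 * shifted(t_curr) - t_prev
        out = out + c * t_curr
    return out


def two_cliques_laplacian(size):
    """Combinatorial Laplacian of two cliques of `size` nodes joined by one edge."""
    n = 2 * size
    A = np.zeros((n, n))
    A[:size, :size] = 1.0
    A[size:, size:] = 1.0
    np.fill_diagonal(A, 0.0)
    A[size - 1, size] = A[size, size - 1] = 1.0
    return np.diag(A.sum(axis=1)) - A


# Toy example (assumption): 2 clusters, cutoff chosen by hand inside the
# spectral gap between the 2nd and 3rd Laplacian eigenvalues.
L = two_cliques_laplacian(10)
lmax = 2.0 * L.diagonal().max()  # twice the max degree bounds the spectrum
coeffs = step_chebyshev_coeffs(cutoff=5.0, lmax=lmax, order=50)

rng = np.random.default_rng(0)
R = rng.standard_normal((L.shape[0], 8))  # a few random signals
Y = chebyshev_filter(L, R, coeffs, lmax)  # rows act as node features

# Nodes 0 and 5 share a clique; node 15 belongs to the other one.
d_same = np.linalg.norm(Y[0] - Y[5])
d_diff = np.linalg.norm(Y[0] - Y[15])
print(d_same, d_diff)
```

The rows of `Y` play the role that the first eigenvectors play in Spectral Clustering: a standard clustering step such as k-means on these rows would then produce the partition. In this sketch the cutoff is known in advance; on a real graph it would have to be estimated.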

Lionel Jérémie Martin, Pearl Pu Faltings

Reviews play an increasingly important role in decisions about buying products and booking hotels. However, the large amount of available information can be confusing to users. A more succinct interface, gathering only the most helpful reviews, can reduce information processing time and save effort. To create such an interface in real time, we need reliable prediction algorithms to classify new reviews that have not yet been voted on but are potentially helpful. So far, such helpfulness prediction algorithms have relied on structural aspects, such as length and readability score. Since emotional words are at the heart of our written communication and are powerful in triggering listeners’ attention, we believe that emotional words can serve as important parameters for predicting the helpfulness of review text. Using GALC, a general lexicon of emotional words associated with a model representing 20 different categories, we extracted the emotionality from the review text and applied a supervised classification method to derive an emotion-based prediction of helpful reviews. As a second contribution, we propose an evaluation framework comparing three different real-world datasets extracted from the most well-known product review websites. This framework shows that emotion-based methods outperform the structure-based approach by up to 9%.

2014
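The lexicon-based feature extraction described in the review-helpfulness abstract above might look roughly like the following sketch. The lexicon here is a hypothetical toy with three categories and made-up word stems; it is not GALC and does not reproduce its 20 categories. The function name and the normalisation by token count are also assumptions.

```python
import re

# Hypothetical toy lexicon: category -> word stems. These entries are
# illustrative only and are NOT taken from GALC.
TOY_LEXICON = {
    "joy": ("happy", "delight", "enjoy"),
    "anger": ("angr", "furious", "outrag"),
    "disappointment": ("disappoint", "letdown"),
}


def emotion_features(text, lexicon=TOY_LEXICON):
    """Return, per category, the share of tokens that start with one of its stems."""
    tokens = re.findall(r"[a-z']+", text.lower())
    total = max(len(tokens), 1)  # avoid dividing by zero on empty text
    return {
        category: sum(tok.startswith(stems) for tok in tokens) / total
        for category, stems in lexicon.items()
    }


review = "I enjoyed the room, but the breakfast was disappointing."
print(emotion_features(review))
```

Each review becomes a fixed-length vector with one entry per emotion category, which any supervised classifier can then consume alongside, or instead of, structural features such as length.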

Spectral clustering is a widely studied problem, yet its complexity is prohibitive for dynamic graphs of even modest size. We claim that it is possible to use information from past cluster assignments to expedite computation. Our approach builds on a recent idea of sidestepping the main bottleneck of spectral clustering, i.e., computing the graph eigenvectors, by using fast Chebyshev graph filtering of random signals. We show that the proposed algorithm achieves clustering assignments with quality approximating that of spectral clustering and that it can yield significant complexity benefits when the graph dynamics are appropriately bounded.

2018
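For context on the bottleneck named in the abstract above, here is a minimal baseline sketch of spectral clustering for two clusters. It is the textbook sign-of-the-Fiedler-vector split, not the algorithm proposed in the paper; the function name and the toy graph (two triangles joined by one edge) are assumptions for illustration. The full eigendecomposition is the step that becomes prohibitive when it has to be repeated for every snapshot of a dynamic graph.

```python
import numpy as np


def spectral_bipartition(A):
    """Split a connected graph in two using the sign of the Fiedler vector."""
    L = np.diag(A.sum(axis=1)) - A  # combinatorial Laplacian
    # Dense eigendecomposition, cubic in the number of nodes: the bottleneck.
    eigenvalues, eigenvectors = np.linalg.eigh(L)
    fiedler = eigenvectors[:, 1]  # eigenvector of the 2nd smallest eigenvalue
    return (fiedler > 0).astype(int)


# Toy graph (assumption): triangles {0, 1, 2} and {3, 4, 5} joined by edge 2-3.
A = np.zeros((6, 6))
for u, v in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]:
    A[u, v] = A[v, u] = 1.0

labels = spectral_bipartition(A)
print(labels)
```

Which triangle receives label 0 or 1 depends on the sign of the eigenvector returned by the solver, so only the grouping is meaningful.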