Clustering in education, particularly in large-scale online environments like MOOCs, is essential for understanding and adapting to diverse student needs. However, the effectiveness of clustering depends on its interpretability, which becomes challenging w ...
Modern neuroscience research is generating increasingly large datasets, from recording thousands of neurons over long timescales to behavioral recordings of animals spanning weeks, months, or even years. Despite a great variety in recording setups and expe ...
Machine learning (ML) enables artificial intelligent (AI) agents to learn autonomously from data obtained from their environment to perform tasks. Modern ML systems have proven to be extremely effective, reaching or even exceeding human intelligence.
Altho ...
In the past few years, Machine Learning (ML) techniques have ushered in a paradigm shift, allowing the harnessing of ever more abundant sources of data to automate complex tasks. The technical workhorse behind these important breakthroughs arguably lies in ...
In the realm of point cloud scene understanding, particularly in indoor scenes, objects are arranged following human habits, resulting in objects of certain semantics being closely positioned and displaying notable inter-object correlations. This can creat ...
In the rapidly evolving landscape of machine learning research, neural networks stand out with their ever-expanding number of parameters and reliance on increasingly large datasets. The financial cost and computational resources required for the training p ...
Extracting maximal information from experimental data requires access to the likelihood function, which however is never directly available for complex experiments like those performed at high energy colliders. Theoretical predictions are obtained in this ...
A key challenge across many disciplines is to extract meaningful information from data which is often obscured by noise. These datasets are typically represented as large matrices. Given the current trend of ever-increasing data volumes, with datasets grow ...
Monitoring the health of complex industrial assets is crucial for safe and efficient operations. Health indicators that provide quantitative real-time insights into the health status of industrial assets over time serve as valuable tools for, e.g., fault d ...
We introduce robust principal component analysis from a data matrix in which the entries of its columns have been corrupted by permutations, termed Unlabeled Principal Component Analysis (UPCA). Using algebraic geometry, we establish that UPCA is a well-de ...