Publication

Chemical machine learning with kernels: The key impact of loss functions

Volkan Cevher, Sandip De, Junhong Lin, Van Quang Nguyen
2018
Report or working paper
Abstract

Machine learning promises to accelerate materials discovery by allowing computational efficient property predictions from a small number of reference calculations. As a result, the literature spent a considerable effort in designing representations that capture basic physical properties so far. In stark contrast, our work focuses on the less-studied learning formulations in this context in order to exploit inner structures in the prediction errors. In particular, we propose to directly optimize basic loss functions of the prediction error metrics typically used in the literature, such as the mean absolute error or the worst case error. We show that a proper choice of the loss function can directly improve the prediction performance in the desired metric, albeit at the cost of additional computations during training. To support this claim, we describe the statistical learning theoretic foundations and provide numerical evidence with the prediction of atomization energies for a database of small organic molecules

About this result
This page is automatically generated and may contain information that is not correct, complete, up-to-date, or relevant to your search query. The same applies to every other page on this website. Please make sure to verify the information with EPFL's official sources.
Related concepts (32)
Machine learning
Machine learning (ML) is an umbrella term for solving problems for which development of algorithms by human programmers would be cost-prohibitive, and instead the problems are solved by helping machines 'discover' their 'own' algorithms, without needing to be explicitly told what to do by any human-developed algorithms. Recently, generative artificial neural networks have been able to surpass results of many previous approaches.
Errors and residuals
In statistics and optimization, errors and residuals are two closely related and easily confused measures of the deviation of an observed value of an element of a statistical sample from its "true value" (not necessarily observable). The error of an observation is the deviation of the observed value from the true value of a quantity of interest (for example, a population mean). The residual is the difference between the observed value and the estimated value of the quantity of interest (for example, a sample mean).
Mean squared prediction error
In statistics the mean squared prediction error (MSPE), also known as mean squared error of the predictions, of a smoothing, curve fitting, or regression procedure is the expected value of the squared prediction errors (PE), the square difference between the fitted values implied by the predictive function and the values of the (unobservable) true value g. It is an inverse measure of the explanatory power of and can be used in the process of cross-validation of an estimated model.
Show more
Related publications (34)

Random matrix methods for high-dimensional machine learning models

Antoine Philippe Michel Bodin

In the rapidly evolving landscape of machine learning research, neural networks stand out with their ever-expanding number of parameters and reliance on increasingly large datasets. The financial cost and computational resources required for the training p ...
EPFL2024

Partial discharge localization in power transformer tanks using machine learning methods

Marcos Rubinstein, Hamidreza Karami

This paper presents a comparison of machine learning (ML) methods used for three-dimensional localization of partial discharges (PD) in a power transformer tank. The study examines ML and deep learning (DL) methods, ranging from support vector machines (SV ...
2024

Glenohumeral joint force prediction with deep learning

Dominique Pioletti, Alexandre Terrier, Patrick Goetti, Philippe Büchler

Deep learning models (DLM) are efficient replacements for computationally intensive optimization techniques. Musculoskeletal models (MSM) typically involve resource-intensive optimization processes for determining joint and muscle forces. Consequently, DLM ...
London2024
Show more