**Êtes-vous un étudiant de l'EPFL à la recherche d'un projet de semestre?**

Travaillez avec nous sur des projets en science des données et en visualisation, et déployez votre projet sous forme d'application sur GraphSearch.

Publication# Chemical machine learning with kernels: The key impact of loss functions

Résumé

Machine learning promises to accelerate materials discovery by allowing computational efficient property predictions from a small number of reference calculations. As a result, the literature spent a considerable effort in designing representations that capture basic physical properties so far. In stark contrast, our work focuses on the less-studied learning formulations in this context in order to exploit inner structures in the prediction errors. In particular, we propose to directly optimize basic loss functions of the prediction error metrics typically used in the literature, such as the mean absolute error or the worst case error. We show that a proper choice of the loss function can directly improve the prediction performance in the desired metric, albeit at the cost of additional computations during training. To support this claim, we describe the statistical learning theoretic foundations and provide numerical evidence with the prediction of atomization energies for a database of small organic molecules

Official source

Cette page est générée automatiquement et peut contenir des informations qui ne sont pas correctes, complètes, à jour ou pertinentes par rapport à votre recherche. Il en va de même pour toutes les autres pages de ce site. Veillez à vérifier les informations auprès des sources officielles de l'EPFL.

Concepts associés

Chargement

Publications associées

Chargement

Concepts associés (15)

Publications associées (4)

Fonction objectif

vignette|comparaison de certains substituts de la fonction de perte
Le terme fonction objectif ou fonction économique, est utilisé en optimisation mathématique et en recherche opérationnelle pour dési

Prédiction dynamique

La prédiction dynamique est une méthode inventée par Newton et Leibniz. Newton l’a appliquée avec succès au mouvement des planètes et de leurs satellites. Depuis elle est devenue la grande méthode de

Nombre

Un nombre est un concept permettant d’évaluer et de comparer des quantités ou des rapports de grandeurs, mais aussi d’ordonner des éléments en indiquant leur rang. Souvent écrits à l’aide d’un ou pl

Chargement

Chargement

Chargement

Our brain continuously self-organizes to construct and maintain an internal representation of the world based on the information arriving through sensory stimuli. Remarkably, cortical areas related to different sensory modalities appear to share the same functional unit, the neuron, and develop through the same learning mechanism, synaptic plasticity. It motivates the conjecture of a unifying theory to explain cortical representational learning across sensory modalities. In this thesis we present theories and computational models of learning and optimization in neural networks, postulating functional properties of synaptic plasticity that support the apparent universal learning capacity of cortical networks. In the past decades, a variety of theories and models have been proposed to describe receptive field formation in sensory areas. They include normative models such as sparse coding, and bottom-up models such as spike-timing dependent plasticity. We bring together candidate explanations by demonstrating that in fact a single principle is sufficient to explain receptive field development. First, we show that many representative models of sensory development are in fact implementing variations of a common principle: nonlinear Hebbian learning. Second, we reveal that nonlinear Hebbian learning is sufficient for receptive field formation through sensory inputs. A surprising result is that our findings are independent of specific details, and allow for robust predictions of the learned receptive fields. Thus nonlinear Hebbian learning and natural statistics can account for many aspects of receptive field formation across models and sensory modalities. The Hebbian learning theory substantiates that synaptic plasticity can be interpreted as an optimization procedure, implementing stochastic gradient descent. In stochastic gradient descent inputs arrive sequentially, as in sensory streams. However, individual data samples have very little information about the correct learning signal, and it becomes a fundamental problem to know how many samples are required for reliable synaptic changes. Through estimation theory, we develop a novel adaptive learning rate model, that adapts the magnitude of synaptic changes based on the statistics of the learning signal, enabling an optimal use of data samples. Our model has a simple implementation and demonstrates improved learning speed, making this a promising candidate for large artificial neural network applications. The model also makes predictions on how cortical plasticity may modulate synaptic plasticity for optimal learning. The optimal sampling size for reliable learning allows us to estimate optimal learning times for a given model. We apply this theory to derive analytical bounds on times for the optimization of synaptic connections. First, we show this optimization problem to have exponentially many saddle-nodes, which lead to small gradients and slow learning. Second, we show that the number of input synapses to a neuron modulates the magnitude of the initial gradient, determining the duration of learning. Our final result reveals that the learning duration increases supra-linearly with the number of synapses, suggesting an effective limit on synaptic connections and receptive field sizes in developing neural networks.

Volkan Cevher, Sandip De, Junhong Lin, Quoc Tran Dinh

Machine learning promises to accelerate materials discovery by allowing computational efficient property predictions from a small number of reference calculations. As a result, the literature has spent a considerable effort in designing representations that capture basic physical properties. Our work focuses on the less-studied learning formulations in this context in order to exploit inner structures in the prediction errors. In particular, we propose to directly optimize basic loss functions of the prediction error metrics typically used in the literature, such as the mean absolute error or the worst case error. In some instances, a proper choice of the loss function can directly reduce reasonably the prediction performance in the desired metric, albeit at the cost of additional computations during training. To support this claim, we describe the statistical learning theoretic foundations, and provide supporting numerical evidence with the prediction of atomization energies for a database of small organic molecules.

2019The control of compliant robots is, due to their often nonlinear and complex dynamics, inherently difficult. The vision of morphological computation proposes to view these aspects not only as problems, but rather also as parts of the solution. Non-rigid body parts are not seen anymore as imperfect realizations of rigid body parts, but rather as potential computational resources. The applicability of this vision has already been demonstrated for a variety of complex robot control problems. Nevertheless, a theoretical basis for understanding the capabilities and limitations of morphological computation has been missing so far. We present a model for morphological computation with compliant bodies, where a precise mathematical characterization of the potential computational contribution of a complex physical body is feasible. The theory suggests that complexity and nonlinearity, typically unwanted properties of robots, are desired features in order to provide computational power. We demonstrate that simple generic models of physical bodies, based on mass-spring systems, can be used to implement complex nonlinear operators. By adding a simple readout (which is static and linear) to the morphology, such devices are able to emulate complex mappings of input to output streams in continuous time. Hence, by outsourcing parts of the computation to the physical body, the difficult problem of learning to control a complex body, could be reduced to a simple and perspicuous learning task, which can not get stuck in local minima of an error function.

2011