Publication

Encoding quantum-chemical knowledge into machine-learning models of complex molecular properties

Ksenia Briling
2024
EPFL thesis
Abstract

Statistical (machine-learning, ML) models are increasingly used in computational chemistry as a substitute for more expensive ab initio and parametrizable methods. While ML algorithms are capable of learning physical laws implicitly from data, adding some prior physical knowledge improves the results and accelerates training. This thesis covers several aspects of enhancing ML models with quantum-chemical information: representation design, preprocessing of the input data, and loss function choice.

The first part focuses on extending the symmetry-adapted Gaussian process regression model of the electron density. First, we study how the choice of density-fitting and training-loss-function metrics impacts the quality of the predictions. In addition, we show that densities predicted by the original model do not integrate to the exact number of electrons, which compromises the extrapolative capabilities, and propose a modified, constrained model along with an a posteriori correction. Then, the framework is applied to the on-top pair density. Using a specialized fitting basis set, we train a model to predict CASSCF-quality on-top pair density and compute the on-top pair ratio to visualize static electron correlation effects.

The second part introduces the spectrum of approximated Hamiltonian matrices (SPAHM), a family of physics-based molecular representations. Eigenvalue SPAHM is a global representation built from occupied-orbital eigenvalues of an initial-guess Hamiltonian. SPAHM(a,b) are local representations based on initial-guess-level electron densities attributed to atoms and bonds. These representations distinguish not only different molecules and conformations, but also different spin, charge, and potentially electronic states. The advantages of SPAHM are demonstrated on datasets featuring a wide variation of charge and spin.

The last part is devoted to the application of equivariant neural networks to chemical reaction properties. EquiReact, the proposed model, predicts reaction barriers from 3D structures of reactants and products. Its high interpolative and extrapolative capabilities, particularly in the absence of atom-mapping information, are demonstrated on several datasets.

Overall, the work presented in this thesis contributes to the global effort to develop, improve, and advance ML-based methods used in computational chemistry.
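As a rough illustration of the eigenvalue SPAHM idea described in the abstract, the sketch below builds a fixed-length vector from the lowest eigenvalues of a Hamiltonian matrix. It is not the thesis implementation: the 3×3 matrix is a made-up stand-in for an initial-guess Hamiltonian, the function name `eigenvalue_spahm` is hypothetical, zero-padding to a common length is an assumption, and charge enters here only through the number of occupied orbitals.

```python
import numpy as np


def eigenvalue_spahm(hamiltonian, n_occupied, length):
    """Return the occupied eigenvalues, sorted and zero-padded to `length`.

    Hypothetical helper: the padding convention is an assumption.
    """
    eigenvalues = np.linalg.eigvalsh(hamiltonian)  # ascending order
    representation = np.zeros(length)
    representation[:n_occupied] = eigenvalues[:n_occupied]
    return representation


# Toy symmetric matrix standing in for an initial-guess Hamiltonian.
toy_hamiltonian = np.array([
    [-1.0, -0.5,  0.0],
    [-0.5, -1.0, -0.5],
    [ 0.0, -0.5, -1.0],
])

# Same geometry, different number of occupied orbitals (e.g. different charge).
neutral = eigenvalue_spahm(toy_hamiltonian, n_occupied=2, length=3)
cation = eigenvalue_spahm(toy_hamiltonian, n_occupied=1, length=3)
print(neutral)
print(cation)
```

In this toy setting the two vectors differ even though the matrix is identical, which mirrors the abstract's point that the representation can tell apart states of different charge or spin.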

Related concepts (32)
Molecular dynamics
Molecular dynamics (MD) is a computer simulation method for analyzing the physical movements of atoms and molecules. The atoms and molecules are allowed to interact for a fixed period of time, giving a view of the dynamic "evolution" of the system. In the most common version, the trajectories of atoms and molecules are determined by numerically solving Newton's equations of motion for a system of interacting particles, where forces between the particles and their potential energies are often calculated using interatomic potentials or molecular mechanical force fields.
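The numerical solution of Newton's equations mentioned above can be sketched with the velocity Verlet scheme, a common choice in MD codes. This is a minimal sketch under stated assumptions: a single particle in one dimension, a harmonic force standing in for an interatomic potential, and hypothetical names (`velocity_verlet`, `harmonic_force`) and parameters chosen only for illustration.

```python
import numpy as np


def velocity_verlet(position, velocity, force_fn, mass, dt, n_steps):
    """Integrate Newton's equations of motion with velocity Verlet."""
    trajectory = [position.copy()]
    force = force_fn(position)
    for _ in range(n_steps):
        velocity = velocity + 0.5 * dt * force / mass  # first half kick
        position = position + dt * velocity            # drift
        force = force_fn(position)                     # force at new position
        velocity = velocity + 0.5 * dt * force / mass  # second half kick
        trajectory.append(position.copy())
    return np.array(trajectory), velocity


def harmonic_force(position, k=1.0):
    """Toy force F = -k x (assumed stand-in for a real force field)."""
    return -k * position


x0 = np.array([1.0])
v0 = np.array([0.0])
trajectory, v_final = velocity_verlet(
    x0, v0, harmonic_force, mass=1.0, dt=0.01, n_steps=1000
)
# Total energy (kinetic + potential) for mass = 1 and k = 1.
energy = 0.5 * v_final**2 + 0.5 * trajectory[-1] ** 2
```

For this toy oscillator the exact solution is x(t) = cos(t), so the final position and the conserved energy give a quick check on the chosen time step.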
Computational chemistry
Computational chemistry is a branch of chemistry that uses computer simulation to assist in solving chemical problems. It uses methods of theoretical chemistry, incorporated into computer programs, to calculate the structures and properties of molecules, groups of molecules, and solids. It is essential because, apart from relatively recent results concerning the hydrogen molecular ion (the dihydrogen cation), the quantum many-body problem cannot be solved analytically, much less in closed form.
Ab initio quantum chemistry methods
Ab initio quantum chemistry methods are computational chemistry methods based on quantum chemistry. The term ab initio was first used in quantum chemistry by Robert Parr and coworkers, including David Craig in a semiempirical study on the excited states of benzene. The background is described by Parr. Ab initio means "from first principles" or "from the beginning", implying that the only inputs into an ab initio calculation are physical constants.
Related publications (99)

Machine learning-aided generative molecular design

Philippe Schwaller, Jeff Guo

Machine learning has provided a means to accelerate early-stage drug discovery by combining molecule generation and filtering steps in a single architecture that leverages the experience and design preferences of medicinal chemists. However, designing mach ...
Nature Portfolio, 2024

Efficient and insightful descriptors for representing molecular and material space

Alexander Jan Goscinski

Data-driven approaches have been applied to reduce the cost of accurate computational studies on materials, by using only a small number of expensive reference electronic structure calculations for a representative subset of the materials space, and using ...
EPFL, 2024

Advancing Computational Chemistry with Stochastic and Artificial Intelligence Approaches

Justin Villard

Computational chemistry aims to simulate reactions and molecular properties at the atomic scale, advancing the design of novel compounds and materials with economic, environmental, and societal implications. However, the field relies on approximate quantum ...
EPFL, 2023