The chemical industry faces mounting pressure to develop more sustainable processes while accelerating innovation to address climate challenges. Traditional discovery approaches in chemistry follow an iterative cycle: forming hypotheses about new molecules or reactions, testing these hypotheses experimentally, analyzing the results, and using these insights to refine future hypotheses. Each stage presents unique challenges that can create bottlenecks in the discovery process. At the hypothesis stage, researchers must navigate vast chemical spaces to identify promising candidates. During testing, optimal reaction conditions must be determined, focusing on reactivity while also considering sustainability. Analysis requires the interpretation of complex spectroscopic data, and the refinement stage demands efficient integration of all gathered information to guide future experiments.
Digital chemistry methods offer promising approaches to accelerate this discovery cycle. By leveraging artificial intelligence, machine learning, and optimization techniques, these methods can enhance each stage of the process while promoting more sustainable practices. This work investigates how these computational tools can be effectively deployed across the entire discovery workflow, demonstrating their potential through several case studies.
Following the cycle, we first showcase the potential of generative AI to form new hypotheses by generating molecules with desired properties. Here, we propose new catalyst candidates for the Suzuki cross-coupling with desired binding energies. Subsequently, to support the testing phase, transformer-based models for solvent recommendation were developed to predict suitable solvents for chemical reactions while suggesting greener alternatives. These recommendations were validated experimentally. The next step, automating the analysis, is showcased by a case study on assisting the interpretation of 1H-NMR spectra. The attention mechanisms of a transformer model were leveraged to establish a mapping between 1H-NMR spectral peaks and molecular substructures, achieving high accuracy in assigning experimental spectra to the correct molecule. Lastly, the iterative formation of new hypotheses based on previous experiments was accelerated using Bayesian optimization in combination with automated synthesis hardware. This combination enabled the efficient optimization of iodoalkyne synthesis across multiple starting materials while exploring only a small part of the potential parameter space.
Recognizing that the impact of digital chemistry tools depends heavily on their accessibility, we highlight one way to increase it: packaging existing chemistry AI tools into an app that requires no coding knowledge and focuses on local execution. Finally, an examination of the environmental footprint of computational chemistry methods themselves emp