Sokseiha Muy, Yang Yu, Tian Xie
Data-driven approaches based on high-throughput capabilities and machine learning hold promise for revolutionizing human-centred materials discovery for sustainability and decarbonization. This Review examines the strengths and limitations of traditional and emerging approaches to demonstrate their inherent connection and highlight the evolving paradigms of materials design.

Breakthroughs in molecular and materials discovery require meaningful outliers to be identified in existing trends. As knowledge accumulates, the inherent bias of human intuition makes it harder to elucidate increasingly opaque chemical and physical principles. Moreover, given the limited manual and intellectual throughput of investigators, these principles cannot be efficiently applied to design new materials across a vast chemical space. Many data-driven approaches, following advances in high-throughput capabilities and machine learning, have tackled these limitations. In this Review, we compare traditional, human-centred methods with state-of-the-art, data-driven approaches to molecular and materials discovery. We first introduce the limitations of human-centred Edisonian, model-system and descriptor-based approaches. We then discuss how data-driven approaches can address these limitations by promoting throughput, reducing cognitive overload and biases, and establishing atomistic understanding that is transferable across a broad chemical space. We examine how high-throughput capabilities can be combined with active learning and inverse design to efficiently optimize materials from candidate pools that number in the millions or are too large to enumerate. Lastly, we pinpoint the challenges that must be overcome to accelerate future workflows and ultimately enable self-driving platforms, which automate and streamline the optimization of molecules and materials in iterative cycles.
Nature Portfolio, 2022