Employing Machine Learning to investigate properties of molecules and periodic systems
SNIC Project: SNIC 2020/13-96
Project Type: SNIC Small Compute
Principal Investigator: Rodrigo Pereira de Carvalho <rodrigo.carvalho@physics.uu.se>
Affiliation: Uppsala universitet
Duration: 2020-11-01 – 2021-11-01
Classification: 10304


It is now widely accepted in science that combining artificial intelligence (AI), specifically machine learning techniques, with first-principles calculations can lead to breakthroughs in materials design. Recent works have employed machine learning to discover new materials [1], to calculate physical properties without the high computational cost of earlier techniques [2], and even to bypass the solution of quantum mechanical equations [3]. For molecular systems, highly accurate fingerprints can be produced to serve as inputs for machine learning; for periodic systems (solids, surfaces, etc.) this remains difficult, although new approaches have been developed [2,4].

The specific goals of this project are:
(i) Apply machine learning to molecular systems to speed up or bypass ab initio calculations.
(ii) Benchmark a few options for representing periodic systems (especially those intended as electrodes).
(iii) Combine our machine learning framework with DFT to investigate properties of selected materials.

To date, we have been developing an original database of organic energy materials based on high-quality DFT calculations for molecular and periodic systems. So far, it contains ~30000 unique molecular structures, which required more than 310000 DFT calculations, and approximately 30 crystals representing electroactive compounds for Li-ion batteries. Based on these crystals, we published a paper [5] in ChemSusChem showing how to tailor organic materials to achieve higher voltages and energy densities in organic batteries. Furthermore, a manuscript in its final stages presents a new AI-based methodology for discovering novel organic materials, in which we used the database to analyse 20 million new molecules in a high-throughput approach to propose new cathodes for batteries.
Finally, a Licentiate thesis based on these results is planned to be defended in mid-November. As the project enters its final phase, we need more time to analyse the proposed molecules at the DFT level and to confirm whether the AI predictions are correct. This project will serve that purpose and will also be used to benchmark different AI techniques and architectures. Lastly, after publication we plan to extend the AI framework to other aspects of battery systems. The database and the AI software will be distributed as open source upon publication of the manuscript.

1. Balachandran, Prasanna V., et al. Physical Review Materials 2.4 (2018).
2. Faber, Felix, et al. International Journal of Quantum Chemistry 115.16 (2015).
3. Brockherde, Felix, et al. Nature Communications 8.1 (2017).
4. Huo, Haoyan, and Matthias Rupp. "Unified representation for machine learning of molecules and crystals." Preprint (2017).
5. Carvalho, R. P., C. F. N. Marchiori, D. Brandell, and C. M. Araujo. ChemSusChem 13 (2020) 2402.