Data-Driven Determination of Macromolecular Structures
Title: Data-Driven Determination of Macromolecular Structures
DNr: Berzelius-2024-256
Project Type: LiU Berzelius
Principal Investigator: Nicholas Pearce <nicholas.pearce@liu.se>
Affiliation: Linköpings universitet
Duration: 2024-08-29 – 2025-03-01
Classification: 10203
Homepage: https://macromolecular.science
Keywords:

Abstract

Accurate macromolecular structures are the focus of thousands of researchers internationally and are foundational components of drug design and other therapeutic efforts. The overall approaches to structure determination and analysis have changed little, even as individual steps within this workflow have changed beyond recognition. Notwithstanding the widespread success and impact of macromolecular structural models on our understanding of the molecular basis of life, our models remain fundamentally flawed and incomplete, despite huge technical leaps in the collection of experimental data. The overly simplistic representation of biological macromolecules as a single conformation loses all of the subtlety of molecular motions and dynamics, and cannot be used to understand complex structural transitions such as allostery. Thus, our current models only allow us to go so far in our understanding of molecular processes. Additionally, continued manual, subjective interpretation of the experimental data allows for misinterpretation and bias, especially for bound ligands. These problems will only become more acute with the growth of cryo-electron microscopy (cryoEM), where structures are on average much larger than those typical of macromolecular crystallography (MX), and the scale of the modelling challenge increases accordingly. The development of robust AI-driven methods for modelling, validation and analysis therefore remains a significant challenge, but also an opportunity, in structural biology. With increasing automation now enabling the routine collection of multiple datasets for each structure determination experiment, it is time to rethink how we determine molecular structures, by routinely using multiple datasets for structure determination and analysis in AI-driven approaches.
We aim to define new best-practice approaches to the modelling and analysis of macromolecular structures in MX and cryoEM, and to demonstrate the importance of improved models in real drug-discovery efforts.