Modifying and retraining AlphaFold2 to integrate experimental information
Title: Modifying and retraining AlphaFold2 to integrate experimental information
DNr: Berzelius-2024-439
Project Type: LiU Berzelius
Principal Investigator: Claudio Mirabello <claudio.mirabello@scilifelab.se>
Affiliation: Linköpings universitet
Duration: 2024-12-01 – 2025-06-01
Classification: 10203
Keywords:

Abstract

This project aims to aid the experimental determination of protein structures through the use and redesign of DeepMind's AlphaFold2 (AF2) and AlphaFold3 (AF3). AF2 and AF3 are end-to-end, very deep neural networks for protein structure prediction that, thanks to their unprecedented accuracy, have revolutionized the whole field of structural biology [1]. Though AF2/AF3 predictions are not perfect, they are accurate enough to aid in the experimental determination of protein structures through NMR, X-ray crystallography or cryo-EM [2]. The scope of this project is to find the best way of integrating AF2/AF3 into experimental pipelines, and specifically to use BerzeLiUs to develop, train and validate new AF2-inspired models that can take raw data from lab experiments as additional input. The type of input is also to be determined within this project, and could range anywhere from microscopy imaging data to distance constraints between amino acids. The structural biology community would greatly benefit from such a hybrid system, as it would drastically cut down manual curation of inputs and outputs between steps in the current pipelines. Furthermore, it would greatly increase experimental output, making it possible to build high-quality structures even when the experimental data is incomplete or at too low a resolution. Claudio Mirabello, who will be the main person working on this, has 10+ years of experience in developing custom neural networks, especially within the field of structural biology, and some of his previous research has contributed to the development of AlphaFold2. We also have a number of ongoing collaborations with experimentalists across Sweden that provide data and knowledge about the individual methods. We therefore believe we have all the tools to succeed in this project.
This project is part of a strategic effort tightly integrated with the recently formed SciLifeLab platform for Integrated Structural Biology, aiming to leverage world-leading research by combining experimental and prediction methods. Adding AI-based structure predictions to the uniquely strong infrastructure and competence in structural biology in Sweden has the potential to keep Sweden at the absolute forefront of structural biology in basic and applied research. This project is also tightly connected to the emerging DDLS Data Platform hosted by the SciLifeLab Data Centre, which will make web services and databases developed within this initiative readily accessible to the entire research community.

1. Jones, David T., and Janet M. Thornton. "The impact of AlphaFold2 one year on." Nature Methods 19.1 (2022): 15-20.
2. Terwilliger, Thomas C., et al. "Improving AlphaFold modeling using implicit information from experimental density maps." bioRxiv (2022).