Generation of training data for AI
Title: Generation of training data for AI
DNr: NAISS 2023/22-1235
Project Type: NAISS Small Compute
Principal Investigator: Louise Persson <louise.persson@kemi.uu.se>
Affiliation: Uppsala universitet
Duration: 2023-11-16 – 2024-12-01
Classification: 10603
Keywords:

Abstract

Molecular dynamics (MD) simulations are widely used alongside native mass spectrometry experiments to study proteins. However, how charges are distributed on a protein in these experiments remains unclear: the distribution cannot currently be detected experimentally, and methods for predicting it computationally are very expensive. I plan to train a deep learning model to predict the distribution of charges on proteins under native mass spectrometry conditions, which can then be used when performing this type of MD simulation. First, I need to generate data to train on, which is what I will use the resources in this project for. I will generate multiple charge configurations for a set of proteins and perform short MD simulations to capture their conformational flexibility. I will then use density functional theory (DFT) to obtain the total energy of each charge configuration, which will serve as the ground truth when training the model.
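
As a rough illustration of the first step, generating multiple charge configurations, the sketch below samples distinct ways of placing a fixed number of charges on a list of candidate sites. It is only a sketch under stated assumptions: the function name `sample_charge_configurations`, the site labels, and the idea that a configuration is simply a choice of which candidate sites carry a charge are all hypothetical and are not taken from the project description. The actual procedure may differ.

```python
import math
import random
from itertools import combinations


def sample_charge_configurations(sites, n_charges, n_samples, seed=0):
    """Return up to n_samples distinct charge configurations.

    Each configuration is a sorted tuple of the sites that carry a charge.
    ASSUMPTION: the candidate sites and the net charge are given as input;
    how they are chosen in the real project is not specified in the source.
    """
    if not 0 <= n_charges <= len(sites):
        raise ValueError("n_charges must be between 0 and the number of sites")

    total = math.comb(len(sites), n_charges)
    if total <= n_samples:
        # Few enough possibilities: enumerate every configuration.
        return sorted(tuple(sorted(c)) for c in combinations(sites, n_charges))

    # Otherwise draw random configurations until enough distinct ones are found.
    rng = random.Random(seed)
    seen = set()
    while len(seen) < n_samples:
        seen.add(tuple(sorted(rng.sample(sites, n_charges))))
    return sorted(seen)


if __name__ == "__main__":
    # Hypothetical site labels, for illustration only.
    example_sites = ["LYS6", "LYS27", "ARG42", "HIS68"]
    for config in sample_charge_configurations(example_sites, 2, 3):
        print(config)
```

Each sampled configuration would then be the starting point for a short MD simulation and a subsequent DFT energy calculation, neither of which is shown here.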