Characterising macromolecular interactions with artificial intelligence and cryo-EM
Abstract
A protein’s function is determined by its 3D structure, which is rarely static. For many proteins, the ability to undergo conformational transitions is essential to function: structural transitions allow membrane channels and transporters to block or permit entry into the cell, receptors to relay signals across membranes, and many enzymes to catalyze different reactions. To fully understand the function of these proteins, we therefore need to characterize them in all their functional states. Deep-learning-based models like AlphaFold2 [Jumper et al. Nature, 2021] and AlphaFold3 [Abramson et al. Nature, 2024] have greatly improved our ability to predict the 3D structure of proteins, but they usually target single-state predictions. For AlphaFold2, we [Lidbrink et al. PLoS Computational Biology, 2025] and others in the field [del Alamo et al. eLife, 2022] have shown that stochastically subsampling the multiple sequence alignment (MSA) to a reduced depth can guide the model to sample alternative states. Other promising strategies have also been proposed, such as clustering the MSA [Wayment-Steele et al. Nature, 2023] and stochastically masking columns in the MSA [Kalakoti et al. Communications Biology, 2025]. Although these methods were developed for and benchmarked on AlphaFold2, they can readily be adapted to AlphaFold3; their utility in that setting remains to be demonstrated. We seek to evaluate the performance of these strategies on AlphaFold3 and compare them to other AI methods developed for sampling multiple states of proteins, such as BioEmu [Lewis et al. Science, 2025]. Further, we seek to examine whether the methods can be improved by adjusting their details, e.g., by masking the MSA using different amino acids.
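To make the two stochastic MSA perturbations mentioned above concrete, the following is a minimal sketch and not the implementation used in any of the cited works. It assumes the MSA is held as a list of equal-length aligned strings with the query sequence first; the function names `subsample_msa` and `mask_columns`, the choice to leave the query row unmasked, and the default mask character "X" are all illustrative assumptions.

```python
import random


def subsample_msa(msa, max_depth, rng):
    """Keep the query (first row) plus a random subset of the other rows.

    Assumption: msa is a list of aligned strings and max_depth counts the query.
    """
    query, others = msa[0], msa[1:]
    n_keep = max(0, min(max_depth - 1, len(others)))
    return [query] + rng.sample(others, n_keep)


def mask_columns(msa, fraction, rng, mask_char="X"):
    """Overwrite a random fraction of alignment columns with mask_char.

    Assumption: the query row is left untouched; every other row is masked
    at the same randomly chosen columns.
    """
    length = len(msa[0])
    n_mask = round(fraction * length)
    masked_cols = set(rng.sample(range(length), n_mask))
    masked = [msa[0]]
    for seq in msa[1:]:
        masked.append(
            "".join(mask_char if i in masked_cols else c for i, c in enumerate(seq))
        )
    return masked


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so a run can be repeated
    toy_msa = ["ACDEFGHI", "ACDEFGHV", "ACD-FGHI", "TCDEFGHI", "ACDEYGHI"]
    print(subsample_msa(toy_msa, max_depth=3, rng=rng))
    print(mask_columns(toy_msa, fraction=0.25, rng=rng))
```

In this sketch, passing a different `mask_char` (for example "A" or "G") is one simple way to express the idea of masking with different amino acids; each perturbed MSA would then be supplied to the structure predictor in a separate run.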