In Silico Drug Development
This Large Storage application is co-submitted with the Large Compute project application for the period Jan 1 - Dec 31, 2022, and is a continuation of Large Storage 2020/4-5 and SNIC 2020/6-286 (NSC Center storage).
Computational chemistry plays an important role in the development of new pharmaceuticals. With high-performance computing clusters coupled to the latest developments in algorithms and software, we are able to screen vast libraries of compounds in search of new drug candidates, create in silico models of target proteins, and explore protein-protein interactions that are crucial for, e.g., signaling pathways in cancer cells or toxin mechanisms of action.
We focus our studies on several multi-protein complexes as targets for cancer therapy, in order to identify small-molecule inhibitors able to block modes of action or signaling pathways. These include the multimeric UPRosome, comprising two dimers of the activated IRE1 receptor, the mRNA substrate, the RtcB ligase, and the phosphorylating/dephosphorylating enzymes ABL and PTP1B, as well as several death-inducing or death-effector complexes such as the apoptosome, the stressosome, and the necrosome. We also explore the mechanism of, and identify possible small-molecule binders to, the heavily upregulated MTHFD2, which plays a key role in cancer cell drug resistance, and the small peptide AGR2, which recent results have shown to be an inducer of tumorigenesis.
We follow well-established protocols for these studies, involving protein preparation (homology modelling if needed) and protein-protein docking calculations according to a recent ‘consensus structure’ protocol developed in our group, followed by replica MD simulations to determine stabilities and key interactions. The MD simulations are normally 500-1000 ns each and performed in triplicate, which places high demands on HPC resources. In the drug development projects, we perform systematic docking of large databases (up to 1 billion compounds), refined docking of top-ranked ligands, and detailed BPMD and MD simulations of the resulting complexes, followed by additional hit-to-lead optimizations. The size of the compound libraries we use requires massively parallel execution. We are currently extending our work into machine-learning-based de novo drug discovery, with very promising initial data.
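To make the scale of the MD protocol concrete, the following minimal sketch works out the aggregate simulated time per system from the figures stated above (three replicas of 500-1000 ns each). The function and variable names are illustrative only and are not part of any workflow described here.

```python
# Figures taken from the text: simulations run in triplicate, 500-1000 ns each.
REPLICAS = 3
NS_PER_REPLICA_RANGE = (500, 1000)


def aggregate_sampling_us(replicas, ns_per_replica):
    """Total simulated time for one system, converted from ns to microseconds."""
    return replicas * ns_per_replica / 1000.0


low, high = (aggregate_sampling_us(REPLICAS, ns) for ns in NS_PER_REPLICA_RANGE)
print(f"Aggregate sampling per system: {low:.1f}-{high:.1f} microseconds")
```

Each protein complex studied therefore corresponds to 1.5-3.0 microseconds of simulated time in total.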
The size and extent of the simulations, and the amount of data processed in the screening campaigns, justify the resources applied for. We therefore ask to increase the allocation to 400 TB / 2 000 000 files on Klemming@PDC, 50 TB / 1 000 000 files on Cephyr@C3SE, and 200 TB / 2 000 000 files on NSC Center storage.
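As a quick check of the overall request, the three allocations above can be summed as in the short sketch below. The dictionary layout and variable names are illustrative assumptions; only the numbers come from the request itself.

```python
# Requested allocation per storage resource: (capacity in TB, number of files).
requests = {
    "Klemming@PDC": (400, 2_000_000),
    "Cephyr@C3SE": (50, 1_000_000),
    "NSC Center storage": (200, 2_000_000),
}

# Sum capacity and file counts across all three resources.
total_tb = sum(tb for tb, _ in requests.values())
total_files = sum(files for _, files in requests.values())

print(f"Total requested: {total_tb} TB, {total_files:,} files")
```

In total this corresponds to 650 TB and 5 000 000 files across the three centres.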