A transformer for prediction of MS2 spectrum intensities

Title:	A transformer for prediction of MS2 spectrum intensities
DNr:	Berzelius-2021-46
Project Type:	LiU Berzelius
Principal Investigator:	Lukas Käll <lukas.kall@scilifelab.se>
Affiliation:	Kungliga Tekniska högskolan
Duration:	2021-09-17 – 2021-12-01
Classification:	10203
Homepage:	http://kaell.org
Keywords:

Abstract

Machine learning has long been an integral part of interpreting data from mass spectrometry-based proteomics. Relatively recently, a machine-learning architecture that has been successfully employed in other areas of bioinformatics appeared: the Transformer. One of the key properties of Transformers is that they enable so-called transfer learning, i.e. adapting networks trained for other tasks to new functionality with relatively few training examples. Here, we implemented a Transformer based on the pre-trained model TAPE for the task of predicting MS2 intensities. TAPE is a general model trained to predict missing residues in protein sequences. Although TAPE was trained for a different task, we could modify its behavior by adding a prediction head at the end of the model and training it on the spectrum intensities from the training set of the well-known predictor Prosit. We demonstrate that the resulting predictor, which we call Prosit-Transformer, outperforms the recurrent neural network-based predictor Prosit, increasing the median angular similarity on Prosit's hold-out set from 0.908 to 0.923. However, to further improve the results, we need better GPU performance to shorten our weeks-long training cycle. We believe that Transformers will significantly increase prediction accuracy for other types of predictions within mass spectrometry-based proteomics, particularly predictions that use amino acid sequences as input.
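
The angular similarity used above to compare predicted and observed intensities can be illustrated with a minimal sketch. It assumes the common normalized spectral angle definition, 1 - 2*arccos(cosine similarity)/pi, applied to two equal-length vectors of non-negative fragment intensities. The function name angular_similarity and the choice to return 0.0 for an all-zero vector are assumptions made for illustration, not details taken from the project.

```python
import math


def angular_similarity(predicted, observed):
    """Normalized spectral angle between two intensity vectors.

    Assumed definition: 1 - 2 * arccos(cosine similarity) / pi.
    For non-negative intensities the result lies in [0, 1].
    """
    if len(predicted) != len(observed):
        raise ValueError("intensity vectors must have the same length")

    dot = sum(p * o for p, o in zip(predicted, observed))
    norm_p = math.sqrt(sum(p * p for p in predicted))
    norm_o = math.sqrt(sum(o * o for o in observed))

    # Assumption: an all-zero spectrum carries no information, so score it 0.
    if norm_p == 0.0 or norm_o == 0.0:
        return 0.0

    # Clamp to guard against rounding pushing the cosine outside [-1, 1].
    cosine = max(-1.0, min(1.0, dot / (norm_p * norm_o)))
    return 1.0 - 2.0 * math.acos(cosine) / math.pi
```

Under this definition the score depends only on the relative shape of the two spectra, not on their absolute scale. Two spectra with no shared peaks score 0, and identical spectra score 1.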

National Supercomputer Centre at Linköping University
