Scalable Data analysis of Mass spectrometry for Proteomics of Human Cell Lines
Title: Scalable Data analysis of Mass spectrometry for Proteomics of Human Cell Lines
DNr: NAISS 2024/6-415
Project Type: NAISS Medium Storage
Principal Investigator: Fredrik Edfors <edfors@kth.se>
Affiliation: Kungliga Tekniska högskolan
Duration: 2024-12-17 – 2025-10-01
Classification: 10203
Keywords:

Abstract

Mass spectrometry (MS) is a state-of-the-art technique that allows unbiased, high-throughput detection of many biomolecules, especially proteins. With current advances in MS, a tremendous amount of data is being generated worldwide, and most of it is already deposited with open data consortia such as ProteomeXchange. However, there is no truly generic tool that can process MS data all the way from the instrument to the final analysis. Most proteomics scientists use whichever tool is most convenient, which is usually vendor software. This can introduce bias into comparisons and leave data processing steps opaque. nf-core/quantms was developed to address these issues and is built on Nextflow. It runs several steps of MS data analysis from a single command and requires only a Sample and Data Relationship Format (SDRF) file in order to run. SDRF addresses several serious issues in mass spectrometry research, because the file itself requires systematic annotation of experiments and analysis runs. Human cell lines are the most common models for studying physiological responses in the human system, so it is important to use MS to characterize their biological responses at the protein level. Understanding and reproducibly processing MS datasets from human cell lines allows us to select the right models for a study. This can reduce cost and help identify the optimal model for a given research question.
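
To illustrate the kind of systematic annotation an SDRF file carries, the following is a minimal sketch in Python rather than part of the project's own tooling. It assumes a tab-separated SDRF fragment with the SDRF-Proteomics column names `characteristics[cell line]` and `comment[data file]`. The sample names, cell lines, and file names are invented for the example, and the helper `files_by_cell_line` is hypothetical.

```python
import csv
import io
from collections import defaultdict

# Hypothetical SDRF fragment: all sample names, cell lines, and file names
# below are invented for illustration and do not come from the project.
SDRF_TEXT = (
    "source name\tcharacteristics[organism]\tcharacteristics[cell line]"
    "\tassay name\tcomment[data file]\n"
    "sample 1\tHomo sapiens\tHeLa\trun 1\thela_rep1.raw\n"
    "sample 2\tHomo sapiens\tHeLa\trun 2\thela_rep2.raw\n"
    "sample 3\tHomo sapiens\tHEK293\trun 3\thek293_rep1.raw\n"
)


def files_by_cell_line(sdrf_text):
    """Group raw data files by the cell line annotated for each run."""
    # SDRF is a tab-delimited table with one row per sample-to-file link.
    reader = csv.DictReader(io.StringIO(sdrf_text), delimiter="\t")
    groups = defaultdict(list)
    for row in reader:
        cell_line = row["characteristics[cell line]"]
        groups[cell_line].append(row["comment[data file]"])
    return dict(groups)


if __name__ == "__main__":
    for cell_line, files in files_by_cell_line(SDRF_TEXT).items():
        print(cell_line, files)
```

Because each row ties one sample to one data file, a pipeline can work out which runs belong to which cell line from the file alone, without separate manual bookkeeping.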