Title: Applied bioinformatics for Medbioinfo graduate school
DNr: NAISS 2024/22-540
Project Type: NAISS Small Compute
Principal Investigator: Gabriele Pozzati
Affiliation: Stockholms universitet
Duration: 2024-04-11 – 2024-07-01
Classification: 10203


This project will provide computational resources to students taking the Applied Bioinformatics course at the MedBioinfo graduate school. The course aims to help participants acquire the transversal skills required to extract meaningful insight from datasets in the era of high-throughput life science technologies. Data analysis has become such a bottleneck that these skills are in high demand. Students will gain first-hand experience with the tools used to tackle current challenges in reproducible science, big data projects, and data mining on high-performance compute farms. Rather than being presented in isolation, all of the tools introduced will be applied in a comprehensive case study of a real-world dataset: 41 Gbp of raw Illumina sequence reads (both metagenomes and metatranscriptomes) from 125 human oral swabs, taken from patients with positive or negative COVID PCR results and published in December 2021. Each course participant will be responsible for analyzing a subset of the 250 available FASTQ files. Students will build and run an analysis pipeline to detect RNA or DNA in the swabs that corresponds to known pathogens (bacterial and viral).