AI-driven Discovery and Classification of Glycosylation Machinery in Bacteroidota
Title: AI-driven Discovery and Classification of Glycosylation Machinery in Bacteroidota
DNr: Berzelius-2026-149
Project Type: LiU Berzelius
Principal Investigator: Andre Mateus <andre.mateus@umu.se>
Affiliation: Umeå universitet
Duration: 2026-04-29 – 2026-11-01
Classification: 10601
Homepage: https://mateuslab.com/
Keywords:

Abstract

The phylum Bacteroidota includes numerous species that are central to the human gut microbiome, as well as many important pathogens. A unique and conserved feature of these bacteria is a general protein glycosylation system that modifies a large fraction of extracytoplasmic proteins. Although this system appears to be essential and broadly conserved across the phylum, many core components of the machinery remain unidentified. We will use AI/ML methods as the primary tool to systematically identify and characterize these components across Bacteroidota. In particular, we will combine AlphaFold structure prediction with AI-based clustering, large-scale functional annotation, and comparative structural analysis to discover and classify candidates. A previous allocation generated a promising candidate oligosaccharyltransferase (OTase); we will follow up on this candidate and extend the same approaches to systematically identify additional components of the glycosylation machinery. By integrating these methods at the scale of the entire phylum, we aim to uncover conserved and divergent features that would be difficult to detect using conventional approaches alone.

In parallel, we will apply the same AI/ML-driven framework to the discovery and classification of glycosyltransferases (GTs), combining AlphaFold-based structural comparisons with state-of-the-art AI classifiers, automated annotation methods, and clustering approaches to identify novel GT families and refine functional classification. Building on an existing, broadly representative GT grouping and curated dataset, we will use the requested resources to perform the large-scale structural and AI-driven analyses needed to resolve functional relationships within and across families. Although focused on Bacteroidota, these methods are applicable across kingdoms and biological systems and could improve GT annotation and functional prediction beyond this project.