Identification of novel antibiotic resistance genes using Deep Neural Networks (Mainly Graph Neural Networks)
Title: Identification of novel antibiotic resistance genes using Deep Neural Networks (Mainly Graph Neural Networks)
SNIC Project: Berzelius-2021-37
Project Type: LiU Berzelius
Principal Investigator: Sofiane Ennadir <ennadir@kth.se>
Affiliation: Kungliga Tekniska högskolan
Duration: 2021-09-01 – 2022-03-01
Classification: 30402
Keywords:

Abstract

The rapid emergence of resistant bacteria in clinical settings has endangered modern healthcare, including sepsis treatment, organ transplants, and cancer care [1]. Many bacterial strains carry multiple antibiotic resistance genes (ARGs) that are mobile and confer resistance to a broad range of antibiotics. Early identification of ARGs that have the potential to become clinically relevant, or that have already accumulated in pathogens, is valuable: it could facilitate surveillance through detection and confinement of ARGs, improve molecular diagnostics, guide the choice of antibiotic treatment, and inform drug discovery efforts.

In the last few years, deep learning has delivered substantial performance improvements across a broad spectrum of problems. These methods have also been applied extensively in biology, in particular to predicting the 3D structure and function of proteins [4]. Such predictions make it possible to study a comprehensive dataset of protein structures, including those encoded by ARGs and their sensitive homologs, that would be hard to collect with experimental methods. In this study, we hypothesize that deep neural networks can use 3D structural motifs in proteins to identify novel ARGs that are distantly related to known ARGs but structurally highly similar to them.

The method will be designed to be applicable to both genomic and metagenomic data. The method and the results will be published in peer-reviewed journals. To make them findable and accessible, the scripts and the trained models will be deposited in a public repository such as GitHub. Moreover, we will make our methods run on various platforms by using Docker containers and libraries that are compatible with different hardware configurations, e.g., servers without GPUs. We also aim to embed the analysis pipeline in a user-friendly web application that processes input genomic and metagenomic data.