Deep learning models for modelling genetic variation
Title: Deep learning models for modelling genetic variation
DNr: Berzelius-2024-360
Project Type: LiU Berzelius
Principal Investigator: Carl Nettelblad <carl.nettelblad@it.uu.se>
Affiliation: Uppsala universitet
Duration: 2024-10-01 – 2025-04-01
Classification: 40402
Homepage: https://github.com/kausmees/GenoCAE
Keywords:

Abstract

We are developing deep learning models based on autoencoder architectures for modelling genetic variation and for predicting traits of economic importance in plant and animal breeding. Our deep learning genetics model was recently accepted for publication in the genetics journal G3. Only a few models have so far been successful at modelling full genomes, ours being one of them. We are currently exploring contrastive learning and other recent self-supervised approaches for producing low-dimensional embeddings of this kind of data. We also still hope to start evaluating diffusion-based models in this context. Another new direction we are considering is the use of deep learning embedding techniques for genome assembly tasks. Initial attempts are highly promising, and very little has been published along these lines.