Multimodal Graph Generative models for the Biomedical domain
DNr: Berzelius-2025-390
Project Type: LiU Berzelius
Principal Investigator: Michail Vazirgiannis <mvaz@kth.se>
Affiliation: Kungliga Tekniska högskolan
Duration: 2025-12-05 – 2026-07-01
Classification: 10203

Abstract

Building on Prot2Text-V2, we extend our research toward Prot2Text-Reasoning, a next-generation model designed to move beyond descriptive prediction toward interpretable biological reasoning. Prot2Text-V2 generates semantically rich and accurate function descriptions, but it does not explicitly model the causal relationships between sequence, structure, and biological activity. Prot2Text-Reasoning addresses this limitation by introducing a dedicated reasoning layer that integrates multimodal protein representations with structured biological knowledge and chain-of-thought inference. By incorporating symbolic ontologies, retrieval-augmented context, and step-wise logical reasoning, the model aims not only to describe a protein’s function but also to explain why and how particular motifs, domains, or interactions lead to that function. This continuation shifts the focus from generative understanding to interpretable reasoning, laying the foundation for protein language models that can assist in hypothesis generation, functional annotation, and knowledge-driven discovery in molecular biology.