Multimodal Graph Generative models for the Biomedical domain
DNr: Berzelius-2024-390
Project Type: LiU Berzelius
Principal Investigator: Michail Vazirgiannis <mvaz@kth.se>
Affiliation: Kungliga Tekniska högskolan
Duration: 2024-10-08 – 2025-05-01
Classification: 10203
Keywords:

Abstract

Graph-structured data contains rich relational information, yet translating it into human-understandable natural language, and natural language back into graphs, remains a challenging task in machine learning. This project aims to bridge this gap with a novel approach, Graph2Text, that generates coherent, descriptive text from graph-structured data and, conversely, graphs from text. Instead of converting graphs directly into sentences, the approach uses a sequential tokenization technique: the graph structure is represented as a sequence of tokens that reflects the information in the adjacency matrix and can be fed into a large language model (LLM). The primary goal of the project is to develop a methodology that transforms complex graph representations into such token sequences while preserving the essential relational information. The resulting sequence will serve as the input to, or the output of, a text generation model, enabling the generation of accurate, human-readable descriptions or summaries of the original graph.
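
To make the idea of sequential tokenization concrete, the following is a minimal sketch of one possible scheme, not the project's actual method. It assumes an undirected, unweighted graph given as a symmetric 0/1 adjacency matrix, and it serializes the lower triangle row by row. The token names (`<bog>`, `<node_i>`, `<edge_j>`, `<eog>`) and the function names `graph_to_tokens` and `tokens_to_graph` are hypothetical and chosen only for illustration.

```python
from typing import List


def graph_to_tokens(adj: List[List[int]]) -> List[str]:
    """Serialize a symmetric 0/1 adjacency matrix into a token sequence.

    Hypothetical scheme: each node i emits <node_i>, followed by one
    <edge_j> token for every earlier node j < i it is connected to.
    """
    tokens = ["<bog>"]  # begin-of-graph marker (assumed vocabulary)
    for i in range(len(adj)):
        tokens.append(f"<node_{i}>")
        for j in range(i):  # lower triangle only: the graph is undirected
            if adj[i][j]:
                tokens.append(f"<edge_{j}>")
    tokens.append("<eog>")  # end-of-graph marker (assumed vocabulary)
    return tokens


def tokens_to_graph(tokens: List[str]) -> List[List[int]]:
    """Rebuild the adjacency matrix from a sequence made by graph_to_tokens."""
    n = sum(1 for t in tokens if t.startswith("<node_"))
    adj = [[0] * n for _ in range(n)]
    current = -1
    for t in tokens:
        if t.startswith("<node_"):
            current = int(t[len("<node_"):-1])
        elif t.startswith("<edge_"):
            j = int(t[len("<edge_"):-1])
            adj[current][j] = adj[j][current] = 1  # restore symmetry
    return adj


# A triangle (0-1-2) with one pendant node (3) attached to node 2.
example_adj = [
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
]
example_tokens = graph_to_tokens(example_adj)
print(" ".join(example_tokens))
```

Because every edge is recorded exactly once, the sequence can be decoded back into the same adjacency matrix, which is the sense in which the relational information is preserved here. In a real system these tokens would additionally have to be mapped into the vocabulary of the LLM; that step is outside this sketch.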