Adaptive and reconstructive computer vision models
Title: Adaptive and reconstructive computer vision models
DNr: Berzelius-2024-118
Project Type: LiU Berzelius
Principal Investigator: Hedvig Kjellström <hedvig@kth.se>
Affiliation: Kungliga Tekniska högskolan
Duration: 2024-03-21 – 2024-10-01
Classification: 10207
Keywords:

Abstract

The aim of this project is to use lightweight low-rank adaptation (LoRA) of pre-trained vision models to fine-tune an open-vocabulary semantic segmentation model for different domains, and to plug in one of several available LoRA modules as a way of adapting a foundation model to a specific domain. More specifically, the idea is to use a pre-trained image encoder to create clusters of semantically and visually related images. Once we have enough data in a certain domain, we want to create a LoRA for that specific cluster (a target domain). Once created, these LoRAs can be reused whenever an input image falls in (or close to) one of the defined clusters. The hypothesis is that this makes the segmentation model more resilient to domain shift. The project has been prepared over the past few months through a comprehensive literature study and initial experimentation, which have resulted in an experimental pipeline design. We are now ready to run experiments, and this proposal requests the infrastructure for them.
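
As a rough illustration of the idea in the abstract, the sketch below shows the two ingredients in plain Python: the standard LoRA weight update W' = W + (alpha / r) * B A, and a router that picks the adapter of the nearest cluster centroid. It is a minimal sketch under stated assumptions, not the project's pipeline: the function names, the Euclidean distance, the `max_distance` threshold (one possible reading of "falls in or close to" a cluster), and the domain names "aerial" and "medical" are all hypothetical.

```python
import math


def lora_update(W, A, B, alpha):
    """Return W + (alpha / r) * (B @ A) for matrices given as nested lists.

    W is d x k (frozen pre-trained weight), B is d x r, A is r x k.
    """
    r = len(A)  # LoRA rank
    d, k = len(W), len(W[0])
    scale = alpha / r
    return [
        [W[i][j] + scale * sum(B[i][t] * A[t][j] for t in range(r))
         for j in range(k)]
        for i in range(d)
    ]


def select_adapter(embedding, centroids, max_distance):
    """Return the name of the nearest cluster's adapter.

    Returns None when no centroid lies within max_distance, meaning the
    caller should fall back to the unadapted base model (an assumption).
    """
    best_name, best_dist = None, math.inf
    for name, centroid in centroids.items():
        dist = math.dist(embedding, centroid)  # Euclidean distance
        if dist < best_dist:
            best_name, best_dist = name, dist
    return best_name if best_dist <= max_distance else None


# Hypothetical cluster centroids in a 2-D embedding space.
centroids = {"aerial": [1.0, 0.0], "medical": [0.0, 1.0]}

near = select_adapter([0.9, 0.1], centroids, max_distance=0.5)
far = select_adapter([5.0, 5.0], centroids, max_distance=0.5)

# Tiny rank-1 update on a 2 x 2 identity weight.
W_adapted = lora_update(
    W=[[1.0, 0.0], [0.0, 1.0]],
    A=[[1.0, 2.0]],
    B=[[1.0], [0.0]],
    alpha=2.0,
)
```

In a real system the embeddings would come from the pre-trained image encoder and the centroids from the clustering step; the threshold would need to be tuned per cluster.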