Representation Learning Methods in Computer Vision
Title: Representation Learning Methods in Computer Vision
DNr: Berzelius-2025-236
Project Type: LiU Berzelius
Principal Investigator: Atsuto Maki <atsuto@kth.se>
Affiliation: Kungliga Tekniska högskolan
Duration: 2025-08-04 – 2026-03-01
Classification: 20208
Keywords:

Abstract

This project is a continuation of Berzelius-2025-179 and Berzelius-2025-25, both previously led by the same principal investigator, Atsuto Maki, and follows the recommendation to merge the two efforts. Representation learning is a key component of modern computer vision, enabling models to automatically learn rich, high-level features from raw data. In this project, we aim to advance representation learning techniques in computer vision in the following areas:

1. Football Tracking Data Understanding

This part aims to develop foundation models for football understanding by advancing self-supervised representation learning on multi-player tracking data, capturing the spatial and temporal dynamics of the 22 players and the ball. In the current stage, we focus on learning representations of individual player motion through tasks such as joint motion forecasting. These representations will be evaluated on downstream tasks including player motion prediction, shot spotting, and action recognition. Due to the scale and complexity of the data, substantial computational resources are required for model training and experiments. This part is related to EA Sports TRACAB and KTH, and is partially supported by the Wallenberg AI, Autonomous Systems and Software Program (WASP) funded by the Marianne and Marcus Wallenberg Foundation.

2. Utilizing Synthetic Data for Quality Inspection in Manufacturing

This part addresses the shortage of annotated data in manufacturing quality inspection by training deep learning-based computer vision models entirely on synthetic data generated from CAD models, with a focus on learning robust object representations that bridge the domain gap and perform reliably in the real world. In the current stage, we are extending synthetic active learning to more complicated manufacturing environments.
This requires generating synthetic data for various manufacturing use cases, training object detection models, and studying the effect of factors such as texture, object size, pre-processing, and rendering methods. This part is a collaboration between Scania CV AB and KTH. It is partially supported by the Wallenberg AI, Autonomous Systems and Software Program (WASP) funded by the Marianne and Marcus Wallenberg Foundation.

3. Deep Continual Learning with Imbalanced Data for Intelligent Systems

This part focuses on developing novel continual learning methodologies while addressing the challenge of class imbalance in large-scale, evolving datasets. The core objective is to create robust representation learning techniques that can learn from a continuous stream of data in a task-agnostic manner, without forgetting previously learned knowledge. The research will primarily target computer vision but will also extend to natural language processing (NLP) to explore the applicability of these methods across modalities.

4. Steerable Latent World Models

This part aims to advance steerable latent world models, that is, learned world models that can be used for planning in latent space. In the current stage, we will evaluate several design aspects of our proposed approach, in particular how the latent transformation distributions are modeled (e.g., VAEs, discrete distributions, energy-based models, and continuous flows). Additionally, we will investigate different conditioning approaches. Finally, extensive ablations will be conducted to assess the impact of key hyperparameters.