Representation Learning Methods in Computer Vision
| Title: | Representation Learning Methods in Computer Vision |
| DNr: | Berzelius-2026-75 |
| Project Type: | LiU Berzelius |
| Principal Investigator: | Atsuto Maki <atsuto@kth.se> |
| Affiliation: | Kungliga Tekniska högskolan |
| Duration: | 2026-03-31 – 2026-10-01 |
| Classification: | 20208 |
| Keywords: | |
Abstract
This project is a continuation of Berzelius-2025-236.
Representation learning is a key component in modern computer vision, enabling models to automatically learn rich, high-level features from raw data. In this project, we aim to advance representation learning techniques in computer vision in the following areas:
1. Football Tracking Data Understanding
In the current stage, we focus on learning representations for multi-player, ball-aware scenarios through tasks such as motion prediction. These representations will be evaluated on downstream tasks, including action recognition and possession detection.
2. Deep Continual Learning with Imbalanced Data for Intelligent Systems
Building on the robust representation learning frameworks established in the previous cycle, this work now turns to deep continual learning with imbalanced data. The objective is to develop techniques that learn from continuous streams of data without catastrophic forgetting while explicitly addressing class imbalance. The research will target computer vision benchmarks and may extend to NLP.
3. Steerable Latent World Models
In the current stage, we focus on systematically evaluating our proposed energy-based objective for latent world models, together with the general approach. The aspects under study include theoretically motivated variations of the objective, recipes for the energy-based model (number of steps and other modeling choices), and scaling behavior. Most of the compute for this project will go to pre-training; testing requires comparatively little. We estimate that the ablations will amount to on the order of 300 training runs.
4. Scalable Online Task-Free Continual Learning for Real-Time Adaptation in High-Stakes Dynamic Environments
This project develops a scalable Online Task-Free Continual Learning framework to enable AI models to learn incrementally from streaming data while mitigating catastrophic forgetting. The focus is on high-stakes, dynamic environments (e.g., healthcare), where models must adapt in real time without retraining. We will explore novel architectures to balance stability and plasticity under strict computational constraints, with evaluations on evolving datasets and real-world benchmarks.
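For orientation only, a common baseline in the online task-free setting is experience replay with a fixed-size memory filled by reservoir sampling. The sketch below is a generic illustration of that baseline, not the framework proposed in this project; the class name `ReservoirBuffer`, the capacity of 100, and the integer stream are all hypothetical choices made for the example.

```python
import random


class ReservoirBuffer:
    """Fixed-size replay memory filled by reservoir sampling (Algorithm R).

    Hypothetical helper for illustration; not part of the project's code.
    """

    def __init__(self, capacity, seed=0):
        self.capacity = capacity
        self.items = []
        self.seen = 0  # number of stream items observed so far
        self.rng = random.Random(seed)

    def add(self, item):
        """Offer one stream item; each item seen so far is kept with equal probability."""
        self.seen += 1
        if len(self.items) < self.capacity:
            # Memory not yet full: always store the item.
            self.items.append(item)
        else:
            # Draw a slot index in [0, seen); replace only if it falls inside the memory.
            j = self.rng.randrange(self.seen)
            if j < self.capacity:
                self.items[j] = item

    def sample(self, k):
        """Return up to k stored items, drawn without replacement."""
        return self.rng.sample(self.items, min(k, len(self.items)))


# Usage: integers stand in for (input, label) pairs arriving from a stream.
buffer = ReservoirBuffer(capacity=100)
for step in range(10_000):
    buffer.add(step)
replay_batch = buffer.sample(8)
```

Because the memory never needs task boundaries or a second pass over the data, its cost stays constant per stream item, which is why it is often used as a reference point under strict computational constraints. In a training loop, `replay_batch` would typically be mixed with the incoming mini-batch before each update.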