DeepVision: Deep Learning for Robot Vision
Title: DeepVision: Deep Learning for Robot Vision
DNr: Berzelius-2022-248
Project Type: LiU Berzelius
Principal Investigator: Michael Felsberg
Affiliation: Linköpings universitet
Duration: 2023-01-01 – 2023-07-01
Classification: 10207


This project continues Berzelius-2021-100.

Recently, image representations based on convolutional neural networks (CNNs) have demonstrated significant improvements over the state of the art in many computer vision applications, including image classification, object detection, scene recognition, semantic segmentation, action recognition, and visual tracking. CNNs consist of a series of convolution and pooling operations followed by one or more fully connected (FC) layers. Deep networks are trained on raw image pixels with a fixed input size or on sparse point clouds in a finite volume. These networks require large amounts of labeled training data. The introduction of large datasets (e.g. ImageNet with 14 million images, semantic 3D datasets, and synthetic datasets) and the parallelism enabled by modern GPUs have facilitated the rapid deployment of deep networks for many visual tasks. This development has led to what many peers call the deep learning revolution in computer vision.

CVL is currently working on seven research tasks within the DeepVision project, for which GPU resources are requested:
1. Video object segmentation
2. Detailed semantic description of humans in images and videos
3. Deep learning for large-scale remote sensing scene analysis
4. Probabilistic 3D computation from time-of-flight measurements
5. Algebraically constrained networks for sparse image data
6. Weakly supervised object detection and segmentation
7. Incremental, few-shot, and long-tailed learning of image classification