Cross-dataset generalization for video: are recurrent models less texture biased?
Title: Cross-dataset generalization for video: are recurrent models less texture biased?
SNIC Project: Berzelius-2022-75
Project Type: LiU Berzelius
Principal Investigator: Sofia Broomé <sbroome@kth.se>
Affiliation: Kungliga Tekniska högskolan
Duration: 2022-04-01 – 2022-10-01
Classification: 10207
Keywords:

Abstract

The paper is an empirical investigation and comparison of the cross-dataset generalization abilities of two kinds of video models, recurrent CNNs and 3D CNNs, which are heavy to train. Studying cross-dataset generalization for video models also implies studying their texture bias, which is novel work in the video domain (Geirhos et al., ICLR 2019, did this for single-image data). I am studying the empirical consequences of the inherently different temporal modeling of these two model types.

Importance of project: The topic is novel and important since deep learning for video is far behind deep learning for single images. On a personal level, the project is important because it is the final paper to be included in my thesis next year.

Expected goal fulfilment: There is now an arXiv paper for the work (https://arxiv.org/abs/2112.12175) based on experiments that were run on Berzelius, but I am adding more experiments while working on my monograph thesis. The article is also still in submission, a process that may require further experiments later on.

Software and methods to be used: Python and PyTorch with GPU computation, plus various other conda-installed libraries. For the heavier datasets, I depend on distributed computing (in my case, via PyTorch Lightning), which I have already tested during the Berzelius pilot phase. Ideally, if time allows, I will start to use the Singularity framework for the software environment. Otherwise, I will continue to use conda.
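
As a rough illustration of the cross-dataset comparison described in the abstract, the sketch below computes a model's accuracy on a test set from its training domain and on a test set from a different domain, then reports the difference. It is a minimal sketch under stated assumptions, not the evaluation code used in the paper: the function names `accuracy` and `generalization_drop` are hypothetical, and the label and prediction lists are made-up placeholder values, not experimental results.

```python
from typing import Sequence


def accuracy(predictions: Sequence[int], labels: Sequence[int]) -> float:
    """Fraction of predictions that match the ground-truth labels."""
    if len(predictions) != len(labels) or len(labels) == 0:
        raise ValueError("predictions and labels must be non-empty and equally long")
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


def generalization_drop(in_domain_acc: float, cross_domain_acc: float) -> float:
    """Hypothetical helper: accuracy lost when moving to an unseen dataset."""
    return in_domain_acc - cross_domain_acc


# Placeholder values for illustration only (not real experimental data).
in_domain_labels = [0, 1, 2, 1, 0, 2, 1, 0]
in_domain_preds = [0, 1, 2, 1, 0, 2, 1, 1]
cross_domain_labels = [2, 0, 1, 1, 0, 2, 0, 1]
cross_domain_preds = [2, 0, 0, 1, 2, 2, 1, 0]

in_acc = accuracy(in_domain_preds, in_domain_labels)
cross_acc = accuracy(cross_domain_preds, cross_domain_labels)
drop = generalization_drop(in_acc, cross_acc)

print(f"in-domain accuracy:    {in_acc:.3f}")
print(f"cross-domain accuracy: {cross_acc:.3f}")
print(f"generalization drop:   {drop:.3f}")
```

Under this framing, running the same computation for a recurrent CNN and a 3D CNN trained on the same data would give one drop per architecture to compare. A simple difference in accuracy is only one possible choice of measure.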