Memorization in Deep Networks
||Mårten Björkman <firstname.lastname@example.org>|
||Kungliga Tekniska högskolan|
||2021-09-01 – 2022-03-01|
Deep networks achieve state-of-the-art performance on many real-world tasks. At the same time, modern architectures have enough capacity to perfectly fit, and even shatter, common benchmark datasets.
At present, the mechanisms governing learning and memorization in deep networks are not fully understood. In this work, we exploit the local geometry of convolutional and dense layers to empirically investigate which layers are responsible for memorization.
Importantly, ReLU activations, the most popular non-linearities in feed-forward networks, make it possible to interpret a model through the lens of activation regions and hyperplane arrangements, opening an angle for geometric studies.
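The idea behind activation regions can be illustrated with a minimal sketch (illustrative only, not the project's code): in a ReLU network, every input induces a binary on/off pattern over the hidden units, and all inputs sharing that pattern lie in one activation region, on which the network acts as a single affine map. The tiny random network below is a placeholder, not a trained model.

```python
import numpy as np

rng = np.random.default_rng(0)

# A small random one-hidden-layer ReLU network (arbitrary placeholder weights).
W1, b1 = rng.standard_normal((4, 2)), rng.standard_normal(4)
W2, b2 = rng.standard_normal((3, 4)), rng.standard_normal(3)

def forward(x):
    """Plain forward pass: ReLU hidden layer, linear output."""
    h1 = np.maximum(W1 @ x + b1, 0.0)
    return W2 @ h1 + b2

def activation_pattern(x):
    """Binary pattern of which hidden ReLU units fire for input x."""
    return (W1 @ x + b1) > 0

def affine_map_at(x):
    """The affine map (A, c) the network implements on x's activation region.

    With D = diag(pattern), the hidden layer equals D @ (W1 x + b1) inside
    the region, so the whole network collapses to A x + c there.
    """
    D = np.diag(activation_pattern(x).astype(float))
    A = W2 @ D @ W1
    c = W2 @ D @ b1 + b2
    return A, c

x = rng.standard_normal(2)
A, c = affine_map_at(x)
# Inside its activation region, the network is exactly this affine function.
assert np.allclose(forward(x), A @ x + c)
```

Counting or comparing such patterns across inputs is one way geometric studies characterize the piecewise-linear structure that ReLU layers impose.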
In this project, we focus on activation regions to contrast learning with memorization at each individual layer of several trained networks, with the goal of developing a measure of the generality of the features learned by each layer.