Title: High-dimensional entropy estimation with applications to deep learning
DNr: Berzelius-2025-29
Project Type: LiU Berzelius
Principal Investigator: Viktor Nilsson <vikn@kth.se>
Affiliation: Kungliga Tekniska högskolan
Duration: 2025-02-01 – 2025-08-01
Classification: 10106
Keywords:

Abstract

The project's aim is to investigate whether the notion of renormalized mutual information (RMI) is tractable in high dimensions. This problem is closely connected to entropy estimation, which is notoriously difficult in high dimensions. Initial results indicate that for large datasets such as MNIST, features of moderate dimension admit entropy estimation with the differentiable "KNIFE" estimator. We aim for a theoretical and empirical study of whether RMI is a useful measure of compression of input information. More specifically, we seek to understand whether a high RMI implies good transfer-learning performance. This would be established by training models for downstream tasks on features spanning a wide range of RMI scores. Ideally, we will also develop ways to optimize (maximize) RMI in high dimensions, although this may prove too difficult.

Meeting this goal would establish RMI as a useful measure of information compression in feature learning. That would warrant further research into estimating it efficiently, as well as into using it as an optimization objective for unsupervised feature learning or as a regularizer. It would also yield a new way to analyze the learning dynamics of common training algorithms such as SGD and Adam. Theoretically, it would be interesting to further develop the relation between mutual information and renormalized mutual information. The impact of dimensionality must also be investigated; for instance, it is not known whether or how RMI can be compared across different dimensions. So far, the project has yielded an original entropy estimation method called REMEDI [Nilsson et al. 2025], whose article was accepted at ICML 2024.
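To illustrate the kind of entropy estimation involved: KNIFE-style estimators fit a differentiable density surrogate q to samples and use the cross-entropy -E[log q(X)], which upper-bounds the differential entropy h(X). The following is a minimal NumPy sketch of that principle with a single fitted Gaussian (a deliberate simplification; the actual KNIFE estimator uses a trainable Gaussian mixture, and REMEDI refines the surrogate further):

```python
import numpy as np

def gaussian_cross_entropy_estimate(x):
    """Fit a Gaussian q to samples x (n, d) and return -mean(log q(x)),
    a cross-entropy estimate that upper-bounds h(X) in expectation."""
    n, d = x.shape
    mu = x.mean(axis=0)
    cov = np.cov(x, rowvar=False) + 1e-6 * np.eye(d)  # regularize for stability
    sign, logdet = np.linalg.slogdet(cov)
    diff = x - mu
    # Per-sample Mahalanobis distance diff^T cov^{-1} diff
    mah = np.einsum('ij,jk,ik->i', diff, np.linalg.inv(cov), diff)
    logq = -0.5 * (d * np.log(2 * np.pi) + logdet + mah)
    return -logq.mean()

rng = np.random.default_rng(0)
d = 8
x = rng.standard_normal((5000, d))          # samples from N(0, I_d)
est = gaussian_cross_entropy_estimate(x)
true_h = 0.5 * d * np.log(2 * np.pi * np.e)  # exact entropy of N(0, I_d)
# est ≈ true_h, since the Gaussian surrogate is well-specified here
```

For a Gaussian source the surrogate family contains the true density, so the bound is tight; for non-Gaussian features the gap is the KL divergence between the data distribution and the fitted surrogate, which is what richer (mixture) surrogates reduce.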
The next phase of the project is to further explore the applicability of the REMEDI method to the original problem of RMI estimation, and to apply the method to other problems in machine learning, such as rectified flows (also known as flow matching), where we have discovered a potentially fruitful application.

Clarification: Recent experiments concerning rectified flows were not carried out on the Berzelius cluster. Some were run by collaborators at Argonne National Laboratory on their own cluster, Swing, while others were run by me locally or on Google Colab. I have personally not done much large-scale experimental ML work since the ICML submission, mostly due to other institutional/departmental commitments such as lecturing; as a result, the allocation on Berzelius has seen minimal use during the last six months. During the last week, in the lead-up to the ICML 2025 and UAI 2025 deadlines, I have carried out several large experiments requiring A100s, but had to fall back on Google Colab instead of Berzelius due to the recent maintenance window. I have now started executing these experiments on Berzelius and hope to keep using it from February onwards. My collaborators and I are aiming for submissions to UAI 2025 and NeurIPS 2025, so computational resources will be required during the coming six months.
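For context on the rectified-flow setting mentioned above, the standard conditional flow-matching objective regresses a velocity field v(x_t, t) onto the constant displacement x1 - x0 along straight interpolation paths. A minimal NumPy sketch of that objective on a toy coupling (illustrative only; not the project's actual experiments, where a neural network would play the role of v_fn):

```python
import numpy as np

def flow_matching_loss(v_fn, x0, x1, t):
    """Conditional flow-matching loss on straight paths
    x_t = (1 - t) * x0 + t * x1 with target velocity x1 - x0."""
    xt = (1 - t)[:, None] * x0 + t[:, None] * x1
    target = x1 - x0
    pred = v_fn(xt, t)
    return np.mean((pred - target) ** 2)

rng = np.random.default_rng(1)
m = 3.0
x0 = rng.standard_normal((1000, 2))  # source samples
x1 = x0 + m                          # paired targets shifted by m
t = rng.uniform(size=1000)           # random interpolation times

# For this coupling the displacement is the constant m everywhere,
# so the constant velocity field attains (essentially) zero loss.
loss = flow_matching_loss(lambda xt, t: np.full_like(xt, m), x0, x1, t)
```

With independent (unpaired) couplings the regression target is no longer deterministic given x_t, and the learned field averages over couplings; this is where density and entropy considerations of the kind REMEDI addresses enter the picture.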