Graph-based, spatial and temporal machine learning
Title: Graph-based, spatial and temporal machine learning
DNr: Berzelius-2023-76
Project Type: LiU Berzelius
Principal Investigator: Fredrik Lindsten <fredrik.lindsten@liu.se>
Affiliation: Linköpings universitet
Duration: 2023-04-01 – 2023-10-01
Classification: 10201
Keywords:

Abstract

This is a joint proposal for five separate projects, which are outlined below.

Distilling large GNNs for Materials Science

Graph neural networks (GNNs) have recently shown very promising results in materials science. For example, they can be used to predict the energies and forces of atomic systems, and large GNNs like GemNet [1] currently set the state of the art on various datasets. Although models like GemNet achieve very good predictive performance, they are computationally expensive, which limits their use for long simulations and large systems. An initial goal for this project is therefore to improve the predictive performance of computationally cheap GNNs like PaiNN [2] using so-called knowledge distillation. This could pave the way for models that are both computationally cheap and accurate. This is an ongoing project, and we have promising preliminary results that we plan to improve further and submit to a top machine learning conference during 2023.

[1] Gasteiger, Becker, Günnemann. GemNet: Universal Directional Graph Neural Networks for Molecules. NeurIPS 2021.
[2] Schütt, Unke, Gastegger. Equivariant message passing for the prediction of tensorial properties and molecular spectra. ICML 2021.

Graph-based Deep Weather Forecasting

In this project we will investigate the use of graph neural networks (GNNs) for numerical weather prediction (NWP). Recent works [1, 2] have shown that GNN-based deep learning models can learn to approximate NWP systems with high accuracy while producing predictions in a fraction of the time used by the original system. When incorporating real observations, the deep learning models can even surpass the accuracy of the original NWP systems. We will work within an existing collaboration with SMHI to see how these methods can be applied to weather data in the Nordic region. As extensions, we will also work to improve the underlying machine learning methods.
These extensions include more refined modelling of the temporal dependencies and improved methods for quantifying the uncertainty in predictions.

[1] Lam et al. GraphCast: Learning skillful medium-range global weather forecasting. Preprint, 2022.
[2] Keisler. Forecasting Global Weather with Graph Neural Networks. Preprint, 2022.

Non-conjugate Deep GMRFs

In our previous work [1] we showed how the model class of Deep Gaussian Markov random fields (Deep GMRFs) [2] can be applied to general graph-structured data, capturing uncertainty through Bayesian inference. The approach is, however, limited to real-valued prediction targets with additive Gaussian noise. In this project we aim to enable the use of Deep GMRF models for new problems and new types of data. To achieve this, we will apply new methods for Bayesian inference, which can in turn enable new model extensions. These extensions can make the model more flexible and allow it to represent more types of data.

[1] Oskarsson et al. Scalable Deep Gaussian Markov Random Fields for General Graphs. ICML 2022.
[2] Sidén and Lindsten. Deep Gaussian Markov Random Fields. ICML 2020.

Neural Kalman Filters

Temporal data are ubiquitous across disciplines and reflect the underlying (unobserved) dynamics of systems. Two modalities have traditionally been employed for the problem of dynamics learning: simulation-based (probabilistic) inference and machine-learning-based (data-driven) inference. Both modalities, however, are limited. For the former to be most accurate, learning must rely on a (probabilistic) model that faithfully represents the complex true dynamics of the system, which can be a formidable or impractical task. For the latter to be most accurate, learning may require a very large sample of data, so that edge cases of the true dynamics are sufficiently and proportionally represented, and such data can likewise be difficult to obtain. Recent research [1-2] explores the merits of methods that combine principles of both these modalities.
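
As a point of reference for the model-based modality described above, the following minimal sketch shows one predict/update cycle of a classical linear-Gaussian Kalman filter. It is not the method proposed in this project. The constant-velocity model and all matrices (A, Q, H, R) are illustrative assumptions chosen only for the example.

```python
import numpy as np


def kalman_step(mean, cov, y, A, Q, H, R):
    """One predict/update cycle of a linear-Gaussian Kalman filter."""
    # Predict: propagate the state estimate through the assumed dynamics.
    mean_pred = A @ mean
    cov_pred = A @ cov @ A.T + Q

    # Update: correct the prediction with the new observation y.
    S = H @ cov_pred @ H.T + R              # innovation covariance
    K = np.linalg.solve(S, H @ cov_pred).T  # Kalman gain: cov_pred H^T S^{-1}
    mean_new = mean_pred + K @ (y - H @ mean_pred)
    cov_new = (np.eye(len(mean)) - K @ H) @ cov_pred
    return mean_new, cov_new


# Hypothetical constant-velocity model: state = (position, velocity),
# and only a noisy position is observed. All values are assumptions.
A = np.array([[1.0, 1.0], [0.0, 1.0]])  # dynamics
Q = 0.01 * np.eye(2)                    # process noise covariance
H = np.array([[1.0, 0.0]])              # observation model
R = np.array([[0.5]])                   # observation noise covariance

rng = np.random.default_rng(0)
mean, cov = np.zeros(2), np.eye(2)
true_state = np.array([0.0, 1.0])
for _ in range(20):
    true_state = A @ true_state
    y = H @ true_state + rng.normal(scale=np.sqrt(0.5), size=1)
    mean, cov = kalman_step(mean, cov, y, A, Q, H, R)
```

The filter is only as good as the assumed A, Q, H and R, which illustrates the limitation of the model-based modality: when the true dynamics are complex or unknown, specifying these quantities by hand becomes impractical.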
The benefits of these hybrid, supervised learning methods come at the cost of resource-intensive training of deep, over-parameterised models. Our self-supervised algorithm better exploits the temporal dependencies implicit in the data and is able to deliver better performance with much greater resource efficiency.

[1] Garcia Satorras, Akata, Welling. Combining Generative and Discriminative Models for Hybrid Inference. NeurIPS 2019.
[2] Becker et al. Recurrent Kalman Networks: Factorized Inference in High-Dimensional Deep Feature Spaces. ICML 2019.

Land Cover Classification

Mapping land cover classes and segments from global satellite imagery has proven to be a non-trivial problem. More specifically, modern image classification and segmentation models that are pre-trained on large image corpora and fine-tuned on annotated satellite images fail to account for distribution shifts in the data. For instance, urban land cover in Brasília, Brazil looks very different from that in Paris, France. Recent research on related problems [1-2] uses meta-learning to first obtain a region-agnostic model and then fine-tunes this model with a few annotated examples from a region of interest to improve predictive performance. Meta-learning, however, is well known to be computationally expensive and often involves wasteful training computations. We propose a model that uses a spatial Bayesian prior to account for local data distributions, which lets us dispense with the meta-learning used in this earlier work.

[1] Rußwurm et al. Meta-learning for few-shot land cover classification. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2020.
[2] Tseng, Kerner, Rolnick. TIML: Task-Informed Meta-Learning for Agriculture. arXiv preprint arXiv:2202.02124, 2022.