Graph-based, spatial, temporal, and generative machine learning
Title: Graph-based, spatial, temporal, and generative machine learning
DNr: Berzelius-2025-97
Project Type: LiU Berzelius
Principal Investigator: Fredrik Lindsten <fredrik.lindsten@liu.se>
Affiliation: Linköpings universitet
Duration: 2025-04-01 – 2025-10-01
Classification: 10210
Abstract
This is a joint proposal for 7 separate projects with the same PI including 5 PhD students and one postdoc.
These projects investigate method development related to graph-based, spatio-temporal, and generative machine learning, with important applications such as material science and weather forecasting.
It is a continuation of previous projects that have used Berzelius and have resulted in publications in top-tier AI/ML venues such as NeurIPS, ICLR, and AISTATS, and we aim to continue publishing at top-tier venues within AI/ML. The projects are outlined below.
Probabilistic Machine Learning Weather Modelling:
Machine learning has recently shown promising potential for weather modelling. Traditionally, this has been done with numerical methods based on differential equations, but these are very slow and computationally expensive compared to machine-learning-based systems. In this project we will continue to improve on our previous work [1] in probabilistic limited area modelling. Additionally, we will work on using machine learning for the data assimilation task, which solves the filtering problem to give initial conditions for forecasting models based on observations and previous forecasts. We will also investigate the use of probabilistic machine learning models for ocean modelling, potentially coupling the two systems as is commonly done in numerical weather modelling. This project is funded by WASP through the NEST-project main: Multi-dimensional Alignment and Integration of Physical and Virtual Worlds (https://wasp-sweden.org/multi-dimensional-alignment-and-integration/). We have collaborators from SMHI, Danish Meteorological Institute, University of Helsinki and California Institute of Technology.
[1] Diffusion-LAM: Probabilistic Limited Area Weather Forecasting with Diffusion
Accepted for poster presentation and short 4-page paper at ICLR 2025 Workshop: Tackling Climate Change with Machine Learning. Currently working on camera-ready version. Preprint: Larsson, Erik & Oskarsson, Joel & Landelius, Tomas & Lindsten, Fredrik. (2025). Diffusion-LAM: Probabilistic Limited Area Weather Forecasting with Diffusion. 10.48550/arXiv.2502.07532.
By: Erik Larsson
Learning physical trajectories for flow-based generative models applied to weather forecasting:
Diffusion and flow-based models have demonstrated great success in modelling complex stochastic spatio-temporal problems. The key innovation in these methods is their simulation-free training objective, which regresses a neural network against a user-defined vector field. However, these vector fields are often chosen as linear interpolations of the data and as such do not produce physically meaningful trajectories for the intermediate steps. In this project we aim to build on these ideas to develop new ways of generating probabilistic forecasts that better capture these trajectories. This work is a continuation of work set to appear at ICLR 2025. This project is funded by WASP through the NEST-project main: Multi-dimensional Alignment and Integration of Physical and Virtual Worlds (https://wasp-sweden.org/multi-dimensional-alignment-and-integration/).
By: Martin Andrae
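For reference, the linear-interpolation training target described in this project can be sketched as follows. This is a minimal illustration in plain Python, not the project's implementation: the function names `linear_interpolant` and `flow_matching_loss`, the list-based vectors, and the uniform sampling of the time t are all assumptions made for the example.

```python
import random


def linear_interpolant(x0, x1, t):
    """Point x_t on the straight line from x0 (noise) to x1 (data), and its velocity."""
    x_t = [(1.0 - t) * a + t * b for a, b in zip(x0, x1)]
    u_t = [b - a for a, b in zip(x0, x1)]  # target velocity, constant along the path
    return x_t, u_t


def flow_matching_loss(model, pairs, rng):
    """Monte Carlo estimate of E ||model(x_t, t) - u_t||^2 over (x0, x1) pairs.

    No trajectory is simulated: each term only needs one random time t
    and one evaluation of the model, which is what makes training simulation-free.
    """
    total = 0.0
    for x0, x1 in pairs:
        t = rng.random()  # assumed: t ~ Uniform[0, 1)
        x_t, u_t = linear_interpolant(x0, x1, t)
        v = model(x_t, t)
        total += sum((vi - ui) ** 2 for vi, ui in zip(v, u_t))
    return total / len(pairs)


# Toy usage: a hypothetical "model" that always predicts zero velocity.
rng = random.Random(0)
pairs = [([rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)], [1.0, -1.0]) for _ in range(8)]
print(flow_matching_loss(lambda x, t: [0.0, 0.0], pairs, rng))
```

Because the target velocity x1 - x0 is the same at every t, the intermediate states x_t are simple blends of the two endpoints, which is why they need not correspond to physically meaningful states.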
Posterior sampling for conditional generation with flow-based models:
Conditional generation with generative models is typically performed by explicitly training with class labels or property values, and often produces high-likelihood point estimates that may not accurately approximate the posterior. In this project we take a probabilistic perspective and target asymptotically exact sampling from the posterior using unconditional, domain-specific pre-trained generative models. The methodology builds upon a preliminary study and is set to be demonstrated on a range of conditional generation tasks, from the design of novel materials and molecules to proteins with target properties.
By: Adhithyan Kalaivanan
Discrete diffusion model for generating materials
Our previous work [1], a collaboration with the Theoretical Physics division at Linköping University, developed a new generative method for materials and demonstrated its effectiveness in a proof-of-concept. While the previous work focused on the development of the method, we now aim to put this model to use in exploring the true possibilities of discovering new materials with useful properties, beyond a proof-of-concept. This involves some method development for guiding the model towards materials with desired properties, but also larger-scale experiments such as training on a larger dataset and generating more materials.
By: Filip Ekström Kelvinius and Dong Qian
[1] Ekström Kelvinius, F., Andersson, O. B., Parackal, A. S., Qian, D., Armiento, R., & Lindsten, F. (2025). WyckoffDiff - A Generative Diffusion Model for Crystal Symmetry. arXiv preprint arXiv:2502.06485.
Diffusion model and sequential Monte Carlo for inverse problems
In a previous project [1], we developed a sequential Monte Carlo algorithm to solve a so-called Bayesian inverse problem with a diffusion prior, i.e., we aim to sample from a conditional distribution where the prior is a (pre-trained) generative diffusion model. We are now exploring new application areas for this.
[1] Ekström Kelvinius, F., Zhao, Z., & Lindsten, F. (2025). Solving Linear-Gaussian Bayesian Inverse Problems with Decoupled Diffusion Sequential Monte Carlo. arXiv preprint arXiv:2502.06379.
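To make the target of this project concrete, the linear-Gaussian setting named in the title of [1] can be written as below. The notation is an assumption introduced here for illustration (observation y, forward operator A, noise level σ, and pre-trained diffusion prior p_θ) and is not taken from the proposal text.

```latex
y = A x + \varepsilon, \qquad \varepsilon \sim \mathcal{N}(0, \sigma^2 I),
\qquad
p(x \mid y) \;\propto\; p_\theta(x)\, \mathcal{N}\!\left(y;\, A x,\, \sigma^2 I\right)
```

The conditional distribution p(x | y) is the posterior to be sampled: the likelihood term is known in closed form, while the prior is only available through the diffusion model.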
Development and Evaluation of Neural Limited Area Weather Forecasting Models
Weather forecasting is crucial for society, and many institutes and companies invest large efforts into producing accurate and timely forecasts. It is often of interest to produce forecasts for a specific region, and limited area models are one way to achieve this. As new machine learning techniques are being applied to weather forecasting, there is substantial interest in using these for regional forecasting as well. In this project we are building on our previous pioneering work within the area [1] to develop such neural limited area models applicable to more realistic settings. We have multiple collaborators in the project from SMHI, Danish Meteorological Institute, ETH Zurich and GeoSphere Austria. The project has been ongoing for some time and is currently in its final phase, where we are drafting a publication that we aim to publish in the earth system modeling literature. There are some final large-scale experiments left to run, and the Berzelius resource would be crucial for enabling these.
By: Joel Oskarsson
[1] Oskarsson, J., Landelius, T. and Lindsten, F., 2023. Graph-based Neural Weather Prediction for Limited Area Modeling. NeurIPS 2023 Workshop on Tackling Climate Change with Machine Learning.
Diffusion Models for Monte Carlo Sampling
Diffusion models are a class of generative models known for their state-of-the-art performance across different tasks. Their key idea is to employ a noise diffusion process to gradually transform a complex data distribution into a simpler one. Samples can then be generated by approximating the time-reversal of this forward diffusion process. In this project, we aim to leverage this framework to approximately sample from a given target distribution. Specifically, we consider a coupled system of an SDE and an ODE that governs the samples and their associated weights, with the weights serving to correct potential biases. This setup is closely related to Sequential Monte Carlo.
By: Dong Qian
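The role that weights play in correcting sampling bias can be illustrated with the generic building blocks of Sequential Monte Carlo: weight normalization, effective sample size, and resampling. The sketch below is only an illustration of that connection and does not implement the coupled SDE/ODE system of the project. All function names are hypothetical, and the Gaussian proposal and target in the demo are assumptions chosen so that the log weights have a simple closed form.

```python
import math
import random


def normalize_log_weights(log_w):
    """Turn unnormalized log weights into weights that sum to one (numerically stable)."""
    m = max(log_w)
    w = [math.exp(lw - m) for lw in log_w]
    total = sum(w)
    return [wi / total for wi in w]


def effective_sample_size(w):
    """ESS = 1 / sum(w_i^2); equals the particle count when weights are uniform."""
    return 1.0 / sum(wi * wi for wi in w)


def resample(particles, w, rng):
    """Multinomial resampling: draw particles with probability proportional to weight."""
    return rng.choices(particles, weights=w, k=len(particles))


def weighted_mean(particles, w):
    """Self-normalized importance sampling estimate of the mean."""
    return sum(wi * xi for wi, xi in zip(w, particles))


# Demo (assumed setup): particles drawn from N(0, 1), target N(1, 1).
# log target(x) - log proposal(x) = -(x - 1)^2 / 2 + x^2 / 2 = x - 0.5
rng = random.Random(0)
particles = [rng.gauss(0.0, 1.0) for _ in range(2000)]
w = normalize_log_weights([x - 0.5 for x in particles])
print("ESS:", effective_sample_size(w))
print("weighted mean (estimates the target mean of 1):", weighted_mean(particles, w))
```

The unweighted particles have mean near 0, the proposal mean; the weights are what shift the estimate towards the target.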