Learning Domain Policies for Classical Planning
Title: Learning Domain Policies for Classical Planning
DNr: Berzelius-2023-149
Project Type: LiU Berzelius
Principal Investigator: Simon Ståhlberg <simon.stahlberg@liu.se>
Affiliation: Linköpings universitet
Duration: 2023-05-30 – 2023-12-01
Classification: 10201
Homepage: https://www.ida.liu.se/divisions/aiics/rlpgroup/


Two prominent approaches to solving a problem in classical planning are using heuristic search on the given problem and using a policy tailored to the domain of the given problem. The first approach is often very computationally expensive, whilst the second is very efficient. However, finding a policy that generalises over a domain is frequently computationally infeasible. In other words, policy-based approaches require an expensive learning step whose result is then reused for all problems in the domain. This aspect is also found in most machine learning methods. Currently, policies are often learned using SAT programs that do not scale well. We showed in an earlier project that deep learning can be used successfully to learn both optimal and suboptimal policies for many domains. Additionally, since we used an architecture based on graph neural networks, the expressive power of the trained models is well understood, both theoretically and empirically. We will continue this line of research in this project and investigate whether the Transformer architecture can be used to increase the expressive power of our neural networks and provide additional flexibility. The Transformer architecture also provides better GPU utilization than graph neural networks.