Learning Domain Policies for Classical Planning
Two prominent approaches to solving a classical planning problem are heuristic search on the given problem instance, and applying a policy tailored to the problem's domain. The first approach is often computationally expensive, whereas the second is very efficient at solving time. However, finding a policy that generalises over an entire domain is frequently computationally infeasible: policy-based approaches require an expensive learning step whose result is then reused for all problems in the domain. This train-once, reuse-often pattern is also characteristic of most machine learning methods. Currently, policies are often learned with SAT-based programs that do not scale well. In this project we explore how to use deep learning to learn policies for many domains efficiently. More specifically, we take a closer look at graph neural networks defined over the first-order predicates of the given domain. This approach has several advantages: it offers better interpretability and enables formal analysis using counting logic. The project can also be viewed as bridging the gap between model-free and model-based AI research. We aim to first train the networks in a supervised fashion, and later in an unsupervised fashion by integrating the learning process tightly with an existing classical planner.
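To make the idea of a graph neural network over first-order predicates concrete, the following is a minimal sketch, not the project's actual architecture: a planning state is a set of ground atoms, objects become graph nodes, and one message-passing round updates each object's embedding from the objects it co-occurs with in some atom. All function names, the scalar embeddings, and the mean-aggregation rule here are illustrative assumptions.

```python
# Illustrative sketch (hypothetical, not the project's architecture):
# one message-passing round over the object nodes of a planning state,
# with neighbourhoods induced by the ground atoms the objects share.
from collections import defaultdict


def message_passing_round(objects, atoms, embed, msg_weight=0.5):
    """Update each object's embedding with the mean embedding of the
    objects it appears together with in at least one ground atom."""
    neighbours = defaultdict(set)
    for _pred, args in atoms:          # e.g. ("on", ("a", "b"))
        for a in args:
            for b in args:
                if a != b:
                    neighbours[a].add(b)
    new_embed = {}
    for obj in objects:
        if neighbours[obj]:
            mean = sum(embed[n] for n in neighbours[obj]) / len(neighbours[obj])
        else:
            mean = 0.0                 # isolated object: no incoming message
        new_embed[obj] = embed[obj] + msg_weight * mean
    return new_embed


# Tiny blocks-world-like state: on(a, b) and clear(a).
objects = ["a", "b"]
atoms = [("on", ("a", "b")), ("clear", ("a",))]
embed = {"a": 1.0, "b": 3.0}
print(message_passing_round(objects, atoms, embed))  # → {'a': 2.5, 'b': 3.5}
```

Because the update is defined per predicate and object rather than per problem instance, the same network can in principle be applied to any state of any problem in the domain, which is what allows the learned policy to generalise.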