High-Dimensional Bayesian Optimization
||Leonard Papenmeier <email@example.com>|
||2023-02-08 – 2023-09-01|
Bayesian optimization (BO) is a popular technique for optimizing expensive-to-evaluate black-box functions (EBFs), i.e., functions whose values can only be observed at specific points and whose evaluation may take considerable time.
While EBFs are paramount for applications such as neural architecture search (NAS), drug design, robotics, and engineering, those with hundreds or thousands of parameters still pose a challenge: current techniques make strong assumptions about the structure of the function, which might not hold in practice.
This project aims to push the boundaries of high-dimensional BO. More specifically, I am working on a follow-up to BAxUS (1), a recent algorithm that set a new state of the art for high-dimensional BO.
BAxUS is restricted to problems with a continuous search space and does not allow for parallel function evaluations. To turn it into an off-the-shelf solver for practitioners, I aim to support more complex search spaces, such as mixes of continuous, categorical, and ordinal variables, as well as hierarchical search spaces.
Mixed spaces are important for hyperparameter tuning and many engineering problems. Hierarchical spaces are spaces in which the presence of a parameter depends on the values of other parameters. For example, the kernel size of a layer in NAS only needs to be optimized if that layer is a convolutional layer and not, for example, a fully-connected layer.
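A hierarchical space of this kind can be sketched as a set of parameters with activation conditions. The sketch below is purely illustrative; the parameter names (`layer_type`, `kernel_size`, `width`) and the dictionary format are hypothetical and do not reflect BAxUS's actual interface.

```python
import random

# Hypothetical hierarchical NAS search space: kernel_size is only
# defined when the layer is convolutional.
SPACE = {
    "layer_type": ["conv", "fully_connected"],      # categorical
    "kernel_size": {"values": [3, 5, 7],            # ordinal, conditional
                    "active_if": ("layer_type", "conv")},
    "width": {"low": 16, "high": 512},              # integer-valued
}

def sample_configuration(rng=random):
    """Draw one configuration, skipping parameters whose parent
    condition is not satisfied (inactive branches of the hierarchy)."""
    config = {"layer_type": rng.choice(SPACE["layer_type"])}
    if config["layer_type"] == SPACE["kernel_size"]["active_if"][1]:
        config["kernel_size"] = rng.choice(SPACE["kernel_size"]["values"])
    config["width"] = rng.randint(SPACE["width"]["low"], SPACE["width"]["high"])
    return config
```

A sampled configuration for a fully-connected layer simply omits `kernel_size`, which is what distinguishes a hierarchical space from a flat mixed one.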
High-dimensional BO usually requires a considerable number of function evaluations, and the cost of computing a Gaussian process (GP) posterior grows cubically with the number of observations. Furthermore, the cost of maximum-likelihood estimation of the GP hyperparameters grows with the number of function parameters.
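The cubic scaling comes from factorizing the n-by-n kernel matrix. A minimal NumPy sketch of an exact GP posterior mean (zero prior mean, RBF kernel; not the implementation used in BAxUS or BoTorch) makes the bottleneck explicit:

```python
import numpy as np

def gp_posterior_mean(X_train, y_train, X_test, lengthscale=1.0, noise=1e-6):
    """Posterior mean of a zero-mean GP with an RBF kernel.

    The Cholesky factorization of the n x n kernel matrix costs
    O(n^3) time, which dominates as observations accumulate.
    """
    def rbf(A, B):
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-0.5 * d2 / lengthscale ** 2)

    K = rbf(X_train, X_train) + noise * np.eye(len(X_train))
    L = np.linalg.cholesky(K)                                   # O(n^3) bottleneck
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))   # O(n^2) solves
    return rbf(X_test, X_train) @ alpha
```

With hundreds of evaluations per run and repeated hyperparameter fits, this factorization is executed many times, which is exactly where GPU acceleration pays off.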
Fortunately, modern frameworks such as BoTorch (2) support training GPs on GPUs. This is why I request access to the Berzelius cluster.