Sparse and Quantized visual graph attention networks for efficient and resilient image understanding

Title:	Sparse and Quantized visual graph attention networks for efficient and resilient image understanding
DNr:	Berzelius-2025-19
Project Type:	LiU Berzelius
Principal Investigator:	Jose Nunez-yanez <jose.nunez-yanez@liu.se>
Affiliation:	Linköpings universitet
Duration:	2025-03-20 – 2025-10-01
Classification:	10206
Keywords:

Abstract

This project investigates how graph attention networks can be applied to computer vision problems such as object detection and image classification. The objective is to create efficient, lightweight models that use graph attention networks with adaptive levels of sparsity and quantization to outperform transformers and other neural network models in energy efficiency at a comparable level of accuracy. Vision transformers have been shown to outperform traditional convolutional neural networks in many computer vision applications, but their attention mechanisms operate on fully connected graphs and have very high compute and energy requirements, which limits their scalability to high-resolution computer vision. This is especially the case when the objective is to deploy them on mobile devices and sensors at the network edge, with strict limits on power and thermal dissipation. Graph attention networks are conceptually close to transformers but potentially much more efficient thanks to their sparsity. In this project, we will explore how to perform additional sparsification at the feature level, orthogonal to quantization at the tensor level, at multiple precisions down to 1 bit. We will then propose custom hardware solutions able to stream the sparse data and perform optimal arithmetic operations at the sub-byte level. Critically, we will investigate a technique that predicts the level of sparsification and quantization possible depending on the complexity of the input data. The aim is to deliver performance per joule one to two orders of magnitude better than what is possible today with leading GPU, TPU and CPU hardware. We will use real-world problems in the areas of anomaly detection in robotics, cyber-physical systems, natural language processing or autonomous driving to demonstrate this new level of efficiency.
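
As a rough illustration of the two efficiency levers named in the abstract, the sketch below counts how many attention scores must be computed on a fully connected graph versus a sparse one, and shows one simple form of 1-bit quantization. It is a minimal sketch under stated assumptions, not the project's method: the 4-connected patch grid, the mean-absolute-value scale, and all function names are hypothetical choices made for this example.

```python
def dense_attention_pairs(num_nodes):
    """Scores computed when every node attends to every node (transformer-style)."""
    return num_nodes * num_nodes


def grid_attention_pairs(height, width):
    """Scores computed when each patch attends only to itself and its
    4-connected grid neighbours (an assumed sparse graph, for illustration)."""
    pairs = 0
    for r in range(height):
        for c in range(width):
            pairs += 1  # self-loop
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                if 0 <= r + dr < height and 0 <= c + dc < width:
                    pairs += 1  # one directed edge to an in-bounds neighbour
    return pairs


def binarize(values):
    """1-bit quantization sketch: keep only the sign of each value, plus one
    shared floating-point scale (here the mean absolute value, an assumption)."""
    scale = sum(abs(v) for v in values) / len(values)
    signs = [1 if v >= 0 else -1 for v in values]
    return signs, scale


if __name__ == "__main__":
    h, w = 14, 14  # e.g. a 14 x 14 grid of image patches
    print("dense pairs:", dense_attention_pairs(h * w))
    print("grid pairs: ", grid_attention_pairs(h, w))
    print("binarized:  ", binarize([0.5, -1.5, 1.0, -1.0]))
```

On a fully connected graph the number of scores grows quadratically with the number of patches, while on the assumed grid graph it grows linearly, which is the sparsity advantage the abstract refers to. Quantization is independent of that choice and shrinks the cost of each individual operation.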

National Supercomputer Centre at Linköping University
