Language models for Swedish in collaboration with AI Sweden and the Royal Library
Title: Language models for Swedish in collaboration with AI Sweden and the Royal Library
SNIC Project: Berzelius-2022-2
Project Type: LiU Berzelius
Principal Investigator: Johanna Björklund <johanna.bjorklund@umu.se>
Affiliation: Umeå universitet
Duration: 2022-03-24 – 2022-10-01
Classification: 10208
Homepage: https://docs.google.com/document/d/13Z02NlhymiNSaCI_4VwuvHbvj5MR3FnT_bLr-FNuIv4/edit?usp=sharing
Keywords:

Abstract

This project will develop the first truly large-scale generative language model for the Swedish language. The model will be based on the GPT architecture, will be trained using the Nvidia Megatron framework, and will have up to 100 billion parameters. The project extends preliminary work by the AI Sweden team, in collaboration with Nvidia, on using the Megatron framework to build large-scale language models for Swedish. It will deliver the largest model built in Sweden to date, and the model will be unique even internationally because of its size and, by extension, its broad applicability across the entire Nordic region. The WASP WARA on media and language is the perfect development and application environment for such a model, and will rapidly accelerate the competence and capacity of Swedish AI research in general, and of Swedish NLP in particular.