Language models for Swedish in collaboration with AI Sweden and RISE
Title: Language models for Swedish in collaboration with AI Sweden and RISE
DNr: Berzelius-2022-184
Project Type: LiU Berzelius
Principal Investigator: Johanna Björklund <johanna.bjorklund@umu.se>
Affiliation: Umeå universitet
Duration: 2022-10-03 – 2023-05-01
Classification: 10208
Homepage: https://docs.google.com/document/d/108lIzYvSqWav_ZDBroDb7GlMfG15BVRVmsI1O8q15ok/edit?usp=sharing
Keywords:

Abstract

This application concerns the second phase of a project that will develop the first truly large-scale generative language models for the Swedish language. The models will be based on the GPT architecture, trained using the Nvidia Megatron framework, and will have 20B, 40B, and 175B parameters. As such, this project is an extension of the preliminary work done by the AI Sweden team in collaboration with Nvidia, and of Project Berzelius-2022-2, which served to realise the initial models in this series. The proposed project will deliver the largest model built in Sweden to date, and will be unique even internationally due to its size and, by extension, its extremely broad applicability for the entire Nordic region. The WASP WARA Media and Language is the perfect development and application environment for such a model, and will rapidly accelerate the competence and capacity of Swedish AI research in general, and Swedish NLP in particular.