Large language model for 6G mobile
Title: Large language model for 6G mobile
DNr: Berzelius-2024-128
Project Type: LiU Berzelius
Principal Investigator: Ming Xiao <mingx@kth.se>
Affiliation: Kungliga Tekniska högskolan
Duration: 2024-04-02 – 2024-11-01
Classification: 20203
Homepage: https://www.kth.se/profile/mingx
Keywords:

Abstract

In this project, we aim to link large language models (LLMs) with semantic communication. With the emergence of language models such as the Generative Pre-trained Transformer (GPT)-2/3/4, Bidirectional Encoder Representations from Transformers (BERT), and Large Language Model Meta AI (LLaMA), and of large models in the visual domain such as DALL-E and Contrastive Language-Image Pre-training (CLIP), we believe that the field of communication is undergoing a major change: a gradual shift from traditional bit-level data transmission to semantic transmission based on contextual knowledge. Semantic transmission can reduce the amount of data sent, which improves bandwidth efficiency and opens up further possibilities for accelerating data transmission. This project therefore focuses on the application of large language models in communication, and we plan experiments from two perspectives. The first directly invokes an existing LLM to test how robust and applicable its general knowledge is to the noisy information that arises in communication, focusing mainly on the quality of the semantics during transmission. In this scenario we will not retrain the large model; we will use the pre-trained model and either fine-tune it on a small scale or insert adapters for small-scale training. The second trains larger models: here we would like to adapt the network structure to build a larger network better suited to semantic communication, and to explore which structure best fits this specific scenario.