Lessons Learned from GPT-SW3: Building the First Large-Scale Generative Language Model for Swedish
Proceedings of the Thirteenth International Conference on Language Resources and Evaluation (LREC 2022)
Abstract
We present GPT-SW3, a 3.5-billion-parameter autoregressive language model trained on a newly created 100 GB Swedish corpus. This paper provides insights into data collection and training, while highlighting the challenges of proper model evaluation. The results of quantitative evaluation through perplexity indicate that GPT-SW3 is a competent model in comparison with existing autoregressive models of similar size. Additionally, we perform an extensive prompting study which reveals the good text generation capabilities of GPT-SW3.