
Lessons Learned from GPT-SW3: Building the First Large-Scale Generative Language Model for Swedish

Proceedings of the Thirteenth International Conference on Language Resources and Evaluation (LREC 2022)

DOI:10.63317/2u9szjhk3jwh

Abstract

We present GPT-SW3, a 3.5 billion parameter autoregressive language model, trained on a newly created 100 GB Swedish corpus. This paper provides insights with regard to data collection and training, while highlighting the challenges of proper model evaluation. The results of quantitative evaluation through perplexity indicate that GPT-SW3 is a competent model in comparison with existing autoregressive models of similar size. Additionally, we perform an extensive prompting study which reveals the good text generation capabilities of GPT-SW3.

Details

Paper ID
lrec2022-main-376
Pages
pp. 3509-3518
BibKey
ekgren-etal-2022-lessons
Editor
N/A
Publisher
European Language Resources Association (ELRA)
ISSN
2522-2686
ISBN
79-10-95546-38-2
Conference
Thirteenth Language Resources and Evaluation Conference
Location
Marseille, France
Date
20-25 June 2022

Authors

  • Ariel Ekgren
  • Amaru Cuba Gyllensten
  • Evangelia Gogoulou
  • Alice Heiman
  • Severine Verlinden
  • Joey Öhman
  • Fredrik Carlsson
  • Magnus Sahlgren
