TækTåk: Syntactic Analysis of Language Use on Danish TikTok
Proceedings of the Fifteenth Language Resources and Evaluation Conference (LREC 2026)
Abstract
Language use varies across language communities. Social media is a rich source for studying this variation, as it contains large amounts of data from a wide variety of sub-communities. In this paper, we study language use on Danish TikTok. TikTok is a video-based platform, but most users are mainly active in the text-based comment sections. To analyze this language variety, we contribute: 1) the first Danish social media treebank annotated for Universal Dependencies; 2) an evaluation of a variety of parsers on the new treebank, showing that cross-lingual in-domain data provides a valuable signal; and 3) a comparison of syntactic trends in standard Danish and TikTok language.
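To make the third contribution concrete, the following is a minimal sketch of how syntactic trends in two Universal Dependencies treebanks could be compared, assuming both are available in the standard 10-column CoNLL-U format. It is not the method used in this paper. The function name `deprel_distribution`, the helper `make_sentence`, and the two toy sentences are hypothetical and invented for illustration; they are not taken from the treebank.

```python
from collections import Counter


def deprel_distribution(conllu_text):
    """Return the relative frequency of each dependency relation in a CoNLL-U string."""
    counts = Counter()
    for line in conllu_text.splitlines():
        line = line.strip()
        # Skip blank lines (sentence separators) and comment lines.
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        if len(cols) != 10:
            continue
        # Skip multiword token ranges (e.g. "1-2") and empty nodes (e.g. "1.1").
        if "-" in cols[0] or "." in cols[0]:
            continue
        counts[cols[7]] += 1  # column 8 (index 7) is DEPREL
    total = sum(counts.values())
    return {rel: n / total for rel, n in counts.items()} if total else {}


def make_sentence(rows):
    """Build a CoNLL-U sentence from (id, form, upos, head, deprel) tuples (hypothetical helper)."""
    lines = []
    for tok_id, form, upos, head, deprel in rows:
        lines.append("\t".join([tok_id, form, "_", upos, "_", "_", head, deprel, "_", "_"]))
    return "\n".join(lines) + "\n\n"


# Invented toy data, for illustration only (not from any treebank).
standard = make_sentence([
    ("1", "Jeg", "PRON", "2", "nsubj"),
    ("2", "elsker", "VERB", "0", "root"),
    ("3", "den", "DET", "4", "det"),
    ("4", "sang", "NOUN", "2", "obj"),
])
tiktok = make_sentence([
    ("1", "elsker", "VERB", "0", "root"),
    ("2", "den", "DET", "3", "det"),
    ("3", "sang", "NOUN", "1", "obj"),
])

dist_standard = deprel_distribution(standard)
dist_tiktok = deprel_distribution(tiktok)

# Difference in relative frequency per relation (TikTok minus standard).
for rel in sorted(set(dist_standard) | set(dist_tiktok)):
    diff = dist_tiktok.get(rel, 0.0) - dist_standard.get(rel, 0.0)
    print(f"{rel}\t{diff:+.3f}")
```

On real data, the two strings would be replaced by the contents of the respective `.conllu` files. Relative frequencies are used rather than raw counts so that treebanks of different sizes remain comparable.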