YouDACC: the Youtube Dialectal Arabic Comment Corpus

Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC 2014)

DOI:10.63317/35tyqjp42ypn

Abstract

This paper presents YOUDACC, a large-scale, automatically annotated, multi-dialectal Arabic corpus collected from user comments on YouTube videos. The corpus covers five dialect groups: Egyptian (EG), Gulf (GU), Iraqi (IQ), Maghrebi (MG) and Levantine (LV). We perform an empirical analysis of the crawled corpus and demonstrate that our proposed location-based method is effective for the task of dialect labeling.

Details

Paper ID: lrec2014-main-456
Pages: pp. 1246-1251
BibKey: salama-etal-2014-youdacc
Editor: N/A
Publisher: European Language Resources Association (ELRA)
ISSN: 2522-2686
ISBN: 978-2-9517408-8-4
Conference: Ninth International Conference on Language Resources and Evaluation
Location: Reykjavik, Iceland
Date: 26 May 2014 to 31 May 2014

Authors

  • Ahmed Salama
  • Houda Bouamor
  • Behrang Mohit
  • Kemal Oflazer
