LREC 2018 (main conference)

Multi-lingual Argumentative Corpora in English, Turkish, Greek, Albanian, Croatian, Serbian, Macedonian, Bulgarian, Romanian and Arabic

Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)

DOI:10.63317/3pva2b8zg2eb

Abstract

Argumentative corpora are costly to create and are available in only a few languages, with English dominating the area. In this paper, we release the first publicly available corpora in all Balkan languages and Arabic. The corpora are obtained from parallel corpora in which the source language is English and the target language is either a Balkan language or Arabic. We use 8 different argument mining classifiers trained for English, apply all of them to the source language, and project the decisions made by the classifiers to the target language. We assess the performance of the classifiers on a manually annotated news corpus. Our results show that when at least 3 to 6 classifiers judge a piece of text as argumentative, an F1-score above 90% is obtained.
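
The projection idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes sentence-aligned parallel pairs, treats each English classifier as a hypothetical function returning True for argumentative text, and uses a simple vote threshold (`min_votes`) to decide which label is copied to the target-language sentence. The names `project_labels`, `f1_score`, and `min_votes` are invented for illustration.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical classifier type: maps an English sentence to
# True (argumentative) or False (non-argumentative).
Classifier = Callable[[str], bool]


def project_labels(
    parallel_pairs: Sequence[Tuple[str, str]],
    classifiers: Sequence[Classifier],
    min_votes: int,
) -> List[Tuple[str, bool]]:
    """Label each target sentence using votes cast on its English source.

    Assumption: each pair is (english_source, target_translation) and the
    two sides are aligned one-to-one.
    """
    projected = []
    for source_en, target in parallel_pairs:
        # Count how many English classifiers call the source argumentative.
        votes = sum(1 for clf in classifiers if clf(source_en))
        # Copy the aggregated decision to the target-language sentence.
        projected.append((target, votes >= min_votes))
    return projected


def f1_score(gold: Sequence[bool], predicted: Sequence[bool]) -> float:
    """F1 of the positive (argumentative) class against manual labels."""
    tp = sum(1 for g, p in zip(gold, predicted) if g and p)
    fp = sum(1 for g, p in zip(gold, predicted) if not g and p)
    fn = sum(1 for g, p in zip(gold, predicted) if g and not p)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```

Raising `min_votes` makes the projected labels stricter, which is one way to read the abstract's statement about requiring agreement among several classifiers; the exact aggregation rule used in the paper is not given on this page.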

Details

Paper ID
lrec2018-main-617
Pages
N/A
BibKey
sliwa-etal-2018-multi
Editor
N/A
Publisher
European Language Resources Association (ELRA)
ISSN
2522-2686
ISBN
979-10-95546-00-9
Conference
Eleventh International Conference on Language Resources and Evaluation
Location
Miyazaki, Japan
Date
7 May 2018 - 12 May 2018

Authors

  • Alfred Sliwa
  • Yuan Ma
  • Ruishen Liu
  • Niravkumar Borad
  • Seyedeh Ziyaei
  • Mina Ghobadi
  • Firas Sabbah
  • Ahmet Aker

Links