Universal Dependencies for Japanese

Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC 2016)

DOI:10.63317/36uzv4avymic

Abstract

In this paper, we present an attempt to port the international syntactic annotation scheme Universal Dependencies to Japanese. Because Japanese syntactic structure is usually annotated with chunk-based dependencies specific to the language, we first introduce word-based dependencies using a word unit called the Short Unit Word, which usually corresponds to an entry in the UniDic lexicon. Porting is done by mapping the UniDic part-of-speech tagset to the universal part-of-speech tagset and by converting a constituent-based treebank into typed dependency trees. The conversion is not straightforward, and we discuss the problems that arose and our current solutions. A treebank of 10,000 sentences was built by converting existing resources and has been released to the public.

Details

Paper ID
lrec2016-main-261
Pages
pp. 1651-1658
BibKey
tanaka-etal-2016-universal
Editor
N/A
Publisher
European Language Resources Association (ELRA)
ISSN
2522-2686
ISBN
978-2-9517408-9-1
Conference
Tenth International Conference on Language Resources and Evaluation
Location
Portorož, Slovenia
Date
23-28 May 2016

Authors

  • Takaaki Tanaka
  • Yusuke Miyao
  • Masayuki Asahara
  • Sumire Uematsu
  • Hiroshi Kanayama
  • Shinsuke Mori
  • Yuji Matsumoto
