LREC 2018 main

Building an Ellipsis-aware Chinese Dependency Treebank for Web Text

Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)

DOI:10.63317/4zqurzns6pqh

Abstract

Web 2.0 has brought with it a large volume of user-produced data revealing people's thoughts, experiences, and knowledge, which is a valuable source for many tasks, such as information extraction and knowledge base construction. However, the colloquial nature of the texts poses new challenges for current natural language processing techniques, which are better adapted to the formal form of the language. Ellipsis is a common linguistic phenomenon in which some words are left out because they can be understood from the context, especially in oral utterances. It hinders the improvement of dependency parsing, which is of great importance for tasks that rely on the meaning of the sentence. In order to promote research in this area, we are releasing a Chinese dependency treebank of 319 weibos, containing 572 sentences with omissions restored and contexts preserved.

Details

Paper ID
lrec2018-main-276
Pages
N/A
BibKey
ren-etal-2018-building
Editor
N/A
Publisher
European Language Resources Association (ELRA)
ISSN
2522-2686
ISBN
979-10-95546-00-9
Conference
Eleventh International Conference on Language Resources and Evaluation
Location
Miyazaki, Japan
Date
7 May 2018 to 12 May 2018

Authors

  • Xuancheng Ren
  • Xu Sun
  • Ji Wen
  • Bingzhen Wei
  • Weidong Zhan
  • Zhiyuan Zhang

Links