M-CNER: A Corpus for Chinese Named Entity Recognition in Multi-Domains

Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)

DOI:10.63317/3f4ymwyr9s39

Abstract

In this paper, we present a new corpus for Chinese Named Entity Recognition (NER) covering three domains: human-computer interaction, social media, and e-commerce. The annotation procedure is conducted in two rounds. In the first round, each sentence is annotated independently by more than one person. In the second round, experts discuss the sentences on which the annotators disagree. The resulting corpus comprises five data sets across the three domains. We further evaluate three popular models on the newly created data sets. The experimental results show that the system based on Bi-LSTM-CRF performs best among the compared systems on all the data sets. The corpus can be used for further studies by the research community.
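
The two-round procedure described in the abstract can be illustrated with a minimal sketch. It is not the authors' tooling: the data layout (sentence id mapped to one tag sequence per annotator), the BIO tag scheme, and the function name split_by_agreement are all assumptions made for illustration. The sketch only shows the hand-off between rounds: sentences whose independent annotations match are accepted, and the rest are queued for expert discussion.

```python
from typing import Dict, List, Tuple

# Hypothetical layout: sentence id -> one tag sequence per annotator.
# The BIO tags used below are an assumption, not taken from the paper.
Annotations = Dict[str, List[List[str]]]


def split_by_agreement(
    annotations: Annotations,
) -> Tuple[Dict[str, List[str]], List[str]]:
    """Split first-round annotations into agreed and disputed sentences.

    Returns (agreed, disputed): agreed maps a sentence id to the tag
    sequence all annotators share; disputed lists the sentence ids that
    would go to the second (expert) round. Assumes every sentence has at
    least one annotation.
    """
    agreed: Dict[str, List[str]] = {}
    disputed: List[str] = []
    for sent_id, tag_seqs in annotations.items():
        first = tag_seqs[0]
        # Agreement here means every annotator produced the same sequence.
        if all(seq == first for seq in tag_seqs[1:]):
            agreed[sent_id] = first
        else:
            disputed.append(sent_id)
    return agreed, disputed


example: Annotations = {
    "s1": [["B-PER", "I-PER", "O"], ["B-PER", "I-PER", "O"]],
    "s2": [["B-LOC", "O", "O"], ["O", "O", "O"]],
}
agreed, disputed = split_by_agreement(example)
```

Exact sequence match is the strictest possible agreement criterion; the abstract does not say which criterion the annotators actually used.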

Details

Paper ID
lrec2018-main-706
Pages
N/A
BibKey
lu-etal-2018-cner
Editor
N/A
Publisher
European Language Resources Association (ELRA)
ISSN
2522-2686
ISBN
979-10-95546-00-9
Conference
Eleventh International Conference on Language Resources and Evaluation
Location
Miyazaki, Japan
Date
7 May 2018 to 12 May 2018

Authors

  • Qi Lu

  • YaoSheng Yang

  • Zhenghua Li

  • Wenliang Chen

  • Min Zhang

Links