
Detecting Japanese Compound Functional Expressions using Canonical/Derivational Relation

Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC 2012)

DOI:10.63317/427fc7kpk9gj

Abstract

The Japanese language has various types of functional expressions. To organize Japanese functional expressions with their various surface forms, a hierarchically organized lexicon of Japanese functional expressions was compiled. This paper proposes a framework for identifying more than 16,000 functional expressions in Japanese texts by utilizing the hierarchical organization of the lexicon. In our framework, these functional expressions are roughly divided into canonical and derived functional expressions. Each derived functional expression is intended to be identified by referring to the most similar occurrence of its canonical expression. Contextual occurrence information for the much smaller set of canonical expressions is expanded to cover all forms of the derived expressions, and is then used when identifying those derived expressions. We also show empirically that the proposed method can correctly identify more than 80% of the functional / content usages with fewer than 38,000 training instances of manually identified canonical expressions.

Details

Paper ID
lrec2012-main-537
Pages
N/A
BibKey
suzuki-etal-2012-detecting
Editor
N/A
Publisher
European Language Resources Association (ELRA)
ISSN
2522-2686
ISBN
978-2-9517408-7-7
Conference
Eighth International Conference on Language Resources and Evaluation
Location
Istanbul, Turkey
Date
21 May 2012 to 27 May 2012

Authors

  • Takafumi Suzuki
  • Yusuke Abe
  • Itsuki Toyota
  • Takehito Utsuro
  • Suguru Matsuyoshi
  • Masatoshi Tsuchiya

Links