
Handling Normalization Issues for Part-of-Speech Tagging of Online Conversational Text

Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)

DOI:10.63317/2sgmsme8ccj4

Abstract

For the purpose of POS tagging noisy user-generated text, should normalization be handled as a preliminary task or is it possible to handle misspelled words directly in the POS tagging model? We propose in this paper a combined approach where some errors are normalized before tagging, while a Gated Recurrent Unit deep neural network based tagger handles the remaining errors. Word embeddings are trained on a large corpus in order to address both normalization and POS tagging. Experiments are run on Contact Center chat conversations, a particular type of formal Computer Mediated Communication data.

Details

Paper ID
lrec2018-main-014
Pages
N/A
BibKey
damnati-etal-2018-handling
Editor
N/A
Publisher
European Language Resources Association (ELRA)
ISSN
2522-2686
ISBN
979-10-95546-00-9
Conference
Eleventh International Conference on Language Resources and Evaluation
Location
Miyazaki, Japan
Date
7-12 May 2018

Authors

  • Géraldine Damnati
  • Jeremy Auguste
  • Alexis Nasr
  • Delphine Charlet
  • Johannes Heinecke
  • Frédéric Béchet

Links