No more beating about the bush: A Step towards Idiom Handling for Indian Language NLP

Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)

DOI:10.63317/4ywaxwj2mz6o

Abstract

One of the major challenges in Natural Language Processing (NLP) is the handling of idioms: seemingly ordinary phrases that can be conjugated or even spread across a sentence to fit the context. Since idioms are a part of natural language, the ability to handle them brings us closer to creating efficient NLP tools. This paper presents a multilingual parallel idiom dataset for seven Indian languages in addition to English and demonstrates its usefulness for two NLP applications: Machine Translation and Sentiment Analysis. We observe significant improvement on both tasks over baseline models trained without the idiom dataset.

Details

Paper ID
lrec2018-main-048
Pages
N/A
BibKey
agrawal-etal-2018-beating
Editor
N/A
Publisher
European Language Resources Association (ELRA)
ISSN
2522-2686
ISBN
979-10-95546-00-9
Conference
Eleventh International Conference on Language Resources and Evaluation
Location
Miyazaki, Japan
Date
7 May 2018 to 12 May 2018

Authors

  • Ruchit Agrawal
  • Vighnesh Chenthil Kumar
  • Vigneshwaran Muralidharan
  • Dipti Sharma

Links