Back to Main Conference 2016
LREC 2016main

Detection of Reformulations in Spoken French

Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC 2016)

DOI:10.63317/3fagfs4u3h5k

Abstract

Our work addresses automatic detection of enunciations and segments with reformulations in French spoken corpora. The proposed approach is syntagmatic. It is based on reformulation markers and specificities of spoken language. The reference data are built manually and have gone through consensus. Automatic methods, based on rules and CRF machine learning, are proposed in order to detect the enunciations and segments that contain reformulations. With the CRF models, different features are exploited within a window of various sizes. Detection of enunciations with reformulations shows up to 0.66 precision. The tests performed for the detection of reformulated segments indicate that the task remains difficult. The best average performance values reach up to 0.65 F-measure, 0.75 precision, and 0.63 recall. We have several perspectives to this work for improving the detection of reformulated segments and for studying the data from other points of view.

Details

Paper ID
lrec2016-main-596
Pages
pp. 3760-3767
BibKey
grabar-eshkol-taravela-2016-detection
Editor
N/A
Publisher
European Language Resources Association (ELRA)
ISSN
2522-2686
ISBN
978-2-9517408-9-1
Conference
Tenth International Conference on Language Resources and Evaluation
Location
Portorož, Slovenia
Date
23 May 2016 28 May 2016

Authors

  • NG

    Natalia Grabar

  • IE

    Iris Eshkol-Taravela

Links