
HECTOR: A Hybrid TExt SimplifiCation TOol for Raw Texts in French

Proceedings of the Thirteenth International Conference on Language Resources and Evaluation (LREC 2022)

DOI:10.63317/4tkbm3qskuy7

Abstract

Reducing the complexity of texts by applying an Automatic Text Simplification (ATS) system has been sparking interest in the area of Natural Language Processing (NLP) for several years, and a number of methods and evaluation campaigns have emerged targeting lexical and syntactic transformations. In recent years, several studies have exploited deep learning techniques based on very large comparable corpora. Yet the lack of large amounts of corpora (original-simplified) for French has been hindering the development of an ATS tool for this language. In this paper, we present our system, which is based on a combination of methods relying on word embeddings for lexical simplification and rule-based strategies for syntax and discourse adaptations. We present an evaluation of the lexical, syntactic and discourse-level simplifications according to automatic and human evaluations. We discuss the performance of our system at the lexical, syntactic, and discourse levels.

Details

Paper ID
lrec2022-main-493
Pages
pp. 4620-4630
BibKey
todirascu-etal-2022-hector
Editor
N/A
Publisher
European Language Resources Association (ELRA)
ISSN
2522-2686
ISBN
979-10-95546-38-2
Conference
Thirteenth Language Resources and Evaluation Conference
Location
Marseille, France
Date
20-25 June 2022

Authors

  • Amalia Todirascu
  • Rodrigo Wilkens
  • Eva Rolin
  • Thomas François
  • Delphine Bernhard
  • Núria Gala

Links