Using syntax for the semantic representation of sentences

Proceedings of the Workshop on Structured Linguistic Data and Evaluation (SLiDE)

Abstract

Deep learning methods in natural language processing often rely on statistical tokenization to segment texts before vectorization. This segmentation produces lexical subunits that offer great flexibility. However, reusing identical tokens across words with different meanings can favor representations based on surface form rather than on linguistic information, especially semantics. This mismatch between semantics and surface form can lead to undesirable effects in language processing. To limit the influence of form on the semantics of vector representations, we propose an intermediate representation based on syntactic parsing that is more compact and more faithful to word meaning.

Details

Paper ID

lrec2026-ws-slide-15

Pages

pp. 169-179

DOI

10.63317/4gtinxarm3dd

BibKey

boucharenc-etal-2026-syntax

Editors

Erhard Hinrichs (Tübingen University, Germany), Joakim Nivre (Uppsala University, Sweden), Petya Osenova (Sofia University, Bulgaria), James Pustejovsky (Brandeis University, USA), Claus Zinn (Tübingen University, Germany)

Publisher

European Language Resources Association (ELRA)

ISSN

N/A

ISBN

N/A

Workshop

Proceedings of the Workshop on Structured Linguistic Data and Evaluation (SLiDE)

Location

Palma, Mallorca, Spain

Date

11 - 16 May 2026

Authors

Iskandar Boucharenc
Eve Sauvage
Thomas Gerald
Julien Tourille
Sabrina Campano
Cyril Grouin
Sophie Rosset
