Back to Main Conference 2024
LREC-COLING 2024main

QueryNER: Segmentation of E-commerce Queries

Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)

DOI:10.63317/43qqtg25vmx2

Abstract

We present QueryNER, a manually-annotated dataset and accompanying model for e-commerce query segmentation. Prior work in sequence labeling for e-commerce has largely addressed aspect-value extraction which focuses on extracting portions of a product title or query for narrowly defined aspects. Our work instead focuses on the goal of dividing a query into meaningful chunks with broadly applicable types. We report baseline tagging results and conduct experiments comparing token and entity dropping for null and low recall query recovery. Challenging test sets are created using automatic transformations and show how simple data augmentation techniques can make the models more robust to noise. We make the QueryNER dataset publicly available.

Details

Paper ID
lrec2024-main-1178
Pages
pp. 13455-13470
BibKey
palen-michel-etal-2024-queryner
Editor
N/A
Publisher
European Language Resources Association (ELRA) and ICCL
ISSN
2522-2686
ISBN
979-10-95546-34-4
Conference
Joint International Conference on Computational Linguistics, Language Resources and Evaluation
Location
Turin, Italy
Date
20 May 2024 25 May 2024

Authors

  • CP

    Chester Palen-Michel

  • LL

    Lizzie Liang

  • ZW

    Zhe Wu

  • CL

    Constantine Lignos

Links