
An Evaluation Protocol for Text Mining Tools : ALCESTE, SAS Text Miner, SPAD-CRM and Temis Text Mining Solutions Testing

Proceedings of the Fourth International Conference on Language Resources and Evaluation (LREC 2004)

DOI:10.63317/5j9mk26ckve8

Abstract

With the opening of the electricity market, EDF needs to analyse large volumes of text data in order to know its customers better. To this end, several text mining tools designed to analyse large quantities of very diverse information were evaluated on three different corpora. A comparison table proved essential for comparing the software easily. It was built by drawing on existing expertise in data mining tools, taking care not to favour statistical results over linguistic ones. The table covers ten topics, ranging from the software vendor to the fields of application, by way of data access and lexical table analysis. Besides describing the evaluation of four commercially available tools and its results, this article retraces how the test table was created, how the tools were chosen and which criteria were retained. The experience also supports the use of a detailed protocol that allows the indispensable functions to be identified and evaluated according to the objectives and profile of the software user and the nature of the corpus to be analysed.

Details

Paper ID
lrec2004-main-126
Pages
N/A
BibKey
quatrain-etal-2004-evaluation
Editor
N/A
Publisher
European Language Resources Association (ELRA)
ISSN
2522-2686
ISBN
2-9517408-1-6
Conference
Fourth International Conference on Language Resources and Evaluation
Location
Lisbon, Portugal
Date
26 May 2004 to 28 May 2004

Authors

  • Yasmina Quatrain

  • Sylvaine Nugier

  • Anne Peradotto
