
Cleaneval: a Competition for Cleaning Web Pages

Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC 2008)

DOI:10.63317/4y7ib5ikczny

Abstract

Cleaneval is a shared task and competitive evaluation on the topic of cleaning arbitrary web pages, with the goal of preparing web data for use as a corpus for linguistic and language technology research and development. The first exercise took place in 2007. We describe how it was set up, the results, and the lessons learnt.

Details

Paper ID
lrec2008-main-369
Pages
N/A
BibKey
baroni-etal-2008-cleaneval
Editor
N/A
Publisher
European Language Resources Association (ELRA)
ISSN
2522-2686
ISBN
2-9517408-4-0
Conference
Sixth International Conference on Language Resources and Evaluation
Location
Marrakech, Morocco
Date
28 May 2008 to 30 May 2008

Authors

  • Marco Baroni

  • Francis Chantree

  • Adam Kilgarriff

  • Serge Sharoff
