9:00 -- 9:05 | Opening address: Joseph Mariani (LIMSI-CNRS, France)

Session: Experiences in HLT Evaluation (Chair: Patrick Paroubek, LIMSI-CNRS, France)
9:05 -- 9:20 | Fei Xia and Martha Palmer (Dept of Computer and Information Science, University of Pennsylvania, USA): Evaluating the Coverage of LTAGs on Annotated Corpora
9:20 -- 9:35 | Rashmi Prasad and Anoop Sarkar (IRCS, University of Pennsylvania, USA): Comparing Test-suite based evaluation and Corpus-based evaluation of a wide-coverage grammar for English
9:35 -- 9:50 | Béatrice Daille (IRIN, Université de Nantes, France): Evaluating a Multi-Word Term Indexing System: Method, Implementation and Report
9:50 -- 10:05 | Kyo Kageura (NACSIS, Japan): IR/IE/Summarization Evaluation Projects in Japan
10:05 -- 10:20 | Monika Höge (University of Helsinki, Finland): A Framework for the Quantitative and Qualitative Evaluation of Translator's Aids Systems
10:20 -- 10:35 | Rémi Zajac (Computer Research Laboratory, New Mexico State University, USA): Evaluation of the Machine Translation of Financial Documents
10:35 -- 10:50 | Claude de Loupy and Patrice Bellot (LIA, Université d'Avignon, France): Evaluation of Document Retrieval Systems
10:50 -- 11:05 | Ellen M. Voorhees and Dawn M. Tice (National Institute of Standards and Technology, USA): Implementing a Question Answering Evaluation
11:05 -- 11:20 | Niels Ole Bernsen and Laila Dybkjaer (NIS, University of Southern Denmark, Denmark): Is that a Good Spoken Language Dialogue System?
11:20 -- 11:40 | Coffee break

Session: Issues and Prospects in HLT Evaluation (Chair: Niels Ole Bernsen, NIS, University of Southern Denmark, Denmark)
11:40 -- 11:55 | Patrick Paroubek (Spoken Language Processing Group, LIMSI-CNRS, France): Categorical Data-Specification for Control Task Formalization and Validation in Quantitative Black Box Evaluation
11:55 -- 12:10 | Lynette Hirschman (MITRE, USA): Reading Comprehension and Question-Answering: New Evaluation Paradigms for Human Language Technology
12:10 -- 12:25 | Gérard Sabah (Language and Cognition Group, LIMSI-CNRS, France): To Validate or not to Validate? Some difficulties for a scientific evaluation of natural language processing systems
Panel Session: The Future of HLT Evaluation (Chairs: Lynette Hirschman, MITRE, USA & Patrick Paroubek, LIMSI-CNRS, France)
12:30 -- 13:05 | To open the Panel Session, the participants:
- Gerhard Budin (U. Vienna) - SALT project
- Stéphane Chaudiron (Ministère de l'Enseignement Supérieur et de la Recherche, France) - Amaryllis campaign
- Adam Kilgarriff (ITRI, University of Brighton, UK) - Senseval campaign
- Édouard Geoffrois (DGA, France) -
- Khalid Choukri (ELRA-ELDA, France) - ELRA project
- Lazaros Polymenakos (IBM, Greece) - Catch-2004 project
- Rémi Zajac (New Mexico State University, USA) - Transaccount project
will each present, in at most 5 minutes, their views on how to use evaluation in Human Language Technology projects: What will be evaluated? What are the evaluation techniques? Which resources are needed? What is the benefit of using evaluation within projects? How does technology evaluation relate to usage evaluation? How should evaluation be included in present and future HLT programs? Is it possible to share resources, tools or expertise on evaluation across projects? How can international cooperation be conducted in this framework?
13:05 -- 13:25 | Continuation of the Panel Session with participation of the audience.
13:25 -- 13:30 | Closing statement: Joseph Mariani (LIMSI-CNRS, France)