LREC 2000: detailed program

Language Resources and Evaluation Conference, 31 May - 2 June 2000, Athens, Greece

Wednesday May 31, 2000

8:00-10:00 Registration

10:00-11:40 Opening ceremony

11:40-12:00 Coffee Break

12:00-13:20 3 Sessions in parallel

Panel 1: Resources for the Millennium
organizer: Catherine Macleod

Session WO1: Corpus Tagging
Chair: M.T. Pazienza
Fei Xia, Martha Palmer, Nianwen Xue, Mary Ellen Okurowski, John Kovarik, Fu-Dong Chiou, Shizhe Huang, Tony Kroch, Mitch Marcus, Developing Guidelines and Ensuring Consistency for Chinese Text Annotation.
Yuji Matsumoto, Tatsuo Yamashita, Using Machine Learning Methods to Improve Quality of Tagged Corpora and Learning Models.
Jakub Zavrel, Walter Daelemans, Bootstrapping a Tagged Corpus through Combination of Existing Heterogeneous Taggers.
Lars Borin, Something Borrowed, Something Blue: Rule-based Combination of POS Taggers.
Session EO1: Evaluation of Machine Translation
Chair: M. King
John White, Jennifer Doyon, Susan Talbott, Determining the Tolerance of Text-handling Tasks for MT Output.
Niamh Bohan, Elisabeth Breidt, Martin Volk, Evaluating Translation Quality as Input to Product Development.
Sonja Niessen, Franz Josef Och, Gregor Leusch, Hermann Ney, An Evaluation Tool for Machine Translation: Fast Evaluation for MT Research.
Dimitri Theologitis, Terminology, Translation Memory and Text at the European Commission Translation Service.

13:20-14:40 Lunch

14:40-16:40 4 Sessions in parallel

Session SO1: Data Centers / Major Projects
Chair: L.S. Lee
Christopher Cieri, Mark Liberman, Issues in Corpus Creation and Distribution: The Evolution of the Linguistic Data Consortium.
Jim Talley, The Establishment of Motorola's Human Language Data Resource Center: Addressing the Criticality of Language Resources in the Industrial Setting.
Elisabeth D'Halleweyn, Erwin Dewallef, Jeannine Beeken, A Platform for Dutch in Human Language Technologies.
Khalid Choukri, Audrey Mance, Valérie Mapelli, Recent Developments within the European Language Resources Association (ELRA).
Nick Campbell, COCOSDA - a Progress Report.
Jeffrey Allen, Khalid Choukri, Survey of Language Engineering Needs: a Language Resources Perspective.
Session WO2: Treebanks
Chair: A.M. Municio
Anne Abeillé, Lionel Clément, Alexandra Kinyon, Building a Treebank for French.
Eva Hajicová, Petr Sgall, Semantico-syntactic Tagging of Very Large Corpora: the Case of Restoration of Nodes on the Underlying Level.
Cristina Bosco, Vincenzo Lombardo, Daniela Vassallo, Leonardo Lesmo, Building a Treebank for Italian: a Data-driven Annotation Schema.
Antonio Moreno, Ralph Grishman, Susana López, Fernando Sánchez, Satoshi Sekine, A Treebank of Spanish and its Application to Parsing.
Rodolfo Delmonte, Shallow Parsing and Functional Structure in Italian Corpora.
Andreas Mengel, Wolfgang Lezius, An XML-based Representation Format for Syntactically Annotated Corpora.
Session WO3: Corpus Categorization
Chair: J. McNaught
George Mikros, George Carayannis, Modern Greek Corpus Taxonomy.
George Tambouratzis, Stella Markantonatou, Nikolaos Hairetakis, George Carayannis, Automatic Style Categorization of Corpora in the Greek Language.
Helka Folch, Serge Heiden, Benoît Habert, Serge Fleury, Gabriel Illouz, Pierre Lafon, Julien Nioche, Sophie Prévost, TyPTex: Inductive Typological Text Classification by Multivariate Statistical Analysis for NLP Systems Tuning/Evaluation.
Session WO4: Reusability Issues
Chair: A. Ogonowski
Patrick Paroubek, Language Resources as by-Product of Evaluation: The MULTITAG Example.
Lynne Cahill, Christy Doran, Roger Evans, Rodger Kibble, Chris Mellish, D. Paiva, Mike Reape, Donia Scott, Neil Tipper, Enabling Resource Sharing in Language Generation: an Abstract Reference Architecture.
Björn Gambäck, Fredrik Olsson, Experiences of Language Engineering Algorithm Reuse.
Panel 2: Human Language Technology Resources for Central European Languages: European Integration Issues
organizer: Zygmunt Vetulani

16:40-17:00 Coffee Break

17:00-18:20 5 Sessions in parallel

Session SP1: Phonetic Issues and Speech Synthesis
Chair: S. Itahashi
Guy Pérennou, Martine De Calmès, MHATLex: Lexical Resources for Modeling the French Pronunciation.
Damjan Vlaj, Janez Kaiser, Ralph Wilhelm, Ute Ziegenhain, PLEDIT - A New Efficient Tool for Management of Multilingual Pronunciation Lexica and Batchlists.
Einar Meister, Arvo Eek, Toomas Altosaar, Martti Vainio, Object-oriented Access to the Estonian Phonetic Database.
Philippe Boula de Mareüil, Christophe d'Alessandro, François Yvon, Véronique Aubergé, Jacqueline Vaissière, Angélique Amelot, A French Phonetic Lexicon with Variants for Speech and Language Processing.
Matej Rojc, Zdravko Kacic, A Computational Platform for Development of Morphologic and Phonetic Lexica.
Dafydd Gibbon, Ana Paula Quirino Simões, Martin Matthiesen, An Optimized FS Pronunciation Resource Generator for Highly Inflecting Languages.
Jong-mi Kim, Design Methodology for Bilingual Pronunciation Dictionary.
France Mihelic, Jerneja Gros, Elmar Nöth, Volker Warnke, Labeling of Prosodic Events in Slovenian Speech Database GOPOLIS.
Nicole Beringer, Marcia Neff, Regional Pronunciation Variants for Automatic Segmentation.
Josué Ndamba, Jean Silence Bayamboussa, Le Programme Compalex (COMPAraison LEXicale).
Stavroula-Evita Fotinea, Athanassios Protopapas, Dimitris Dimitriadis, George Carayannis, Perceptual Evaluation of Text-to-Speech Implementation of Enclitic Stress in Greek.
N. Chenfour, A. Benabbou, A. Mouradi, Etude et Evaluation de la Di-Syllabe comme Unité Acoustique pour le Système de Synthèse Arabe PARADIS.
Matej Rojc, Zdravko Kacic, Design of Optimal Slovenian Speech Corpus for Use in the Concatenative Speech Synthesis System.
Session SP2: Spoken Language Resources Issues from Construction to Validation
Chair: J. B. Millar
Rhys James Jones, John S. Mason, Louise Helliker, Mark Pawlewski, Recruitment Techniques for Minority Language Speech Databases: Some Observations.
Andreas Witt, Harald Lüngen, Dafydd Gibbon, Enhancing Speech Corpus Resources with Multiple Lexical Tag Layers.
Daniela Oppermann, Susanne Burger, Karl Weilhammer, What are Transcription Errors and Why are They Made?
Stephanie Strassel, David Graff, Nii Martey, Christopher Cieri, Quality Control in Large Annotation Projects Involving Multiple Judges: The Case of the TDT Corpora.
D. Vaufreydaz, C. Bergamini, J.F. Serignat, L. Besacier, M. Akbar, A New Methodology for Speech Corpora Definition from Internet Documents.
David Graff, Steven Bird, Many Uses, Many Annotations for Large Speech Corpora: Switchboard and TDT as Case Studies.
Henk van den Heuvel, Lou Boves, Khalid Choukri, Simo Goddijn, Eric Sanders, SLR Validation: Present State of Affairs and Prospects.
Barbara Di Eugenio, On the Usage of Kappa to Evaluate Agreement on Coding Tasks.
Session WP1: Lexicon
Chair: T. Kruyt
Martin Gellerstam, Yvonne Cederholm, Torgny Rasmark, The Bank of Swedish.
Hu Junfeng, Yu Shiwen, The Multi-layer Language Knowledge Base of Chinese NLP.
Joan Soler i Bou, Producing LRs in Parallel with Lexicographic Description: the DCC project.
Luzia Wittmann, Ricardo Daniel Ribeiro, Tânia Pêgo, Fernando Batista, Some Language Resources and Tools for Computational Processing of Portuguese at INESC.
Jon Mills, Screffva: A Lexicographer's Workbench.
Tomaz Erjavec, Roger Evans, Nancy Ide, Adam Kilgarriff, The Concede Model for Lexical Databases.
Hideki Kashioka, Satosi Shirai, Automatically Expansion of Thesaurus Entries with a Different Thesaurus.
Zygmunt Vetulani, Electronic Language Resources for Polish: POLEX, CEGLEX and GRAMLEX.
Sharon Inkelas, Aylin Küntay, C. Orhan Orgun, Ronald Sprouse, Turkish Electronic Living Lexicon (TELL): A Lexical Database.
Ülle Viks, Tools for the Generation of Morphological Entries in Dictionaries.
Young-Soog Chae, Key-Sun Choi, Design and Construction of Knowledge base for Verb using MRD and Tagged Corpus.
Session WP2: Corpus Annotation
Chair: D. Tufis
I. Aduriz, E. Agirre, I. Aldezabal, X. Arregi, J. M. Arriola, X. Artola, K. Gojenola, A. Maritxalar, K. Sarasola, M. Urkia, A Word-level Morphosyntactic Analyzer for Basque.
Thorsten Brants, Oliver Plaehn, Interactive Corpus Annotation.
Kiyoaki Shirai, Hozumi Tanaka, Takenobu Tokunaga, Semi-automatic Construction of a Tree-annotated Corpus Using an Iterative Learning Statistical Language Model.
Sotiris Boutsis, Prokopis Prokopidis, Voula Giouli, Stelios Piperidis, A Robust Parser for Unrestricted Greek Text.
Leonardo Lesmo, Vincenzo Lombardo, Automatic Assignment of Grammatical Relations.
Patrice Bonhomme, Patrice Lopez, Resources for Lexicalized Tree Adjoining Grammars and XML Encoding: TagML.
Constantin Orasan, CLinkA: A Coreferential Links Annotator.
Eva Hajicová, Jarmila Panenová, Petr Sgall, Coreference in Annotating a Large Corpus.
Catalina Barbu, FAST - Towards a Semi-automatic Annotation of Corpora.
Nadjet Bouayad-Agha, Layout Annotation in a Corpus of Patient Information Leaflets.
Session WP3: Multilingual Corpora
Chair: C. Charalambakis
Peter Bennison, Lynne Bowker, Designing a Tool for Exploiting Bilingual Comparable Corpora.
Zheng Jie, Mao Yuhang, A Word Sense Disambiguation Method Using Bilingual Corpus.
Marko Tadic, Building the Croatian-English Parallel Corpus.
Johann Gamper, A Parallel Corpus of Italian/German Legal Texts.
Tamás Váradi, Lexical and Translation Equivalence in Parallel Corpora.
Lluís de Yzaguirre, Marta Ribas, Jordi Vivaldi, M. Teresa Cabré, Some Technical Aspects about Aligning Near Languages.
Noah A. Smith, Michael E. Jahr, Cairo: An Alignment Visualization Tool.

18:20-18:25 5 min Break

18:25-19:45 4 Sessions in parallel

Session SO2: Dialogue Evaluation Methods
Chair: S. Furui
Carine-Alexia Lavelle, Martine De Calmès, Guy Pérennou, Dialogue and Prompting Strategies Evaluation in the DEMON System.
H. Bonneau-Maynard, L. Devillers, S. Rosset, Predictive Performance of Dialog Systems.
Niels Ole Bernsen, Laila Dybkjær, A Methodology for Evaluating Spoken Language Dialogue Systems and Their Components.
Marilyn Walker, Candace Kamm, Julie Boland, Developing and Testing General Models of Spoken Dialogue System Performance.
Session WO5: Corpus Tools
Chair: A. Cappelli
David Day, Alan Goldschen, John Henderson, A Framework for Cross-Document Annotation.
Diana Santos, Eckhard Bick, Providing Internet Access to Portuguese Corpora: the AC/DC Project.
Massimo Poesio, Annotating a Corpus to Develop and Evaluate Discourse Entity Realization Algorithms: Issues and Preliminary Results.
Claude de Loupy, Marc El-Bèze, Using Few Clues Can Compensate the Small Amount of Resources Available for Word Sense Disambiguation.
Session WO6: Acquisition of Lexical Information
Chair: I. Prodanof
Daniel Zeman, Anoop Sarkar, Learning Verb Subcategorization from Corpora: Counting Frame Subsets.
Roberto Basili, Maria Teresa Pazienza, Michele Vindigni, Fabio Massimo Zanzotto, Tuning Lexicons to New Operational Scenarios.
Uwe Quasthoff, Christian Wolff, A Flexible Infrastructure for Large Monolingual Corpora.
Penny Labropoulou, Elena Mantzari, Harris Papageorgiou, Maria Gavrilidou, Automatic Generation of Dictionary Definitions from a Computational Lexicon.
Panel 3: Multilingual Content Encoding and Translation
organizer: Antonio Sanfilippo

20:30 Welcome Reception - Peristylion

Thursday June 1, 2000

9:00-9:40 2 Sessions in parallel

Invited Speakers:
1) Salim Roukos (Manager, Conversational Systems, IBM T.J. Watson Research Center, USA):

Next Generation Natural Language Applications.

Chair: J. Mariani

2) Alan K. Melby (Brigham Young University at Provo, USA) & Klaus-Dirk Schmitz (Fachhochschule Koeln, Germany):

Terminology Standards-Help for the Terminology Community.

Chair: B. Maegaard

9:40-9:45 5 min Break

9:45-10:45 4 Sessions in parallel

Session SO3: Speech Synthesis
Chair: V. Steinbiss
Amalia Arvaniti, Mary Baltazani, GREEK ToBI: A System for the Annotation of Greek Speech Corpora.
Thierry Dutoit, Michel Bagein, Fabrice Malfrère, Vincent Pagel, Alain Ruelle, Nawfal Tounsi, Dominique Wynsberghe, EULER: an Open, Generic, Multilingual and Multi-platform Text-to-Speech System.
Byeongchang Kim, Jin-seok Lee, Jeongwon Cha, Geunbae Lee, POSCAT: A Morpheme-based Speech Corpus Annotation Tool.
Session WO7: Syntactic Parsing
Chair: G. Papakonstantinou
Toni Badia, Àngels Egea, A Strategy for the Syntactic Parsing of Corpora: from Constraint Grammar Output to Unification-based Processing.
Takehito Utsuro, Learning Preference of Dependency between Japanese Subordinate Clauses and its Evaluation in Parsing.
Ann Copestake, Dan Flickinger, An Open Source Grammar Development Environment and Broad-coverage English Grammar Using HPSG.
Session WO8: Acquisition of Semantic Information
Chair: N. Ide
Paolo Allegrini, Simonetta Montemagni, Vito Pirrelli, Controlled Bootstrapping of Lexico-semantic Classes as a Bridge between Paradigmatic and Syntagmatic Knowledge: Methodology and Evaluation.
Aristomenis Thanopoulos, Nikos Fakotakis, George Kokkinakis, Automatic Extraction of Semantic Similarity of Words from Raw Technical Texts.
Kimura Kazuhiro, Hirakawa Hideki, Abstraction of the EDR Concept Classification and its Effectiveness in Word Sense Disambiguation.
Session EO2: Evaluation of Tools
Chair: L. Hirschman
Alessandro Lenci, Simonetta Montemagni, Vito Pirrelli, Claudia Soria, Where Opposites Meet. A Syntactic Meta-scheme for Corpus Annotation and Parsing Evaluation.
Mochizuki Hajime, Okumura Manabu, A Comparison of Summarization Methods Based on Task based Evaluation.
Philippe Langlais, Sébastien Sauvé, George Foster, Elliott Macklovitch, Guy Lapalme, Evaluation of TRANSTYPE, a Computer-aided Translation Typing System: A Comparison of a Theoretical- and a User-oriented Evaluation Procedures.

10:45-11:00 Coffee Break

11:00-12:00 4 Sessions in parallel

Session SO4: Speech Synthesis Evaluation
Chair: I. Dologlou
Gérard Bailly, Eduardo R. Banga, Alex Monaghan, Erhard Rank, The Cost258 Signal Generation Test Array.
Shuichi Itahashi, Guidelines for Japanese Speech Synthesizer Evaluation.
Albert Rilliard, Véronique Aubergé, Perception and Analysis of a Reiterant Speech Paradigm: a Functional Diagnostic of Synthetic Prosody.
Session WO9: Applications in the Written Area
Chair: J. Odijk
Andrew Bredenkamp, Berthold Crysmann, Mirela Petrea, Looking for Errors: A Declarative Formalism for Resource-adaptive Language Checking.
Guillermo Rojo, Maria Concepción Álvarez, Pilar Alvariño, Adelaida Gil, María Paula Santalla, Susana Sotelo, An Architecture for Document Routing in Spanish: Two Language Components, Pre-processor and Parser.
Hiroyuki Shinnou, Masanori Ikeya, Extraction of Unknown Words Using the Probability of Accepting the Kanji Character Sequence as One Word.
Session WO10: Semantic Annotation of Corpora
Chair: C. Fellbaum
Ornella Corazzari, Nicoletta Calzolari, Antonio Zampolli, An Experiment of Lexical-Semantic Tagging of an Italian Corpus.
Martha Palmer, Hoa Trang Dang, Joseph Rosenzweig, Semantic Tagging for the Penn Treebank.
Philippe Alcouffe, Nicolas Gacon, Claude Roux, Frédérique Segond, A Step toward Semantic Indexing of an Encyclopedic Corpus.
Session: National and International Programs
Chair: G. Carayannis

12:00-13:20 5 Sessions in parallel

Session SP3: Spoken Language Resources' Projects
Chair: D. Gibbon
Asunción Moreno, Robrecht Comeyne, Keith Haslam, Henk van den Heuvel, Harald Höge, Sabine Horbach, Giorgio Micca, SALA: SpeechDat across Latin America. Results of the First Phase.
Rainer Siemund, Harald Höge, Siegfried Kunzmann, Krzysztof Marasek, SPEECON - Speech Data for Consumer Devices.
Nelleke Oostdijk, The Spoken Dutch Corpus. Overview and First Evaluation.
Asunción Moreno, Børge Lindberg, Christoph Draxler, Gaël Richard, Khalid Choukri, Stephan Euler, Jeffrey Allen, SPEECHDAT-CAR. A Large Speech Database for Automotive Environments.
Tami Rannon, Ofra Golani, Anat Goren, Sherrie Shammass, Ami Moyal, Creation of Spoken Hebrew Databases.
José Bettencourt Gonçalves, Rita Veloso, Spoken Portuguese: Geographic and Social Varieties.
Wim Goedertier, Simo Goddijn, Jean-Pierre Martens, Orthographic Transcription of the Spoken Dutch Corpus.
Giulia Bernardis, Hervé Bourlard, Martin Rajman, Jean-Cédric Chappelier, Development of Acoustic and Linguistic Resources for Research and Evaluation in Interactive Vocal Information Servers.
Marcello Federico, Dimitri Giordani, Paolo Coletti, Development and Evaluation of an Italian Broadcast News Corpus.
Christopher Cieri, David Graff, Mark Liberman, Nii Martey, Stephanie Strassel, Large, Multilingual, Broadcast News Corpora for Cooperative Research in Topic Detection and Tracking: The TDT-2 and TDT-3 Corpus Efforts.
Lin-Shan Lee, Lee-Feng Chien, Live Lexicons and Dynamic Corpora Adapted to the Network Resources for Chinese Spoken Language Processing Applications in an Internet Era.
Klaus Ries, Lori Levin, Liza Valle, Alon Lavie, Alex Waibel, Shallow Discourse Genre Annotation in CallHome Spanish.
Zdravko Kacic, Bogomir Horvat, Aleksandra Zögling, Issues in Design and Collection of Large Telephone Speech Corpus for Slovenian Language.
Kikuo Maekawa, Hanae Koiso, Sadaoki Furui, Hitoshi Isahara, Spontaneous Speech Corpus of Japanese.
Jerneja Gros, France Mihelic, Simon Dobrišek, Tomaz Erjavec, Mario Zganec, Corpora of Slovene Spoken Language for Multi-lingual Applications.
Wolfgang Menzel, Eric Atwell, Patrizia Bonaventura, Daniel Herron, Peter Howarth, Rachel Morton, Clive Souter, The ISLE Corpus of Non-Native Spoken English.
Satoshi Nakamura, Kazuo Hiyane, Futoshi Asano, Takanobu Nishiura, Takeshi Yamada, Acoustical Sound Database in Real Environments for Sound Scene Understanding and Hands-Free Speech Recognition.
Karl Weilhammer, Daniela Oppermann, Susanne Burger, The Influence of Scenario Constraints on the Spontaneity of Speech. A Comparison of Dialogue Corpora.
J.C. Roux, E.C. Botha, J.A. du Preez, Developing a Multilingual Telephone Based Information System in African Languages.
Session WP4: Lexicon: Semantic and Multilingual Issues
Chair: E. Hajicova
Peggy Cadel, Hélène Ledouble, Extraction of Concepts and Multilingual Information Schemes from French and English Economics Documents.
Jana Klímová, Karel Pala, Application of WordNet ILR in Czech Word formation.
Luisa Bentivogli, Emanuele Pianta, Fabio Pianesi, Coping with Lexical Gaps when Building Aligned Multilingual Wordnets.
Claudia Kunze, Extension and Use of GermaNet, a Lexical-Semantic Database.
Brigitte Krenn, CDB - A Database of Lexical Collocations.
Anna Braasch, Sussi Olsen, Towards a Strategy for a Representation of Collocations - Extending the Danish PAROLE-lexicon.
Paula Guerreiro, Improving Lexical Databases with Collocational Information: Data from Portuguese.
Thierry Fontenelle, A Bilingual Electronic Dictionary for Frame Semantics.
Dominique Dutoit, A Text->Meaning->Text Dictionary and Process.
Vera Fluhr-Semenova, Christian Fluhr, Stéphanie Brisson, Production of NLP-oriented Bilingual Language Resources from Human-oriented dictionaries.
Session WP5: Corpus Tagging
Chair: M. Gavrilidou
Sun Maosong, Sun Honglin, Huang Changning, Zhang Pu, Xing Hongbing, Zhou Qiang, Hua Yu: A Word-segmented and Part-Of-Speech Tagged Chinese Corpus.
Gaëlle Birocheau, Morphological Tagging to Resolve Morphological Ambiguities.
Kristine Levane, Andrejs Spektors, Morphemic Analysis and Morphological Tagging of Latvian Corpus.
Sašo Dzeroski, Tomaz Erjavec, Jakub Zavrel, Morphosyntactic Tagging of Slovene: Evaluating Taggers and Tagsets.
Dan Tufis, Using a Large Set of EAGLES-compliant Morpho-syntactic Descriptors as a Tagset for Probabilistic Tagging.
Barbora Hladká, The Context (not only) for Humans.
Montserrat Marimon Felipe, Jordi Porta Zamorano, PoS Disambiguation and Partial Parsing Bidirectional Interaction.
Kiril Ribarov, Rule-based Tagging: Morphological Tagset versus Tagset of Analytical Functions.
Session WP6: Tools in the Written Area
Chair: S. Bakamidis
Thierry Declerck, Alexander Werner Jachmann, Hans Uszkoreit, The New Edition of the Natural Language Software Registry (an Initiative of ACL hosted at DFKI).
Elisa Gavieiro-Villatte, Laurent Spaggiari, Open Ended Computerized Overview of Controlled Languages.
Byung-Ju Kang, Key-Sun Choi, Automatic Transliteration and Back-transliteration by Decision Tree Learning.
Jan-Torsten Milde, Markus Reinsch, The Universal XML Organizer: UXO.
Claire Grover, Colin Matheson, Andrei Mikheev, Marc Moens, LT TTT - A Flexible Tokenisation Tool.
Alessandro Cucchiarelli, Enrico Faggioli, Paola Velardi, Will Very Large Corpora Play For Semantic Disambiguation The Role That Massive Computing Power Is Playing For Other AI-Hard Problems?
Jo Calder, Interarbora and Thistle - Delivering Linguistic Structure by the Internet.
X. Artola, A. Díaz de Ilarraza, N. Ezeiza, K. Gojenola, A. Maritxalar, A. Soroa, A Proposal for the Integration of NLP Tools using SGML-Tagged Documents.
Irina Prodanof, Amedeo Cappelli, Lorenzo Moretti, Reusability as Easy Adaptability: A Substantial Advance in NL Technology.
Session TP1: Terminology
Chair: A. Le Meur
Sandro Pedrazzini, Elisabeth Maier, Dierk König, Terms Specification and Extraction within a Linguistic-based Intranet Service.
Yasmina Abbas, Marie-Luce Picard, With WORLDTREK Family, Create, Update and Browse your Terminological World.
Gerardo Sierra, John McNaught, Extraction of Semantic Clusters for Terminological Information Retrieval from MRDs.
Antonio Moreno, Chantal Pérez, Reusing the Mikrokosmos Ontology for Concept-based Multilingual Terminology Databases.
Byron Georgantopoulos, Stelios Piperidis, Term-based Identification of Sentences for Text Summarisation.
Marianna Katsoyannou, Eleni Efthimiou, Terminology Encoding in View of Multifunctional NLP Resources.
John Kontos, Ioanna Malagardi, Spyros Fountoukis, ARISTA Generative Lexicon for Compound Greek Medical Terms.

13:20-14:40 Lunch

14:40-16:40 4 Sessions in parallel

Session SO5: Evaluation of Dialogue
Chair: N. O. Bernsen
Jean-Yves Antoine, Jacques Siroux, Jean Caelen, Jeanne Villaneau, Jérôme Goulian, Mohamed Ahafhaf, Obtaining Predictive Results with an Objective Evaluation of Spoken Dialogue Systems: Experiments with the DCR Assessment Paradigm.
Lori Levin, Boris Bartlog, Ariadna Font Llitjos, Donna Gates, Alon Lavie, Dorcas Wallace, Taro Watanabe, Monika Woszczyna, Lessons Learned from a Task-based Evaluation of Speech-to-Speech Machine Translation.
Joseph Polifroni, Stephanie Seneff, Galaxy-II as an Architecture for Spoken Dialogue Evaluation.
Thomas Brey, Gerhard Hanrieder, Paul Heisterkamp, Ludwig Hitzenberger, Peter Regel-Brietzmann, Issues in the Evaluation of Spoken Dialogue Systems - Experience from the ACCeSS Project.
Marilyn Walker, Lynette Hirschman, John Aberdeen, Evaluation for Darpa Communicator Spoken Dialogue Systems.
R. López-Cózar, A.J. Rubio, J.E. Díaz Verdejo, A. De la Torre, Evaluation of a Dialogue System Based on a Generic Model that Combines Robust Speech Understanding and Mixed-initiative Control.
Session WO11: Mono-Multilingual Lexicon Acquisition and Building
Chair: A. Braasch
Sun Le, Jin Youbing, Du Lin, Sun Yufang, Automatic Extraction of English-Chinese Term Lexicons from Noisy Bilingual Corpora.
Bonnie J. Dorr, Gina-Anne Levow, Dekang Lin, Scott Thomas, Chinese-English Semantic Resource Construction.
Svetlana Sheremetyeva, Sergei Nirenburg, Towards A Universal Tool For NLP Resource Acquisition.
Sanda M. Harabagiu, Steven J. Maiorano, Acquisition of Linguistic Patterns for Knowledge-based Information Extraction.
George Demetriou, Eric Atwell, Clive Souter, Using Lexical Semantic Knowledge from Machine Readable Dictionaries for Domain Independent Language Modeling.
Adriana Roventini, Antonietta Alonge, Nicoletta Calzolari, Bernardo Magnini, Francesca Bertagna, ItalWordNet: a Large Semantic Database for Italian.
Session WO12: Language Resources: Infrastructural Issues
Chair: C. Wayne
Constantin Orasan, Ramesh Krishnamurthy, An Open Architecture for the Construction and Administration of Corpora.
Tony McEnery, Paul Baker, Lou Burnard, Corpus Resources and Minority Language Engineering.
Steven Bird, Peter Buneman, Wang-Chiew Tan, Towards a Query Language for Annotation Graphs.
Hamish Cunningham, Kalina Bontcheva, Valentin Tablan, Yorick Wilks, Software Infrastructure for Language Resources: a Taxonomy of Previous Work and a Requirements Analysis.
Nancy Ide, Patrice Bonhomme, Laurent Romary, XCES: An XML-based Encoding Standard for Linguistic Corpora.
Catherine Macleod, Nancy Ide, Ralph Grishman, The American National Corpus: A Standardized Resource for American English.
Session TO1: Terminology
Chair: C. Fluhr
Gerhard Budin, Alan K. Melby, Accessibility of Multilingual Terminological Resources - Current Problems and Prospects for the Future.
Key-Sun Choi, Young-Soog Chae, Terminology in Korea: KORTERM.
Christophe Jouis, ARC A3, ARC A3: A Method for Evaluating Term Extracting Tools and/or Semantic Relations between Terms from Corpora.
Rosa Estopà, Jordi Vivaldi, M. Teresa Cabré, Use of Greek and Latin Forms for Term Detection.
George Demetriou, Robert Gaizauskas, Automatically Augmenting Terminological Lexicons from Untagged Text.
Diana Maynard, Sophia Ananiadou, Creating and Using Domain-specific Ontologies for Terminological Applications.

16:40-17:00 Coffee Break

17:00-19:20 A plenary Session

Panel 4: International Co-operation in the field of Language Resources and Evaluation
organizers: Antonio Zampolli and Lynette Hirschman

Friday June 2, 2000

9:00-9:40 2 Sessions in parallel

Invited Speakers:
1) Alex Waibel (Interactive Systems Laboratories, Carnegie Mellon University, USA & Computer Science Department, University of Karlsruhe, Germany):

Meeting Recognition and Tracking.

Chair: H. Hoege

2) Stephen D. Richardson (Microsoft, USA):

The Evolution of an NLP System.

Chair: N. Calzolari

9:40-9:45 5 min Break

9:45-11:05 4 Sessions in parallel

Panel 5: Speech Database Processing Tools - the state of the art in automatic labeling of speech
organizer: Nick Campbell

Session WO13: Multilingual Resources and Applications
Chair: M. Palmer
Jorge Kinoshita, Grammarless Bracketing in an Aligned Bilingual Corpus.
Masumi Narita, Constructing a Tagged E-J Parallel Corpus for Assisting Japanese Software Engineers in Writing English Abstracts.
Marta Villegas, Nuria Bel, Alessandro Lenci, Nicoletta Calzolari, Nilda Ruimy, Antonio Zampolli, Teresa Sadurní, Joan Soler, Multilingual Linguistic Resources: From Monolingual Lexicons to Bilingual Interrelated Lexicons.
Elliott Macklovitch, Michel Simard, Philippe Langlais, TransSearch: A Free Translation Memory on the World Wide Web.
Session WO14: Named Entity Recognition
Chair: R. Gaizauskas
Sean Boisen, Michael R. Crystal, Richard Schwartz, Rebecca Stone, Ralph Weischedel, Annotating Resources for Information Extraction.
Sabine Buchholz, Antal van den Bosch, Integrating Seed Names and ngrams for a Named Entity List and Classifier.
Iason Demiros, Sotiris Boutsis, Voula Giouli, Maria Liakata, Harris Papageorgiou, Stelios Piperidis, Named Entity Recognition in Greek Texts.
Takehito Utsuro, Manabu Sassano, Minimally Supervised Japanese Named Entity Recognition: Resources and Evaluation.
Session EO3: Evaluation and Semantics
Chair: R. Mitkov
Adam Kilgarriff, Joseph Rosenzweig, English Senseval: Report and Results.
Joyce Yue Chai, Evaluation of a Generic Lexical Semantic Resource in Information Extraction.
Gabriel Illouz, Sublanguage Dependent Evaluation: Toward Predicting NLP performances.
Lars Ahrenberg, Magnus Merkel, Anna Sågvall Hein, Jörg Tiedemann, Evaluation of Word Alignment Systems.

11:05-11:20 Coffee Break

11:20-12:20 3 Sessions in parallel

Session WO15: Language Resources Projects
Chair: V. Pirrelli
Ángel Martín Municio, Guillermo Rojo, Fernando Sánchez León, Octavio Pinillos, Language Resources Development at the Spanish Royal Academy.
Knut Hofland, A Self-Expanding Corpus Based on Newspapers on the Web.
Stéphane Chaudiron, Khalid Choukri, Audrey Mance, Valérie Mapelli, For a Repository of NLP Tools.
Session WO16: Corpus Annotation and Information Extraction
Chair: S. Piperidis
Rodger Kibble, Kees van Deemter, Coreference Annotation: Whither?
Andrea Setzer, Robert Gaizauskas, Annotating Events and Temporal Information in Newswire Texts.
W.J. Black, J. McNaught, G.P. Zarri, A. Persidis, A. Brasher, L. Gilardoni, E. Bertino, G. Semeraro, P. Leo, A Semi-automatic System for Conceptual Annotation, its Application to Resource Construction and Evaluation.
Session EO4: Grammars and Systems Evaluation
Chair: J. White
Bilel Gargouri, Mohamed Jmaiel, Abdelmajid Ben Hamadou, Using a Formal Approach to Evaluate Grammars.
Ruslan Mitkov, Towards More Comprehensive Evaluation in Anaphora Resolution.
François Trouilleux, Eric Gaussier, Gabriel G. Bès, Annie Zaenen, Coreference Resolution Evaluation Based on Descriptive Specificity.

12:20-13:40 6 Sessions in parallel

Session SP4: Tools for Evaluation and Processing of Spoken Language Resources
Chair: A. Rubio
Edouard Geoffrois, Claude Barras, Steven Bird, Zhibiao Wu, Transcribing with Annotation Graphs.
José A.R. Fonollosa, Asunción Moreno, SpeechDat-Car Fixed Platform.
Rosen Ivanov, Automatic Speech Segmentation in High Noise Condition.
Mario Refice, Michelina Savino, Marco Altieri, Roberto Altieri, SegWin: a Tool for Segmenting, Annotating, and Controlling the Creation of a Database of Spoken Italian Varieties.
Kallirroi Georgila, Nikos Fakotakis, George Kokkinakis, A Graphical Parametric Language-Independent Tool for the Annotation of Speech Corpora.
David Portabella, Albert Febrer, Asunción Moreno, NaniTrans: a Speech Labeling Tool.
L. Cristoforetti, M. Matassoni, M. Omologo, P. Svaizer, E. Zovato, Annotation of a Multichannel Noisy Speech Corpus.
Marcela Charfuelán, José Relaño Gil, M. Carmen Rodríguez Gancedo, Daniel Tapias Merino, Luis Hernández Gómez, Dialogue Annotation for Language Systems Evaluation.
Laila Dybkjær, Morten Baun Møller, Niels Ole Bernsen, Michael Grosse, Martin Olsen, Amanda Schiffrin, Annotating Communication Problems Using the MATE Workbench.
Amy Isard, David McKelvie, Andreas Mengel, Morten Baun Møller, The MATE Workbench Annotation Tool, a Technical Description.
Marc Swerts, Emiel Krahmer, On the Use of Prosody for On-line Evaluation of Spoken Dialogue Systems.
Cosmin Munteanu, Marian Boldea, MDWOZ: A Wizard of Oz Environment for Dialog Systems Development.
Susanne J. Jekat, Lorenzo Tessiore, End-to-End Evaluation of Machine Interpretation Systems: A Graphical Evaluation Tool.
Giorgio Micca, Alessandra Frasca, Maria Gabriella Di Benedetto, Cross-lingual Interpolation of Speech Recognition Models.
Session SP5: Multimodal-Multi media Resources and Tools
Chair: D.M. Roy
Albert Russel, Hennie Brugman, Daan Broeder, Peter Wittenburg, The EUDICO Project, Multi media Annotation over the Internet.
D. Broeder, H. Brugman, A. Russel, R. Skiba, P. Wittenburg, Towards a Standard for Meta-descriptions of Language Resources.
Steven Bird, David Day, John Garofolo, John Henderson, Christophe Laprun, Mark Liberman, ATLAS: A Flexible and Extensible Architecture for Linguistic Annotation.
Pavel Skrelin, Tatiana Sherstinova, Models of Russian Text/Speech Interactive Databases for Supporting of Scientific, Practical and Cultural Researches.
Dafydd Gibbon, Thorsten Trippel, A Multi-view Hyperlexicon Resource for Speech and Language System Development.
Giovanna Turrini, Laura Cignoni, Alessandro Paccosi, Addizionario: an Interactive Hypermedia Tool for Language Learning.
Session EP1: Evaluation and Written Area
Chair: P. Langlais
Amit Bagga, Enhancing the TDT Tracking Evaluation.
John A. Bateman, Anthony F. Hartley, Target Suites for Evaluating the Coverage of Text Generators.
Atsushi Fujii, Tetsuya Ishikawa, A Novelty-based Evaluation Method for Information Retrieval.
Hervé Déjean, How To Evaluate and Compare Tagsets? A Proposal.
Gees C. Stein, Tomek Strzalkowski, G. Bowden Wise, Amit Bagga, Evaluating Summaries for Multiple Documents in an Interactive Environment.
Paola Merlo, Suzanne Stevenson, Establishing the Upper Bound and Inter-judge Agreement of a Verb Classification Task.
Richard F. E. Sutcliffe, Sadao Kurohashi, A Parallel English-Japanese Query Collection for the Evaluation of On-Line Help Systems.
Malgorzata Marciniak, Agnieszka Mykowiecka, Anna Kupsc, Adam Przepiórkowski, An HPSG-Annotated Test Suite for Polish.
Judith L. Klavans, Nina Wacholder, David K. Evans, Evaluation of Computational Linguistic Techniques for Identifying Significant Topics for Browsing Applications.

Session WP7: Corpus Projects
Chair: R. Grishman
Jaroslava Hlavácová, Rarity of Words in a Language and in a Corpus.
Georges Vignaux, The PAROLE Program.
Maria Fernanda Bacelar do Nascimento, Luisa Pereira, João Saramago, Portuguese Corpora at CLUL.
Serge A. Yablonsky, Russian Monitor Corpora: Composition, Linguistic Encoding and Internet Publication.
Dan Bohus, Marian Boldea, A Web-based Text Corpora Development System.
Marilyn Mason, Issues from Corpus Analysis that have influenced the On-going Development of Various Haitian Creole Text- and Speech-based NLP Systems and Applications.

Session WP8: Corpus Tools
Chair: R. Zajac
Janne Bondi Johannessen, Anders Nøklestad, Kristin Hagen, A Web-based Advanced and User Friendly System: The Oslo Corpus of Tagged Norwegian Texts.
Young-Soog Chae, Key-Sun Choi, Introduction of KIBS (Korean Information Base System) Project.
Nick Hatzigeorgiu, Maria Gavrilidou, Stelios Piperidis, George Carayannis, Anastasia Papakostopoulou, Athanassia Spiliotopoulou, Anna Vacalopoulou, Penny Labropoulou, Elena Mantzari, Harris Papageorgiou, Iason Demiros, Design and Implementation of the Online ILSP Greek Corpus.
Kiril Ribarov, The (Un)Deterministic Nature of Morphological Context.
Saturnino Luz, A Software Toolkit for Sharing and Accessing Corpora Over the Internet.
E. Kavallieratou, N. Liolios, E. Koutsogeorgos, N. Fakotakis, G. Kokkinakis, GRUHD: A Greek database of Unconstrained Handwriting.

Session WP9: Applications using Written Language Resources
Chair: G. Stainhaouer
John Bateman, Elke Teich, Geert-Jan Kruijff, Ivanna Kruijff-Korbayová, Serge Sharoff, Hana Skoumalová, Resources for Multilingual Text Generation in Three Slavic Languages.
Felisa Verdejo, Julio Gonzalo, Anselmo Peñas, Fernando López, David Fernández, Evaluating Wordnets in Cross-language Information Retrieval: the ITEM Search Engine.
Catia Cucchiarini, Johan Van Hoorde, Elisabeth D'Halleweyn, NL-Translex: Machine Translation for Dutch.
Kyongho Min, William H. Wilson, Yoo-Jin Moon, Typographical and Orthographical Spelling Error Correction.
Constandina Economou, Spyros Raptis, Gregory Stainhaouer, LEXIPLOIGISSI: An Educational Platform for the Teaching of Terminology in Greece.
Kosho Shudo, Masahito Takahashi, Yasuo Koyama, Kenji Yoshimura, Collocations as Word Co-occurrence Restriction Data - An Application to Japanese Word Processor-.

13:40-15:00 Lunch

15:00-17:20 4 Sessions in parallel

Session SO6: Speech Recognition and Related Issues
Chair: H. J. M. Steeneken
Maria Canelli, Daniele Grasso, Margaret King, Methods and Metrics for the Evaluation of Dictation Systems: a Case Study.
Alvin Martin, Mark Przybocki, Design Issues in Text-Independent Speaker Recognition Evaluation.
Stavroula-Evita Fotinea, Ioannis Dologlou, Stylianos Bakamidis, Gregory Stainhaouer, George Carayannis, Perceptual Evaluation of a New Subband Low Bit Rate Speech Compression System based on Waveform Vector Quantization and SVD Postfiltering.
Katsunobu Itou, Kiyohiro Shikano, Tatsuya Kawahara, Kazuya Takeda, Atsushi Yamada, Akinori Itou, Takehito Utsuro, Tetsunori Kobayashi, Nobuaki Minematsu, Mikio Yamamoto, Shigeki Sagayama, Akinobu Lee, IPA Japanese Dictation Free Software Project.
Finn Tore Johansen, Narada Warakagoda, Børge Lindberg, Gunnar Lehtinen, Zdravko Kacic, Andrej Zgank, Kjell Elenius, Giampiero Salvi, The COST 249 SpeechDat Multilingual Reference Recognizer.
Klaus Bengler, Automotive Speech-Recognition - Success Conditions Beyond Recognition Rates.
Laurie E. Damianos, Jill Drury, Tari Fanderclai, Lynette Hirschman, Jeff Kurtz, Beatrice Oshika, Evaluating Multi-party Multi-modal Systems.
Session WO17: Semantic Lexicons
Chair: D. Theologitis
Adam Kilgarriff, Colin Yallop, What's in a Thesaurus?
Nuria Bel, Federica Busa, Nicoletta Calzolari, Elisabetta Gola, Alessandro Lenci, Monica Monachini, Antoine Ogonowski, Ivonne Peters, Wim Peters, Nilda Ruimy, Marta Villegas, Antonio Zampolli, SIMPLE: A General Framework for the Development of Multilingual Lexicons.
Ivonne Peters, Wim Peters, The Treatment of Adjectives in SIMPLE: Theoretical Observations.
Wim Peters, Ivonne Peters, Lexicalised Systematic Polysemy in WordNet.
Dimitrios Kokkinakis, Maria Toporowska Gronostaj, Karin Warmenius, Annotating, Disambiguating & Automatically Extending the Coverage of the Swedish SIMPLE Lexicon.
Bolette Sandford Pedersen, Sanni Nimb, Semantic Encoding of Danish Verbs in SIMPLE - Adapting a Verb Framed Model to a Satellite-framed Language.
Bernardo Magnini, Gabriela Cavaglià, Integrating Subject Field Codes into WordNet.
Session WO18: Morphology in Lexical and Textual Resources
Chair: N. Bel
Dan Tufis, Péter Dienes, Csaba Oravecz, Tamás Váradi, Principled Hidden Tagset Design for Tiered Tagging of Hungarian.
Frank Van Eynde, Jakub Zavrel, Walter Daelemans, Part of Speech Tagging and Lemmatization for the Spoken Dutch Corpus.
Thorsten Brants, Inter-annotator Agreement for a German Newspaper Corpus.
Davide Turcato, Janine Toole, Stavroula Tsiplakou, Trude Heift, Paul McFetridge, An Approach to Lexical Development for Inflectional Languages.
Fiammetta Namer, Georgette Dal, GéDériF: Automatic Generation and Analysis of Morphologically Constructed Lexical Resources.
Harris Papageorgiou, Prokopis Prokopidis, Voula Giouli, Stelios Piperidis, A Unified POS Tagging Architecture and its Application to Greek.
Jana Klímová, Jan Kocek, Karel Oliva, Derivation in the Czech National Corpus.
Session EO5: Information Retrieval and Question Answering Evaluation
Chair: S. Markantonatou
Martin Braschler, Donna Harman, Michael Hess, Michael Kluck, Carol Peters, Peter Schäuble, The Evaluation of Systems for Cross-language Information Retrieval.
Satoshi Sekine, Hitoshi Isahara, IREX: IR & IE Evaluation Project in Japanese.
Patrick Kremer, Laurent Schmitt, Textual Information Retrieval Systems Test: The Point of View of an Organizer and Corpuses Provider.
Charles L. Wayne, Multilingual Topic Detection and Tracking: Successful Research Enabled by Corpora and Evaluation.
Eric J. Breck, John D. Burger, Lisa Ferro, Lynette Hirschman, David House, Marc Light, Inderjeet Mani, How to Evaluate Your Question Answering System Every Day... and Still Get Real Work Done.
Ellen M. Voorhees, Dawn M. Tice, The TREC-8 Question Answering Track.
Christine Michel, Cardinal, Nominal or Ordinal Similarity Measures in Comparative Evaluation of Information Retrieval Process.

17:20-17:40 Coffee Break

17:40-18:30 Closing Session

21:00 GALA - Hilton Hotel