Customizing Knowledge in NLP Applications:
Strategies, Issues, and Evaluation


Motivation and Aims

The workshop addresses one of the key challenges in the development of NLP applications: bridging a generic framework with domain- and task-specific requirements. The issue reduces to the problem of customizing linguistic software and the degree to which this effort can be limited once a given component (e.g., a grammar, lexicon, thesaurus, interpreter, or tagger) is used across applications and across domains.

As natural language products play an increasingly prominent role in the market for knowledge management, information extraction, search, and navigation, technology suppliers still struggle to produce high-quality software that can be deployed to new domains and relatively similar tasks quickly and within budget.

The issue of genericity versus specificity plays out in a number of areas; the key to solving the problem is identifying whether the dilemma has a particular locus: the architecture of the system, the language resources, or the application components.

The problem of customization directly affects the architecture, the development process, and the evaluation benchmarks for NLP systems. Although the notion of a "knowledge bottleneck" has mainly been attached to NLP systems relying on knowledge representation strategies, even statistical NLP systems suffer from a customization problem.

The goal of the workshop is to emphasize the tension as well as the potential for cross-fertilization between knowledge-based and corpus-based approaches to customization. While both approaches are needed, the key issue is how to reconcile potential contrasts and define the optimal balance between the two in order to maximise benefits for the content creation, management, and delivery applications in focus, such as categorization, search, navigation, retrieval, extraction, personalization, and generation.

Topics of Interest

The workshop aims at bringing together people from both academia and industry to address the variety of topics in the areas of customization, knowledge representation and acquisition, and metrics for measuring complexity. We invite submissions of papers in all areas of customization of NLP components, including, but not limited to, the following topics:

Submission Details

Papers should be submitted electronically to r-knippen@attglobal.net in Word or PostScript format. Papers should be no longer than 3,000 words, including the abstract. Contributors should also provide their affiliation and email contact.
Presentations will be allotted 20 minutes, followed by a 10-minute discussion.
Upon notification of acceptance, authors will be provided with the LREC stylesheet and should make any necessary formatting changes for the camera-ready version to be published in the proceedings.

Important Dates

Deadline for workshop submission: 25th February 2002
Notification of acceptance: 15th March 2002
Final version of paper for workshop proceedings: 8th April 2002
Workshop: 28th May 2002 (morning session)

Organizing Committee

Federica Busa, Webegg, federica_busa@yahoo.com
Evelyne Viegas, Microsoft Corporation, evelynev@microsoft.com
Antonio Sanfilippo, Sra International, antonio_sanfilippo@sra.com
Robert Knippen, LingoMotors Inc., r-knippen@attglobal.net
Connie Parkes, Dictaphone, Cornelia.Parkes@dictaphone.com
Saliha Azzam, Microsoft Corporation, salihaa@microsoft.com
Piek Vossen, Irion Technologies
Remi Zajac, Systran Corporation

Workshop Registration Fees

The registration fees for the workshop are: The fees cover the following services: a copy of the proceedings of the attended workshop, coffee-breaks and refreshments.
Participation in the workshop is limited by the venue. Requests for participation will be processed on a first-come, first-served basis. Registration will be handled by the LREC Secretariat.

For any further questions related to the workshop, please email Federica Busa at federica_busa@yahoo.com.