Syntactic Analysis in the Spoken Dutch Corpus (CGN)
Proceedings of the Third International Conference on Language Resources and Evaluation (LREC 2002)
Abstract
This paper describes the syntactic annotation of the Spoken Dutch Corpus ("Corpus Gesproken Nederlands" or CGN), a Dutch-Flemish project (1998-2003) that aims to collect, describe and annotate ten million words of spoken Dutch. The first part discusses the background of the parsing strategy and some details of how the parsing process is actually implemented. The second part presents some examples of practical applications of the parsing results.