Providing Internet Access to Portuguese Corpora: the AC/DC Project
Proceedings of the Second International Conference on Language Resources and Evaluation (LREC 2000)
Abstract
In this paper we report on the work of the Computational Processing of Portuguese project (Processamento computacional do português) on providing access to Portuguese corpora through the Internet. One of its activities, the AC/DC project (Acesso a corpora/Disponibilização de Corpora, roughly "Access and Availability of Corpora"), allows a user to query around 40 million words of Portuguese text. After describing the aims of the service, which is still undergoing regular improvements, we focus on the process of tagging and parsing the underlying corpora using a Constraint Grammar parser for Portuguese.