DiNoS: Creating a Data-Driven German Noun Phrase Lexicon from Universal Dependencies
To foster investigations of noun phrase (NP) inflection in German at scale, this paper introduces DiNoS (Distributional Noun Structure), a data-driven lexicon of NP heads, which includes statistical information on the dependents and the morphosyntactic features of their original in-context appearances. We make available the source code for the extraction of NPs from CoNLL-U treebanks, which includes rule-based heuristics to improve feature annotation coverage and ensures a homogeneous lemmatisation strategy across treebanks. While the resulting JSON-based lexicon is suitable for no-code interaction for non-experts, it is further supported by a toolkit for the automatic calculation of, and access to, various statistical overviews. In this paper, we present the heuristics employed to extract NP datasets from the German Universal Dependencies’ Hamburg Dependency and GSD treebanks. In addition, we provide a preview of the emerging DiNoS lexica’s properties and discuss some implications of noun and determiner word form ambiguity for NP complexity.
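To make the extraction idea concrete, here is a minimal sketch of how noun heads, their morphosyntactic features and their direct dependents could be collected from CoNLL-U data and stored as JSON. It is not the DiNoS source code: the function names `parse_conllu` and `build_lexicon`, the entry layout (`count`, `feats`, `dependents`) and the hand-written sample sentence are assumptions made purely for illustration, and the rule-based heuristics and lemmatisation harmonisation described in the abstract are not reproduced.

```python
import json
from collections import Counter, defaultdict

# Tiny hand-written CoNLL-U sample (10 tab-separated columns per token line).
# It is illustrative only and not taken from the HDT or GSD treebanks.
SAMPLE = (
    "# text = Der alte Hund schläft.\n"
    "1\tDer\tder\tDET\tART\tCase=Nom|Gender=Masc|Number=Sing\t3\tdet\t_\t_\n"
    "2\talte\talt\tADJ\tADJA\tCase=Nom|Gender=Masc|Number=Sing\t3\tamod\t_\t_\n"
    "3\tHund\tHund\tNOUN\tNN\tCase=Nom|Gender=Masc|Number=Sing\t4\tnsubj\t_\t_\n"
    "4\tschläft\tschlafen\tVERB\tVVFIN\t_\t0\troot\t_\t_\n"
    "5\t.\t.\tPUNCT\t$.\t_\t4\tpunct\t_\t_\n"
    "\n"
)


def parse_conllu(text):
    """Yield sentences as lists of token dicts.

    Comment lines, multiword-token ranges (e.g. "1-2") and empty nodes
    (e.g. "1.1") are skipped; only regular integer-ID tokens are kept.
    """
    sentence = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            if sentence:
                yield sentence
                sentence = []
            continue
        if line.startswith("#"):
            continue
        cols = line.split("\t")
        if len(cols) != 10 or not cols[0].isdigit():
            continue
        sentence.append({
            "id": int(cols[0]),
            "form": cols[1],
            "lemma": cols[2],
            "upos": cols[3],
            "feats": cols[5],
            "head": int(cols[6]) if cols[6].isdigit() else None,
            "deprel": cols[7],
        })
    if sentence:
        yield sentence


def build_lexicon(sentences):
    """Aggregate per-lemma counts of features and direct dependents for NOUN heads."""
    lexicon = defaultdict(
        lambda: {"count": 0, "feats": Counter(), "dependents": Counter()}
    )
    for sentence in sentences:
        for tok in sentence:
            if tok["upos"] != "NOUN":
                continue
            entry = lexicon[tok["lemma"]]
            entry["count"] += 1
            entry["feats"][tok["feats"]] += 1
            # Direct dependents are the tokens whose HEAD column points at this noun.
            for dep in sentence:
                if dep["head"] == tok["id"]:
                    key = dep["deprel"] + ":" + dep["form"].lower()
                    entry["dependents"][key] += 1
    # Convert Counters to plain dicts so the result is JSON-serialisable.
    return {
        lemma: {
            "count": e["count"],
            "feats": dict(e["feats"]),
            "dependents": dict(e["dependents"]),
        }
        for lemma, e in lexicon.items()
    }


lexicon = build_lexicon(parse_conllu(SAMPLE))
print(json.dumps(lexicon, ensure_ascii=False, indent=2))
```

Keying dependents by relation and lower-cased word form is one simple way to keep form-level ambiguity, such as a determiner form that serves several case and gender combinations, visible in the counts. A real lexicon would likely need a richer structure than this sketch.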