Summary of the paper

Title The U.S. Policy Agenda Legislation Corpus Volume 1 - a Language Resource from 1947 - 1998
Authors Stephen Purpura, John Wilkerson and Dustin Hillard
Abstract We introduce the corpus of United States Congressional bills from 1947 to 1998 for use by language research communities. The U.S. Policy Agenda Legislation Corpus Volume 1 (USPALCV1) includes more than 375,000 legislative bills annotated with a hierarchical policy area category. The human annotations in USPALCV1 have been reliably applied over time to enable social science analysis of legislative trends. The corpus is a member of an emerging family of corpora that are annotated by policy area to enable comparative parallel trend recognition across countries and domains (legislation, political speeches, newswire articles, budgetary expenditures, web sites, etc.). This paper describes the origins of the corpus, its creation, ways to access it, design criteria, and an analysis with common supervised machine learning methods. The use of machine learning methods establishes a proposed baseline for modeling the topic classification of legal documents.
Language Single language
Topics Corpus (creation, annotation, etc.), Document Classification, Text categorisation, Text mining
Full paper The U.S. Policy Agenda Legislation Corpus Volume 1 - a Language Resource from 1947 - 1998
Slides -
Bibtex @InProceedings{PURPURA08.105,
  author = {Stephen Purpura and John Wilkerson and Dustin Hillard},
  title = {The U.S. Policy Agenda Legislation Corpus Volume 1 - a Language Resource from 1947 - 1998},
  booktitle = {Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)},
  year = {2008},
  month = {may},
  date = {28-30},
  address = {Marrakech, Morocco},
  editor = {Nicoletta Calzolari (Conference Chair), Khalid Choukri, Bente Maegaard, Joseph Mariani, Jan Odijk, Stelios Piperidis, Daniel Tapias},
  publisher = {European Language Resources Association (ELRA)},
  isbn = {2-9517408-4-0},
  note = {http://www.lrec-conf.org/proceedings/lrec2008/},
  language = {english}
  }
