
Bayesian Semantics Incorporation to Web Content for Natural Language Information Retrieval

Proceedings of the Fourth International Conference on Language Resources and Evaluation (LREC 2004)

DOI:10.63317/3nhibi65xvmo

Abstract

This work addresses information retrieval over Web content using natural language queries. Current markup languages and formalisms do not fully provide mechanisms for effective and accurate analysis of Web content; instead, they describe the content in a human-centric way. As a result, Internet search engines cannot handle natural language queries. Other approaches use grammar markup labels that attempt to fully match an unforeseen query. In this paper, we present the theoretical and implementation issues of a novel statistical framework for Web content analysis and information retrieval using natural language. The framework is based on Bayesian networks, a tool for knowledge representation and reasoning under uncertainty. The Web page designer provides the lexical items that contain useful information and labels each with its semantic interpretation, chosen from a predefined set of domain categories. This knowledge is used to learn the structure and the parameters of a Bayesian network. When a user’s query arrives, the network is used to return the pages whose semantic content is most closely related to the query.
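The retrieval idea in the abstract can be illustrated with a minimal sketch. It is not the authors' method: it swaps in a naive Bayes classifier (the simplest fixed-structure Bayesian network) and skips structure learning entirely. The annotations, category names, page identifiers, and function names below are all hypothetical, invented only for illustration.

```python
import math
from collections import Counter, defaultdict

# Hypothetical designer annotations: (lexical item, semantic category, page).
ANNOTATIONS = [
    ("flight", "travel", "page_a"),
    ("ticket", "travel", "page_a"),
    ("hotel", "travel", "page_a"),
    ("ticket", "events", "page_b"),
    ("concert", "events", "page_b"),
    ("stage", "events", "page_b"),
]


def train(annotations):
    """Collect the counts needed for P(category) and P(word | category)."""
    cat_counts = Counter(cat for _, cat, _ in annotations)
    word_counts = defaultdict(Counter)
    pages = defaultdict(set)
    for word, cat, page in annotations:
        word_counts[cat][word] += 1
        pages[cat].add(page)
    vocab = {word for word, _, _ in annotations}
    return cat_counts, word_counts, pages, vocab


def posterior(query, model):
    """Return P(category | query words) under the naive Bayes assumption."""
    cat_counts, word_counts, _, vocab = model
    total = sum(cat_counts.values())
    log_scores = {}
    for cat, n in cat_counts.items():
        score = math.log(n / total)  # log prior
        for word in query.lower().split():
            if word not in vocab:
                continue  # words the designer never labelled carry no evidence
            # Laplace-smoothed likelihood of the word given the category.
            score += math.log((word_counts[cat][word] + 1) / (n + len(vocab)))
        log_scores[cat] = score
    # Normalise in a numerically stable way.
    top = max(log_scores.values())
    weights = {cat: math.exp(s - top) for cat, s in log_scores.items()}
    z = sum(weights.values())
    return {cat: w / z for cat, w in weights.items()}


def rank_pages(query, model):
    """Order pages by the posterior mass of the categories they are labelled with."""
    _, _, pages, _ = model
    scores = Counter()
    for cat, p in posterior(query, model).items():
        for page in pages[cat]:
            scores[page] += p
    return [page for page, _ in scores.most_common()]


MODEL = train(ANNOTATIONS)
print(rank_pages("where can I buy a flight ticket", MODEL))
```

In this toy setup, the ambiguous word "ticket" is labelled under both categories, so the remaining query words decide which category, and therefore which page, ranks first. A full Bayesian network as described in the abstract would also learn dependencies between lexical items rather than assume them independent.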

Details

Paper ID
lrec2004-main-356
Pages
N/A
BibKey
maragoudakis-fakotakis-2004-bayesian
Editor
N/A
Publisher
European Language Resources Association (ELRA)
ISSN
2522-2686
ISBN
2-9517408-1-6
Conference
Fourth International Conference on Language Resources and Evaluation
Location
Lisbon, Portugal
Date
26-28 May 2004

Authors

  • Manolis Maragoudakis

  • Nikos Fakotakis
