FrNewsLink: a corpus linking TV Broadcast News Segments and Press Articles

Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)

DOI:10.63317/592zva5xsfoj

Abstract

In this article, we describe FrNewsLink, a publicly available corpus that supports several applicative tasks. It gathers several resources from TV Broadcast News (TVBN) shows and press articles: automatic transcriptions of TVBN shows, text extracted from online press articles, manual annotations for topic segmentation of TVBN shows, and linking information between topic segments and press articles. The FrNewsLink corpus is based on 112 TVBN shows recorded during two periods in 2014 and 2015. Over the same periods, a set of 24.7k press articles was gathered. Beyond topic segmentation, this corpus supports the study of semantic similarity and multimedia news linking.

Details

Paper ID
lrec2018-main-329
Pages
N/A
BibKey
camelin-etal-2018-frnewslink
Editor
N/A
Publisher
European Language Resources Association (ELRA)
ISSN
2522-2686
ISBN
979-10-95546-00-9
Conference
Eleventh International Conference on Language Resources and Evaluation
Location
Miyazaki, Japan
Date
7 May 2018 to 12 May 2018

Authors

  • Nathalie Camelin
  • Géraldine Damnati
  • Abdessalam Bouchekif
  • Anais Landeau
  • Delphine Charlet
  • Yannick Estève

Links