LREC 2020, Main Conference

CCNet: Extracting High Quality Monolingual Datasets from Web Crawl Data

Proceedings of the Twelfth International Conference on Language Resources and Evaluation (LREC 2020)

DOI:10.63317/4g79sew7si35

Abstract

No abstract available.

Details

Paper ID
lrec2020-main-494
Pages
pp. 4003-4012
BibKey
wenzek-etal-2020-ccnet
Editor
N/A
Publisher
European Language Resources Association (ELRA)
ISSN
2522-2686
ISBN
979-10-95546-34-4
Conference
Twelfth Language Resources and Evaluation Conference
Location
Marseille, France
Date
11-16 May 2020

Authors

  • Guillaume Wenzek

  • Marie-Anne Lachaux

  • Alexis Conneau

  • Vishrav Chaudhary

  • Francisco Guzmán

  • Armand Joulin

  • Edouard Grave
