PressMint QuickCheck: Operationalising Readiness Diagnostics for Interoperable Historical Newspaper Corpora
Proceedings of the First Workshop on Creating Interoperable Corpora of Historical Newspapers
Abstract
PressMint QuickCheck is a lightweight, reproducible readiness diagnostic for historical newspaper collections. Given a candidate dataset (a ZIP export or IIIF manifests), it detects which components are present, identifies interoperability-critical metadata gaps, and applies basic OCR sanity checks. It produces three standardised artefacts: a human-readable readiness report, a minimal normalised manifest (CSV), and a tentative v1 scorecard (suitability_score, 0-4) for prioritising across collections. The workflow is delivered as a Colab-first notebook that requires no local installation. A key design decision treats content_language and metadata_language declarations as first-class interoperability signals, reflecting the multilingual scope of the PressMint and ParlaMint corpus projects.
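As a rough illustration of the kind of check described above, the following Python sketch flags metadata gaps in one manifest record, applies a crude OCR sanity test, and combines both into a 0-4 score. It is not the QuickCheck implementation: apart from content_language, metadata_language and suitability_score, every name here (REQUIRED_FIELDS, title, issue_date, metadata_gaps, ocr_looks_sane, the 0.6 threshold) is hypothetical, and the scoring rule is an assumption made only to keep the example concrete.

```python
# Hypothetical required fields; only content_language and metadata_language
# are named in the abstract, the other two are placeholders.
REQUIRED_FIELDS = ("title", "issue_date", "content_language", "metadata_language")


def metadata_gaps(record: dict) -> list[str]:
    """Return the required fields that are missing or blank in one record."""
    return [f for f in REQUIRED_FIELDS if not str(record.get(f) or "").strip()]


def ocr_looks_sane(text: str, min_alpha_ratio: float = 0.6) -> bool:
    """Crude check: share of alphabetic characters among non-space characters."""
    chars = [c for c in text if not c.isspace()]
    if not chars:
        return False
    return sum(c.isalpha() for c in chars) / len(chars) >= min_alpha_ratio


def suitability_score(record: dict, ocr_text: str) -> int:
    """Illustrative 0-4 score; the actual v1 scorecard rules are assumed, not known."""
    gaps = metadata_gaps(record)
    score = 0
    score += 1 if "content_language" not in gaps else 0   # content language declared
    score += 1 if "metadata_language" not in gaps else 0  # metadata language declared
    score += 1 if not gaps else 0                         # no required field missing
    score += 1 if ocr_looks_sane(ocr_text) else 0         # OCR text passes sanity check
    return score


if __name__ == "__main__":
    record = {
        "title": "Gazette",
        "issue_date": "1890-01-01",
        "content_language": "sl",
        "metadata_language": "en",
    }
    print(metadata_gaps(record), suitability_score(record, "Ljubljana, 1. januarja"))
```

Counting the two language declarations as separate points mirrors the decision to treat them as first-class signals; a real scorecard could weight them differently.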