Trove newspapers with non-English language content
Updated: 13 September 2024
This dataset contains information about newspapers published in languages other than English that have been digitised and made available through Trove. Data about the languages present in each newspaper was generated by harvesting a sample of articles from that newspaper using the Trove API, and then running language detection software on the OCR'd text of each article.
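As a rough illustration of the last step, the sketch below turns a list of per-article language codes into per-language proportions. It assumes that harvesting and language detection have already happened; the function name `language_proportions` and the sample codes are hypothetical and are not taken from the code that built this dataset.

```python
from collections import Counter


def language_proportions(detected_codes):
    """Return {language_code: proportion} for a list of per-article codes.

    `detected_codes` holds one detected language code per sampled article.
    """
    total = len(detected_codes)
    if total == 0:
        return {}
    counts = Counter(detected_codes)
    return {code: count / total for code, count in counts.items()}


# Made-up sample: four articles, three detected as German, one as English.
sample = ["de", "de", "en", "de"]
proportions = language_proportions(sample)
```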
Files
newspapers_non_english.csv
The dataset contains the following columns:
Column | Contents |
---|---|
id | newspaper id |
title | newspaper title |
language | language code |
proportion | proportion of articles in this language |
number | number of articles sampled |
language_full | full language name |
Download from GitHub Explore in Datasette
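A minimal sketch of reading the file with Python's standard `csv` module and listing newspapers where a given language dominates. The column names come from the table above; the sample rows, the helper names `load_rows` and `titles_for_language`, and the 0.5 threshold are illustrative assumptions, not values from the dataset.

```python
import csv
import io

# Made-up rows that follow the documented column layout.
SAMPLE_CSV = """id,title,language,proportion,number,language_full
1,Example German Paper,de,0.75,100,German
1,Example German Paper,en,0.25,100,English
2,Example Italian Paper,it,0.8,50,Italian
"""


def load_rows(handle):
    """Read rows from an open CSV handle, converting the numeric columns."""
    rows = []
    for row in csv.DictReader(handle):
        row["proportion"] = float(row["proportion"])
        row["number"] = int(row["number"])
        rows.append(row)
    return rows


def titles_for_language(rows, code, min_proportion=0.5):
    """Return sorted titles where `code` reaches at least `min_proportion`."""
    return sorted(
        {
            row["title"]
            for row in rows
            if row["language"] == code and row["proportion"] >= min_proportion
        }
    )


rows = load_rows(io.StringIO(SAMPLE_CSV))
german_titles = titles_for_language(rows, "de")
```

To run this against the downloaded file, pass `open("newspapers_non_english.csv", newline="", encoding="utf-8")` to `load_rows` in place of the in-memory sample.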
non-english-newspapers.md
This is a markdown-formatted list created by grouping the dataset by newspaper title. It includes details of the main languages in each newspaper.
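The grouping described here could be sketched as follows. The exact layout of `non-english-newspapers.md` is not specified in this page, so the nested bullet format, the function name `to_markdown`, and the sample rows are all assumptions made for illustration.

```python
from collections import defaultdict


def to_markdown(rows):
    """Group rows by newspaper title and render a nested markdown list."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row["title"]].append(row)

    lines = []
    for title in sorted(grouped):
        lines.append(f"- {title}")
        # List each newspaper's languages from most to least common.
        by_share = sorted(grouped[title], key=lambda r: r["proportion"], reverse=True)
        for row in by_share:
            lines.append(f"  - {row['language_full']}: {row['proportion']:.0%}")
    return "\n".join(lines)


# Made-up rows using the dataset's column names.
sample_rows = [
    {"title": "Example German Paper", "language_full": "English", "proportion": 0.25},
    {"title": "Example German Paper", "language_full": "German", "proportion": 0.75},
]
markdown = to_markdown(sample_rows)
```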
Generated by
Getting help
Cite as
Sherratt, Tim. (2024). GLAM-Workbench/trove-newspapers-non-english (version v1.1). Zenodo. https://doi.org/10.5281/zenodo.13761509