Analyse rates of OCR correction
The full text of newspaper articles in Trove is extracted from page images using Optical Character Recognition (OCR). The accuracy of the OCR process is influenced by a range of factors including the font and the quality of the images. Many errors slip through. Volunteers have done a remarkable job in correcting these errors, but it's a huge task. This notebook explores the scale of OCR correction in Trove.
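As a minimal sketch of what "scale of OCR correction" means in practice, the rate can be expressed as the share of articles that have at least one correction. The function name `correction_rate` and the counts below are hypothetical placeholders for illustration; they are not real Trove figures and this is not necessarily how the notebook computes its results.

```python
def correction_rate(corrected: int, total: int) -> float:
    """Return the fraction of articles that have at least one correction."""
    if total <= 0:
        raise ValueError("total must be a positive number of articles")
    if not 0 <= corrected <= total:
        raise ValueError("corrected must be between 0 and total")
    return corrected / total


# Hypothetical (corrected, total) article counts per year, for illustration only.
yearly_counts = {
    1900: (1200, 10000),
    1950: (300, 12000),
}

# Compute the correction rate for each year.
rates = {
    year: correction_rate(corrected, total)
    for year, (corrected, total) in yearly_counts.items()
}

for year, rate in sorted(rates.items()):
    print(f"{year}: {rate:.1%} of articles have corrections")
```

Comparing rates rather than raw counts makes years with very different numbers of digitised articles directly comparable.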
Other options
- Run live on Binder (no authentication required)
- Download from GitHub
- View using NBViewer
Additional documentation
Getting help
Cite as
Sherratt, Tim. (2022). GLAM-Workbench/trove-newspapers (version v1.3.4). Zenodo. https://doi.org/10.5281/zenodo.6746078