Harvesting data about a domain using the IA CDX API

Works with IA

In this notebook we'll look at how to get domain-level data from the Internet Archive (IA) CDX API. Most other notebooks that use the CDX API harvest data into memory and save it to disk later on. Because a whole domain can return much larger quantities of data, we're going to reverse this and save harvested data to disk as we download it.
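
The save-as-you-go approach can be sketched roughly as follows. This is a minimal illustration, not the notebook's own code: the function names (`fetch_page`, `split_batch`, `harvest`) and the newline-delimited JSON output format are assumptions made for this example. It relies on the CDX server's `matchType=domain`, `output=json`, `limit`, `showResumeKey`, and `resumeKey` parameters, and it assumes that a JSON response with more results ends with an empty row followed by a one-item row holding the resume key. How the key should be encoded when it is sent back is worth checking against the CDX server documentation.

```python
import json
import urllib.parse
import urllib.request

CDX_URL = "http://web.archive.org/cdx/search/cdx"


def split_batch(data):
    """Split a JSON response into (header, rows, next_resume_key).

    Assumed layout when more results remain: [header, row, ..., [], [key]].
    """
    if not data:
        return [], [], None
    next_key = None
    if len(data) >= 2 and data[-2] == [] and len(data[-1]) == 1:
        next_key = data[-1][0]
        data = data[:-2]
    return data[0], data[1:], next_key


def fetch_page(domain, resume_key=None, limit=1000):
    """Request one batch of captures for a domain and its subdomains."""
    params = {
        "url": domain,
        "matchType": "domain",
        "output": "json",
        "limit": limit,
        "showResumeKey": "true",
    }
    if resume_key:
        # Assumption: the key can be passed back as an ordinary parameter.
        params["resumeKey"] = resume_key
    url = f"{CDX_URL}?{urllib.parse.urlencode(params)}"
    with urllib.request.urlopen(url, timeout=60) as response:
        return split_batch(json.load(response))


def harvest(domain, path, fetch=fetch_page):
    """Write each batch to disk as it arrives, one JSON object per line."""
    total = 0
    resume_key = None
    with open(path, "w", encoding="utf-8") as out:
        while True:
            header, rows, resume_key = fetch(domain, resume_key)
            for row in rows:
                out.write(json.dumps(dict(zip(header, row))) + "\n")
            total += len(rows)
            if not resume_key:
                break
    return total
```

Calling `harvest("example.com", "example-com-cdx.ndjson")` would then write captures to the file batch by batch, so only one batch is held in memory at a time. The `fetch` argument is there so the loop can be exercised with a stand-in function instead of live requests.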

Cite as

Sherratt, Tim; Jackson, Andrew & Bickford, Jake. (2023). GLAM-Workbench/web-archives (version v1.2.0). Zenodo. https://doi.org/10.5281/zenodo.7898218

Section sponsor

The Web Archives section of the GLAM Workbench is sponsored by the British Library.