Exploring subdomains in the whole of gov.au

Works with IA

[Image: dendrogram of the gov.au domain hierarchy]

Most of the notebooks in this repository work with small slices of web archive data. In this notebook we'll scale things up to try to find all of the subdomains that have existed in the gov.au domain. As in other notebooks, we'll obtain the data by querying the Internet Archive's CDX API. The only real difference is that it will take some hours to harvest all the data. Once we have the data we'll do some analysis, and visualise the domain hierarchy as a dendrogram.
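To give a rough idea of what the harvest involves, here is a minimal sketch. It is not the notebook's own code. It assumes the CDX query asks for just the `urlkey` field with `matchType=domain` and `collapse=urlkey`, and that each `urlkey` comes back in SURT form (for example `au,gov,nla,trove)/about`). The helper names `build_cdx_query`, `urlkey_to_domain` and `count_subdomains` are made up for illustration.

```python
from collections import Counter
from urllib.parse import urlencode

# Public CDX endpoint of the Internet Archive's Wayback Machine.
CDX_ENDPOINT = "http://web.archive.org/cdx/search/cdx"


def build_cdx_query(domain, limit=1000):
    """Build a CDX query URL for a domain and all of its subdomains.

    The parameter choices here are assumptions for this sketch:
    only the urlkey field is requested, and repeated urlkeys are collapsed.
    """
    params = {
        "url": domain,
        "matchType": "domain",  # match the domain and its subdomains
        "fl": "urlkey",         # return only the SURT-formatted key
        "collapse": "urlkey",   # drop adjacent duplicate keys
        "limit": limit,
    }
    return f"{CDX_ENDPOINT}?{urlencode(params)}"


def urlkey_to_domain(urlkey):
    """Convert a SURT urlkey such as 'au,gov,nla,trove)/about' to 'trove.nla.gov.au'."""
    surt_host = urlkey.split(")")[0]      # keep the host part before ')'
    surt_host = surt_host.split(":")[0]   # drop a port number if present
    return ".".join(reversed(surt_host.split(",")))


def count_subdomains(urlkeys):
    """Count how many unique urlkeys were captured for each domain."""
    return Counter(urlkey_to_domain(key) for key in urlkeys)


# Hand-written sample rows standing in for harvested data.
sample_keys = [
    "au,gov,nla,trove)/about",
    "au,gov,nla,trove)/help",
    "au,gov,nla)/",
    "au,gov,naa:8080)/index.html",
]
counts = count_subdomains(sample_keys)
query_url = build_cdx_query("gov.au")
```

A real harvest would request `query_url` over HTTP and page through the results, which is the part that takes hours for a domain the size of gov.au. Splitting each domain on `.` then gives the levels of the hierarchy that a dendrogram can display.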

Cite as

Sherratt, Tim; Jackson, Andrew & Bickford, Jake. (2023). GLAM-Workbench/web-archives (version v1.2.0). Zenodo.

Section sponsor

The Web Archives section of the GLAM Workbench is sponsored by the British Library.