Harvest of unique urls from the gov.au domain
Harvested in April 2022
This is a dataset of (mostly unique) urls in the gov.au domain, harvested using the Internet Archive (IA) CDX API. The dataset is saved as newline-delimited JSON, with one JSON object per line. Each JSON object contains the following fields:
| Field | Description |
|---|---|
| urlkey | The domain of the url in SURT (Sort-friendly URI Reordering Transform) format |
| timestamp | Time and date when the url was captured |
| original | The archived url |
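Because the file is large, it is probably easiest to process it one line at a time rather than loading it all at once. The following is a minimal sketch of that approach, not part of the original harvest code. It assumes that `timestamp` uses the 14-digit `YYYYMMDDhhmmss` form common in CDX output, and that the host part of `urlkey` is a comma-separated, reversed hostname that may be followed by `)` and a path. The helper names and the sample records are made up for illustration.

```python
import io
import json
from datetime import datetime


def read_captures(lines):
    """Yield one dict per non-empty line of newline-delimited JSON."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


def parse_timestamp(timestamp):
    """Parse a capture timestamp, assumed to be 14 digits: YYYYMMDDhhmmss."""
    return datetime.strptime(timestamp, "%Y%m%d%H%M%S")


def surt_to_host(urlkey):
    """Convert the host part of a SURT key back to a hostname.

    Simplified: takes everything before the first ')' (if any),
    then reverses the comma-separated labels. Ports are not handled.
    """
    host_part = urlkey.split(")", 1)[0]
    return ".".join(reversed(host_part.split(",")))


# Made-up sample records standing in for lines of the real file.
sample = io.StringIO(
    '{"urlkey": "au,gov,example)/", "timestamp": "20220401123000", '
    '"original": "http://example.gov.au/"}\n'
    '{"urlkey": "au,gov,example,data)/about", "timestamp": "20210115080910", '
    '"original": "https://data.example.gov.au/about"}\n'
)

records = list(read_captures(sample))
hosts = [surt_to_host(record["urlkey"]) for record in records]
captured = [parse_timestamp(record["timestamp"]) for record in records]
```

To use this with the downloaded dataset, pass an open file handle to `read_captures` in place of `sample`, and iterate over the generator directly instead of building a list, so that only one record is held in memory at a time.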
Download from CloudStor (75.7 GB)
Related resources
Additional documentation
Getting help
Cite as
Sherratt, Tim; Jackson, Andrew & Bickford, Jake. (2023). GLAM-Workbench/web-archives (version v1.2.0). Zenodo. https://doi.org/10.5281/zenodo.7898218
Section sponsor
The Web Archives section of the GLAM Workbench is sponsored by the British Library.