Harvest of unique URLs from the gov.au domain
Harvested in April 2022
This dataset contains (mostly unique) URLs from the gov.au domain, harvested using the Internet Archive's (IA) CDX API. It is saved as newline-delimited JSON, with one JSON object per line. Each JSON object contains the following fields:
| Field | Description |
|---|---|
| urlkey | The domain of the URL in SURT (Sort-friendly URI Reordering Transform) format |
| timestamp | Time and date when the URL was captured |
| original | The archived URL |
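Because the file holds one JSON object per line, it can be read as a stream rather than loaded whole. The following is a minimal sketch of that approach, not the method used to build the dataset. It assumes that `timestamp` uses the 14-digit `YYYYMMDDhhmmss` form typical of CDX output and that `urlkey` begins with a comma-separated, reversed host followed by `)`. The helper names `parse_record` and `iter_records` and the sample record are hypothetical.

```python
import json
from datetime import datetime
from io import StringIO


def parse_record(line):
    """Parse one line of the newline-delimited JSON file into a dict."""
    record = json.loads(line)
    # Assumption: timestamps are 14-digit CDX values (YYYYMMDDhhmmss).
    captured = datetime.strptime(str(record["timestamp"]), "%Y%m%d%H%M%S")
    # Assumption: the SURT host comes before ")" with its labels reversed,
    # e.g. "au,gov,nla)/" corresponds to the host "nla.gov.au".
    surt_host = record["urlkey"].split(")", 1)[0]
    surt_host = surt_host.split(":", 1)[0]  # drop a port number if present
    domain = ".".join(reversed(surt_host.split(",")))
    return {
        "domain": domain,
        "captured": captured,
        "original": record["original"],
    }


def iter_records(lines):
    """Yield parsed records one at a time, skipping blank lines."""
    for line in lines:
        line = line.strip()
        if line:
            yield parse_record(line)


# Made-up sample line standing in for the real file.
sample = StringIO(
    '{"urlkey": "au,gov,nla)/", "timestamp": "20220401123456", '
    '"original": "http://www.nla.gov.au/"}\n'
)
records = list(iter_records(sample))
print(records[0]["domain"], records[0]["captured"].isoformat())
```

To process the downloaded file, pass an open file handle to `iter_records` in place of `sample`. Iterating line by line keeps memory use low, which matters for a file of this size.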
Download from CloudStor (75.7 GB)
Cite as
Sherratt, Tim; Jackson, Andrew & Bickford, Jake. (2023). GLAM-Workbench/web-archives (version v1.2.0). Zenodo. https://doi.org/10.5281/zenodo.7898218
Section sponsor
The Web Archives section of the GLAM Workbench is sponsored by the British Library.