Harvest the full collection of Pandora titles

This notebook harvests a complete collection of archived web page titles from Pandora, the National Library of Australia's selective web archive.

Pandora has been selecting websites and online resources for preservation since 1996. It has assembled a collection of more than 80,000 titles, organised into subjects and collections. The archived websites are now part of the Australian Web Archive (AWA), which combines the selected titles with broader domain harvests and is searchable through Trove. However, Pandora's curated collections remain a useful entry point for researchers looking for websites relating to particular topics or events.

Using this notebook

To run this notebook using the ARDC Binder service, you'll need to log in using an account from an Australian university or research organisation. If you don't have an account, try MyBinder instead.

Run live on ARDC Binder

The MyBinder service doesn't require any authentication, but it can be slow to start and will sometimes fail when busy. If you have a login at an Australian university, you'll probably get better results with ARDC Binder.

Run live on MyBinder

Binder is great for experimentation and quick tasks, but for some projects you might need a dedicated, persistent environment in which to work. There's information on other options in the "Run these notebooks" section.

Additional documentation

Getting help

Cite as

Sherratt, Tim. (2024). GLAM-Workbench/trove-web-archives (version v1.0.1). Zenodo. https://doi.org/10.5281/zenodo.11124402