
Harvest Pandora subjects and collections

This notebook helps you create a dataset of archived URLs using Pandora's subject and collection groupings.

The Australian Web Archive makes billions of archived web pages searchable through Trove. But how would you go about constructing a search that would find websites relating to election campaigns? Fortunately you don't have to, as Pandora provides a collection of archived web resources organised by subject and collection. By using harvests of Pandora's subject hierarchy and a complete list of archived titles, this notebook makes it easy for you to create custom datasets relating to a specific topic or event.
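As a rough illustration of the idea, the sketch below picks a subject, gathers every subject nested beneath it, and keeps only the archived titles assigned to one of those subjects. The sample records, the field names (`id`, `parent`, `subjects`, `url`), and the helper functions are all hypothetical and are not taken from the notebook or the actual harvests, which may be structured quite differently.

```python
# Hypothetical sample data standing in for the harvested subject hierarchy
# and list of archived titles. Field names are assumptions for illustration.
subjects = [
    {"id": "s1", "name": "Politics", "parent": None},
    {"id": "s2", "name": "Election campaigns", "parent": "s1"},
    {"id": "s3", "name": "Sport", "parent": None},
]
titles = [
    {"name": "Example campaign site", "url": "http://example.org/campaign", "subjects": ["s2"]},
    {"name": "Example party site", "url": "http://example.org/party", "subjects": ["s1"]},
    {"name": "Example cricket site", "url": "http://example.org/cricket", "subjects": ["s3"]},
]


def descendant_ids(subjects, root_id):
    """Return root_id plus the ids of every subject nested beneath it."""
    found = {root_id}
    changed = True
    # Keep sweeping the list until no new child subjects turn up.
    while changed:
        changed = False
        for subject in subjects:
            if subject["parent"] in found and subject["id"] not in found:
                found.add(subject["id"])
                changed = True
    return found


def titles_for_subject(titles, subjects, root_id):
    """Return the titles assigned to root_id or any of its sub-subjects."""
    wanted = descendant_ids(subjects, root_id)
    return [title for title in titles if wanted.intersection(title["subjects"])]


# Build a small dataset of archived URLs for "Politics" and its sub-subjects.
dataset = titles_for_subject(titles, subjects, "s1")
urls = [title["url"] for title in dataset]
```

Walking the hierarchy first means that titles filed only under a narrower subject, such as the hypothetical "Election campaigns" entry here, are still included when you select the broader parent.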



Using this notebook

To run this notebook using the ARDC Binder service you'll need to log in using an account from an Australian university or research organisation. If you don't have an account, try MyBinder instead.

Run live on ARDC Binder

The MyBinder service doesn't require any authentication, but it can be slow to start and will sometimes fail when busy. If you have a login at an Australian university, you'll probably get better results with ARDC Binder.

Run live on MyBinder

Binder is great for experimentation and quick tasks, but for some projects you might need a dedicated, persistent environment in which to work. There's information on other options in the "Run these notebooks" section.

Additional documentation

Getting help

Cite as

Sherratt, Tim. (2024). GLAM-Workbench/trove-web-archives (version v1.0.1). Zenodo.