Trove API v3
Trove API access restrictions
With the cancellation of my Trove API keys by the National Library of Australia, I've made the difficult decision to stop work on Trove and archive all related code repositories.
The Trove sections of the GLAM Workbench will remain online, but they won't be updated. Everything here is openly-licensed, so feel free to take what’s useful and develop it further yourself.
Given the fact that the NLA is willing to change the API terms of use to restrict access without any consultation, provides no transparency around acceptable use of full text content, and is willing to cancel API keys without warning, I can no longer recommend Trove as a reliable source for digital research.
Version 2 of the Trove API has now been decommissioned, so you have to use v3. The Trove documentation describes v3 as 'finalised', but there are still a number of outstanding bugs.
Timeline
- March 2023: Trove API v3 beta (release 1)
- May 2023: Trove API v3 beta (release 2)
- June 2023: Trove API v3 beta (release 3)
- silence for many months...
- September 2024: Trove API v2 decommissioned without warning
Trove API Console
The Trove API Console has been updated to add examples from v3 of the API. It provides many sample queries that you can run and modify without the need for an API key.
Migration tips
All the breaking changes I know about are listed below, but here's a short summary of things you should look out for when migrating your code from version 2 to version 3 of the Trove API.
Update API version
This is an obvious one – change the /v2/ in your API urls to /v3/!
Moving from zones to categories
You can't just change all the zone parameters to category as there's no one-to-one correspondence between the old zones and new categories. In some cases you can add the l-artType facet to try to match the old behaviour.
| Zone | Category |
|---|---|
| newspaper | newspaper – add l-artType=newspapers (note the plural) to keep gazettes out |
| gazette | newspaper – add l-artType=gazette |
| book | book – mostly the same but seems to have gained some formats and lost theses |
| article | this zone has been split – NLA digitised articles are in the magazine category, other articles are in research and book, periodical records are in book |
| picture | image – this category now includes the contents of the former map zone, use l-artType=Pictures and photos to filter the maps out |
| music | music |
| map | image – add l-artType=Maps |
| collection | diary |
| list | list |
| people | people |
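Here's a rough sketch of how the table above could be turned into a lookup when updating query parameters. The names ZONE_TO_CATEGORY and migrate_zone are made up for this example, the facet used is the l-artType facet mentioned above, and the article zone is left out because it has no single equivalent. Treat it as a starting point and check the results against your own queries.

```python
# Hypothetical lookup built from the zone/category table above.
# Each v2 zone maps to a v3 category plus any extra facet parameters
# that approximate the old behaviour. The split 'article' zone is omitted.
ZONE_TO_CATEGORY = {
    "newspaper": ("newspaper", {"l-artType": "newspapers"}),
    "gazette": ("newspaper", {"l-artType": "gazette"}),
    "book": ("book", {}),
    "picture": ("image", {"l-artType": "Pictures and photos"}),
    "music": ("music", {}),
    "map": ("image", {"l-artType": "Maps"}),  # maps now live in the image category
    "collection": ("diary", {}),
    "list": ("list", {}),
    "people": ("people", {}),
}


def migrate_zone(params):
    """Return a copy of v2 query params with zone swapped for category."""
    new_params = dict(params)
    zone = new_params.pop("zone", None)
    if zone is None:
        return new_params
    if zone not in ZONE_TO_CATEGORY:
        raise ValueError(f"No direct v3 equivalent for zone {zone!r}")
    category, extra_facets = ZONE_TO_CATEGORY[zone]
    new_params["category"] = category
    new_params.update(extra_facets)
    return new_params


print(migrate_zone({"q": "wragge", "zone": "gazette"}))
```

Raising an error for unknown zones is a deliberate choice here: a query that silently searches the wrong category is harder to spot than one that fails.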
Be aware of changes to record structure
This one is easy to overlook and likely to have you scratching your head at unexpected errors. The actual structure of most JSON and XML records has changed. In most cases the top level container has been removed, so any code that drills down through the record hierarchy to find a value will break. See below for the specific changes for each endpoint.
There are also some changes in the way lists of resources are provided. For example, facets are now always returned as an array of objects no matter how many you request. Previously if you requested a single facet, you got a single object.
If something's not working after you migrate – check these things first! They've already tripped me up a couple of times.
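One defensive way to cope with the facet change is to normalise whatever comes back into a list before looping over it. This is only a sketch: as_list and get_facets are hypothetical helpers, the facets/facet path is the one described in the /result section of this page, and the name key in the sample data is purely illustrative.

```python
def as_list(value):
    """Wrap a single object in a list; pass lists through unchanged."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def get_facets(category):
    """Return the facets of one category as a list, whichever shape was returned."""
    return as_list(category.get("facets", {}).get("facet"))


# Illustrative data only: a single object (v2 style) and an array (v3 style).
single_object = {"facets": {"facet": {"name": "format"}}}
array_of_objects = {"facets": {"facet": [{"name": "format"}]}}

for category in (single_object, array_of_objects):
    for facet in get_facets(category):
        print(facet["name"])
```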
Other odd and annoying changes
Some changes just seem to be designed to break your code. In v2 the text version of a newspaper title was found in record["title"]["value"], but in v3 this has changed to record["title"]["title"]. The word and illtype facets have changed to wordCount and illustrationType. Why?
See below for various odd changes to List records.
Some url values have changed
The value of trovePageUrl has changed. Previously it was of the form https://trove.nla.gov.au/ndp/del/page/[PAGE ID], but in v3 it is https://nla.gov.au/nla.news-page[PAGE ID]. If, like me, you've been using regular expressions to extract the numeric identifier you'll have to update your code.
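If you need to handle both url forms, for example because you have saved v2 data alongside fresh v3 responses, a single pattern can cover them. This is a minimal sketch based only on the two url forms described above. The function name get_page_id and the sample identifier are invented for illustration.

```python
import re

# Matches the numeric page identifier in either form of trovePageUrl:
#   https://trove.nla.gov.au/ndp/del/page/[PAGE ID]   (v2)
#   https://nla.gov.au/nla.news-page[PAGE ID]         (v3)
PAGE_ID_PATTERN = re.compile(r"(?:/ndp/del/page/|nla\.news-page)(\d+)")


def get_page_id(trove_page_url):
    """Return the page identifier as a string, or None if the url doesn't match."""
    match = PAGE_ID_PATTERN.search(trove_page_url)
    return match.group(1) if match else None


# The identifier 12345 is made up for this example.
print(get_page_id("https://trove.nla.gov.au/ndp/del/page/12345"))
print(get_page_id("https://nla.gov.au/nla.news-page12345"))
```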
Don't lose your keys!
API keys now expire after 12 months, even if they're in use! You'll get an email asking if you want to renew your key – don't ignore this email or your application will stop working!
Summary of breaking changes
Info
This is a work in progress as I work my way through the API changes. Additions are very welcome.
This summary only includes breaking changes – the things you'll need to change in your code now that v2 of the API has been switched off. For a full list of changes see the introduction to the v3 API, and the v3 technical guide.
The release notes don't mention the changes to the JSON responses. In general, the changes flatten the JSON responses by removing top-level wrappers that served no function. This means the way you access data within responses has changed. For example, assuming that you've saved the JSON response as a variable called data, to access the title of a newspaper article retrieved via the /newspaper/[article id] endpoint in version 2 you would use:
data["article"]["heading"]
While in version 3 you'd use:
data["heading"]
There are similar changes across all endpoints.
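If your code has to read JSON saved from v2 as well as new v3 responses, one option is to strip the old wrapper before doing anything else. The sketch below assumes the v2 wrapper was the only key at the top level of the response, which matches the article example above but should be checked for other endpoints. The helper name unwrap and the sample heading are hypothetical.

```python
def unwrap(data, wrapper):
    """Return the record inside a v2-style wrapper, or the data itself if unwrapped."""
    # Assumption: in v2 the wrapper was the only top-level key.
    if isinstance(data, dict) and list(data.keys()) == [wrapper]:
        return data[wrapper]
    return data


# Illustrative responses only, using a made-up heading.
v2_response = {"article": {"heading": "An example heading"}}
v3_response = {"heading": "An example heading"}

for response in (v2_response, v3_response):
    article = unwrap(response, "article")
    print(article["heading"])
```

This won't work unchanged for the /list/[list id] endpoint, where the v2 wrapper held an array rather than a single record.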
All endpoints
- API keys will now automatically expire 12 months after activation! (There will be an option to renew.)
- change v2 to v3 in the endpoint url!
- callback parameter for JSONP removed, use CORS instead
- default encoding is once again XML, the same as API v2 (early releases of the v3 beta defaulted to JSON, which I was told was a bug that release 3 would fix)
/result endpoint
- zone parameter has been replaced with category, possible values are: all, newspaper, magazine, image, research, book, diary, music, people, list; note that in most cases there is no one-to-one correspondence between zones and categories as the boundaries of what's included have changed, for example, the 'newspaper' category includes the contents of both the 'newspaper' and 'gazette' zones
- the handling of 'blank' searches has changed – in version 2 you could use a " " space as the value for q to search for everything. This didn't work in the first v3 beta (though the unicode null value, %00, did seem to work), but release 2 of the v3 beta made the q (query) parameter optional. This means blank searches are supported without any need for workarounds.
- the behaviour of the include parameter has changed – previously you could supply multiple values to include as a comma-separated string, now to specify multiple values you have to repeat the include parameter (eg. instead of &include=tags,comments you need &include=tags&include=comments).
- some facet names have changed: word is now wordCount, and illtype is now illustrationType
- JSON response format changes:
    - top-level response key removed
    - zone key changed to category
    - the meaning of the name parameter inside each category value has changed – the old name value can be accessed via the code key, while name now returns the category's display name.
    - If you requested a single facet in version 2, it was returned as a JSON object. In version 3, facets are always returned as an array of objects no matter how many you request. So if you include the facet=format parameter in your request, the facet terms within each category will be in category["facets"]["facet"][0] rather than category["facets"]["facet"]. Compare the v2 result with the v3 result.
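The repeated include parameter is easy to get wrong if you build query strings by hand. As a small sketch, Python's standard urlencode function will repeat a key for each item in a list when doseq=True is set. The build_query helper and the search term are just examples; the parameter names come from this page.

```python
from urllib.parse import urlencode


def build_query(params):
    """Encode params, repeating the key for any list values (eg. include)."""
    return urlencode(params, doseq=True)


query = build_query(
    {
        "q": "wragge",  # example search term
        "category": "newspaper",
        "include": ["tags", "comments"],  # becomes include=tags&include=comments
        "encoding": "json",
    }
)
print(query)
# q=wragge&category=newspaper&include=tags&include=comments&encoding=json
```

If you use the requests library, passing a list as a parameter value produces the same repeated-key form.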
Newspapers and gazette records (both /result and /newspaper/[article id])
- urls of newspaper and gazette articles and pages returned by the API have changed to use NLA identifiers. In most cases this shouldn't break anything as they still go to the same place. However, if you've been extracting the numeric page identifier from the trovePageUrl then you might need to adjust your code. Previously the trovePageUrl value was of the form https://trove.nla.gov.au/ndp/del/page/[PAGE ID], but in v3 it is https://nla.gov.au/nla.news-page[PAGE ID].
- the structure of the title element has changed – in v2 the text version of a newspaper title was found in record["title"]["value"], but in v3 this has changed to record["title"]["title"]
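A small accessor can hide the change to the title element from the rest of your code. This is a sketch only: get_newspaper_title is a hypothetical helper, and the sample records contain made-up values and include just the keys described above.

```python
def get_newspaper_title(record):
    """Return the text of a newspaper title from a v2 or v3 article record."""
    title = record.get("title") or {}
    if "title" in title:  # v3: record["title"]["title"]
        return title["title"]
    return title.get("value")  # v2: record["title"]["value"]


# Illustrative records only.
v2_record = {"title": {"id": "1", "value": "An Example Newspaper"}}
v3_record = {"title": {"id": "1", "title": "An Example Newspaper"}}

print(get_newspaper_title(v2_record))
print(get_newspaper_title(v3_record))
```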
List records (both /result and /list/[list id])
- in v2 the dates a list was created and updated were available at record["created"] and record["lastupdated"], but in v3 both dates are under record["date"], so record["date"]["created"] and record["date"]["lastupdated"].
- the values of tagCount and commentCount have moved – in v2 they were available at record["tagCount"] and record["commentCount"], but in v3 they have moved down a level to record["tagCount"]["value"] and record["commentCount"]["value"]
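The List record changes can be handled in a similar way, by checking which shape you've been given before reading the values. The helper get_list_details is hypothetical and the dates and counts in the sample records are invented. Only the key paths come from the description above.

```python
def get_list_details(record):
    """Pull dates and counts from a v2 or v3 List record into one flat dict."""
    date = record.get("date")
    if isinstance(date, dict):  # v3: dates are nested under record["date"]
        created, updated = date.get("created"), date.get("lastupdated")
    else:  # v2: dates sit at the top level of the record
        created, updated = record.get("created"), record.get("lastupdated")

    def count(key):
        # v3 wraps counts as {"value": n}; v2 returned the bare value.
        value = record.get(key)
        return value.get("value") if isinstance(value, dict) else value

    return {
        "created": created,
        "lastupdated": updated,
        "tagCount": count("tagCount"),
        "commentCount": count("commentCount"),
    }


# Illustrative records only.
v2_record = {"created": "2020-01-01", "lastupdated": "2020-02-01", "tagCount": 3, "commentCount": 0}
v3_record = {
    "date": {"created": "2020-01-01", "lastupdated": "2020-02-01"},
    "tagCount": {"value": 3},
    "commentCount": {"value": 0},
}
print(get_list_details(v2_record) == get_list_details(v3_record))
```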
/work/[work id] endpoint
- JSON response format changes:
    - top-level work key removed
/newspaper/[article id] endpoint
- JSON response format changes:
    - top-level article key removed
/newspaper/titles endpoint
- JSON response format changes:
    - top-level response and records keys removed
/newspaper/titles/[title id] endpoint
- JSON response format changes:
    - top-level newspaper key removed
/list/[list id] endpoint
- JSON response format changes:
    - top-level list key removed – note too that in v2 list returned an array (even though there was only one value), in v3 there is just a single value, not an array; so instead of using data["list"][0]["title"] to get the title of a list, you'll need to use data["title"].
contributor endpoint
There are major changes here that aren't mentioned in the release notes. Previously a call to this endpoint without additional parameters just returned a long list of contributors. Now, you need to supply a q parameter to filter the results.
- Release 2 of the v3 API beta made the q parameter optional. (In release 1 the q parameter was added and required; to do a blank search and get all contributors, ie the v2 behaviour, you had to include the q parameter with no value, eg: contributor?q=&encoding=json&reclevel=full)
- JSON response format changes:
    - top-level response key removed
contributor/[NUC] endpoint
- JSON response format changes:
    - top-level contributor key removed
magazine/titles/ endpoint
This endpoint was introduced in release 2 of the Trove API v3 beta. It initially returned no results, but this was fixed in release 3.