Access MARC data describing an item in the SLV catalogue
If you have an item's Alma identifier, you can retrieve structured metadata describing the item from the SLV catalogue in a couple of ways. One approach is to download a text representation of the item's MARC record and extract data from it. This notebook provides some examples of how you can do this.
You can find an item's Alma identifier by looking for 'Record ID' in the 'Details' section of the catalogue entry.
import re
import requests
To get the text representation of an item's MARC record, request a URL of the form:
https://find.slv.vic.gov.au/primaws/rest/pub/sourceRecord?docId=alma[ALMA ID]&vid=61SLV_INST:SLV
inserting the Alma ID where indicated.
def get_marc_record(alma_id):
    """
    Gets a text representation of an item's MARC record.
    """
    response = requests.get(
        f"https://find.slv.vic.gov.au/primaws/rest/pub/sourceRecord?docId=alma{alma_id}&vid=61SLV_INST:SLV"
    )
    return response.text
marc = get_marc_record("9921188273607636")
print(marc)
leader	01506cem a2200373 a 4500
001	9921188273607636
005	20240528080520.0
007	aj aanzn
008	101005s1968 vra a s 0 eng d
034	1#$aa $b15840 $dE1425000 $eE1425000 $fS0380000 $gS0380000
035	##$a(AuCNLKIN)000027964179
035	##$a(OCoLC)221962153
035	##$a2118827
035	##$a(Voyager)2118827-slvdb-Voyager
035	##$aIE7027444
040	##$aVSL $beng $cVSL $dVSL $dVSL $dVSL $dVSL $dVSL $dVSL $dVSL $dVSL
042	##$aanuc
043	##$au-at-vi
110	1#$aVictoria. $bDepartment of Crown Lands and Survey.
245	10$aToorak, County of Hampden $h[cartographic material] / $cdrawn and reproduced at the Department of Lands and Survey, Melbourne.
255	##$aScale [ca. 1: 15 840] $c(E 142°50'/S 38°00').
260	##$aMelbourne : $bDept. of Lands and Survey, $c1968.
300	##$a1 map ; $con sheet 76 x 102 cm.
500	##$aCadastral map showing parish boundaries and land ownership.
540	##$aNo copyright restrictions apply.
542	##$lThis work is out of copyright
650	#0$aReal property $zVictoria $zToorak (Parish) $vMaps.
651	#0$aToorak (Vic. : Parish) $vMaps.
830	#0$aParish maps of Victoria.
950	##$aMaps $bStillImage $cimage/tiff $d1 $f1968 $o1 map ; on sheet 76 x 102 cm. $qToorak, County of Hampden
956	##$a10381/139039 $bONE $c1415415 $eIE7027444 $f9921188273607636 $gDigitised $hslvdb $iAVAILABLE $jSIP3535 $kdq005511
984	##$aVSL $cheld
997	##$7CRSU
999	##$9OC
As you can see above, each line of the MARC record includes a tag (eg 245) and a series of values, separated from the tag by a tab character. The values are made up of a series of subfields whose labels begin with a $ sign (eg $a). For example, to find the title of the item you'd look in tag 245 for subfield a. In lines that have subfields, the first two characters of the values are indicators that provide additional information about the field.
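To make that line structure concrete, here is a minimal sketch that splits a single line into its tag, indicators, and subfields. The function name `parse_marc_line` and the shape of the returned dict are assumptions made for illustration; they are not part of the catalogue's API. It also assumes that a $ sign never appears inside a subfield value.

```python
def parse_marc_line(line):
    """Split one line of the text record into tag, indicators, and subfields."""
    # The tag is everything before the first tab character
    tag, _, values = line.partition("\t")
    if "$" not in values:
        # Leader and control fields have no indicators or subfields
        return {"tag": tag, "indicators": None, "subfields": [], "value": values.strip()}
    # The first two characters are the indicators, the rest are subfields
    indicators, body = values[:2], values[2:]
    # Each subfield starts with a one-character label after the $ sign
    subfields = [(f"${part[0]}", part[1:].strip()) for part in body.split("$") if part]
    return {"tag": tag, "indicators": indicators, "subfields": subfields, "value": None}


sample = "245\t10$aToorak, County of Hampden $h[cartographic material] / $cdrawn and reproduced at the Department of Lands and Survey, Melbourne."
field = parse_marc_line(sample)
print(field["tag"], field["indicators"])  # 245 10
print(field["subfields"][0])  # ('$a', 'Toorak, County of Hampden')
```

Keeping the subfields as a list of pairs, rather than a dict, means their order and any repeated labels are preserved.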
There are specialised MARC tools available for parsing and manipulating records, but they might be a bit complex for your needs. You can find tag/subfield values just by using regular expressions to extract them from the MARC text.
def get_marc_value(marc, tag, subfield=None):
    """
    Gets the value of a tag/subfield from a text version of an item's MARC record using regular expressions.
    If the tag or subfield is repeated, only the first match is returned.
    """
    try:
        # Get the first line that starts with the specified tag
        line = re.search(rf"^{tag}\t.+", marc, re.M).group(0)
        if subfield:
            # If a subfield has been requested, get the subfield value
            value = re.search(rf"\${subfield.lstrip('$')}([^\$]+)", line).group(1)
        else:
            # If no subfield has been requested, return everything after the tab
            value = line.split("\t", 1)[1]
    except AttributeError:
        # re.search returns None if the tag or subfield isn't found
        return None
    return value.strip(" .,")
get_marc_value(marc, "245", "$a")
'Toorak, County of Hampden'
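Some tags and subfields appear more than once in a record (for example, tag 035 and the $z subfields in tag 650 of the record above). As a rough sketch of one way to collect every occurrence, the hypothetical helper below uses `re.findall` in place of `re.search`. The name `get_marc_values` and the short sample text are assumptions for illustration only.

```python
import re


def get_marc_values(marc, tag, subfield):
    """Return every value of a subfield across all lines with the given tag."""
    code = re.escape(subfield.lstrip("$"))
    values = []
    # Find every line that starts with the tag, not just the first
    for line in re.findall(rf"^{re.escape(tag)}\t.+", marc, re.M):
        # Find every occurrence of the subfield within the line
        for value in re.findall(rf"\${code}([^$]+)", line):
            values.append(value.strip(" .,"))
    return values


# A few lines copied from the record above, with tabs after the tags
sample_marc = "\n".join(
    [
        "035\t##$a(OCoLC)221962153",
        "035\t##$a2118827",
        "650\t#0$aReal property $zVictoria $zToorak (Parish) $vMaps.",
    ]
)
print(get_marc_values(sample_marc, "035", "$a"))  # ['(OCoLC)221962153', '2118827']
print(get_marc_values(sample_marc, "650", "$z"))  # ['Victoria', 'Toorak (Parish)']
```

An empty list is returned when the tag or subfield isn't found.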
An alternative approach is to convert the whole MARC record into a Python dictionary by splitting the lines on the tab characters and dollar signs. You can then access the tags and subfields from the dict. Note that where a tag or subfield is repeated in the record, only the last value is kept.
def convert_marc_to_dict(marc):
    """
    Converts the MARC text record into a dict, organised by tag and $ subfields.
    Indicators are ignored.
    """
    marc_dict = {}
    # Loop through each line by splitting the text on newline characters
    for line in marc.split("\n"):
        if line:
            # Split tag from values on tab characters
            tag, values = line.split("\t")
            # If there are no subfields (no $ signs in the values) add the tag and value to the dict
            if "$" not in values:
                marc_dict[tag] = values.strip()
            # If there are subfields we'll process each one and add to the dict
            else:
                marc_dict[tag] = {}
                # Strip the two indicator characters from the front of the values and split on $ sign
                # Loop through all the subfields
                for subfield in values[2:].split("$"):
                    if subfield:
                        # Get the subfield label from the front of the string
                        # Add the label and value to the dict
                        marc_dict[tag][f"${subfield[0]}"] = subfield[1:].strip()
    return marc_dict
marc_dict = convert_marc_to_dict(marc)
marc_dict
{'leader': '01506cem a2200373 a 4500',
'001': '9921188273607636',
'005': '20240528080520.0',
'007': 'aj aanzn',
'008': '101005s1968 vra a s 0 eng d',
'034': {'$a': 'a',
'$b': '15840',
'$d': 'E1425000',
'$e': 'E1425000',
'$f': 'S0380000',
'$g': 'S0380000'},
'035': {'$a': 'IE7027444'},
'040': {'$a': 'VSL', '$b': 'eng', '$c': 'VSL', '$d': 'VSL'},
'042': {'$a': 'anuc'},
'043': {'$a': 'u-at-vi'},
'110': {'$a': 'Victoria.', '$b': 'Department of Crown Lands and Survey.'},
'245': {'$a': 'Toorak, County of Hampden',
'$h': '[cartographic material] /',
'$c': 'drawn and reproduced at the Department of Lands and Survey, Melbourne.'},
'255': {'$a': 'Scale [ca. 1: 15 840]', '$c': "(E 142°50'/S 38°00')."},
'260': {'$a': 'Melbourne :',
'$b': 'Dept. of Lands and Survey,',
'$c': '1968.'},
'300': {'$a': '1 map ;', '$c': 'on sheet 76 x 102 cm.'},
'500': {'$a': 'Cadastral map showing parish boundaries and land ownership.'},
'540': {'$a': 'No copyright restrictions apply.'},
'542': {'$l': 'This work is out of copyright'},
'650': {'$a': 'Real property', '$z': 'Toorak (Parish)', '$v': 'Maps.'},
'651': {'$a': 'Toorak (Vic. : Parish)', '$v': 'Maps.'},
'830': {'$a': 'Parish maps of Victoria.'},
'950': {'$a': 'Maps',
'$b': 'StillImage',
'$c': 'image/tiff',
'$d': '1',
'$f': '1968',
'$o': '1 map ; on sheet 76 x 102 cm.',
'$q': 'Toorak, County of Hampden'},
'956': {'$a': '10381/139039',
'$b': 'ONE',
'$c': '1415415',
'$e': 'IE7027444',
'$f': '9921188273607636',
'$g': 'Digitised',
'$h': 'slvdb',
'$i': 'AVAILABLE',
'$j': 'SIP3535',
'$k': 'dq005511'},
'984': {'$a': 'VSL', '$c': 'held'},
'997': {'$7': 'CRSU'},
'999': {'$9': 'OC'}}
marc_dict["245"]["$a"]
'Toorak, County of Hampden'
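In the dict above, tag 035 holds a single value even though the record contains several 035 lines, because each repeated tag or subfield replaces the one before it. If you need every value, one possible variation is to store a list of fields under each tag. This is only a sketch: the name `convert_marc_to_fields` and the structure it returns are assumptions for illustration, and the sample text is a few lines copied from the record above.

```python
from collections import defaultdict


def convert_marc_to_fields(marc):
    """Map each tag to a list of fields, keeping repeated tags and subfields."""
    fields = defaultdict(list)
    for line in marc.splitlines():
        # Skip blank lines or anything without a tab-separated tag
        if "\t" not in line:
            continue
        tag, values = line.split("\t", 1)
        if "$" not in values:
            # Leader and control fields are stored as plain strings
            fields[tag].append(values.strip())
            continue
        # Keep subfields as (label, value) pairs so repeats aren't lost
        subfields = [
            (f"${part[0]}", part[1:].strip()) for part in values[2:].split("$") if part
        ]
        fields[tag].append({"indicators": values[:2], "subfields": subfields})
    return dict(fields)


sample_marc = "\n".join(
    [
        "001\t9921188273607636",
        "035\t##$a(OCoLC)221962153",
        "035\t##$aIE7027444",
        "650\t#0$aReal property $zVictoria $zToorak (Parish) $vMaps.",
    ]
)
fields = convert_marc_to_fields(sample_marc)
print(len(fields["035"]))  # 2
print([value for label, value in fields["650"][0]["subfields"] if label == "$z"])
# ['Victoria', 'Toorak (Parish)']
```

The trade-off is that lookups are more verbose than `marc_dict["245"]["$a"]`, so the simpler dict may still be the better fit when you only need fields that occur once.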
Created by Tim Sherratt for the GLAM Workbench. If you find this useful, you can sponsor me on GitHub.