Let’s Get Digital: Collections Management and Data Science in a Post-COVID-19 World
By Emily Pearce Seigerman, Museum Assistant, Numismatics, Yale University Art Gallery
Museological degree programs geared toward would-be collections managers teach students the proper mechanisms for care of collections. “Care of collections” spans everything from database data entry, to preventative conservation methods, to cultural heritage law. At the core of collections management is the skill to create and understand object numbering—catalog numbers and accession numbers being the king and queen of museum integers. As collections quantities increase and technology continually develops, collections managers are seeing an ever-increasing backlog of objects requiring digitization or digital cataloging. The COVID-19 pandemic has strikingly reinforced the need for good, digitally accessible object records—in other words good, clean, accessible data—which can only be produced through proper data entry (via cataloging) and data manipulation. To meet the growing call for “good data” and to prepare for future catastrophe, collections managers must cultivate the skills of a data scientist.
While small museums and historic houses are often stewarded by cultural heritage’s unsung heroes, who wear the hats of collections manager, registrar, curator, custodian, public programmer, and educator, larger, more segmented cultural repositories have seen an ever-widening divide between the roles of collections manager and database administrator. As this divide has widened, collections management education has neglected lessons in computer literacy. Documentation, data asset, or information technology departments tend to exist outside of registration and collections management departments and are typically responsible for the back-end processing of collections management systems (also called collections documentation or information systems). This physical separation from data-driven departments has largely discouraged collections managers (or their management) from investing in computer literacy, despite the heavy use of databases that modern collections care and access require.
Collections Management Systems (CMS) are pivotal tools for capturing and tracking object and institutional data.[1] These relational databases must be navigable by staff and, if the CMS is the basis for online collection searching, by researchers and the public. While smaller museums tend to rely on spreadsheets or standalone desktop programs like MS Access to capture catalog information, larger repositories have come to rely on dedicated relational database systems. Relational databases store entered data in tables that are linked to other tables by shared values, such as an identifying number that appears in both. These links, or joins, let a search query combine tables into a new result table. Good result tables require good data, and good data comes from good data entry. Collections staff are typically responsible for the initial cataloging of incoming objects and thus are the first hands to produce object-specific metadata in a museum’s CMS. For this reason, collections managers should have a basic knowledge of data science, including an intermediate familiarity with spreadsheet programs like Microsoft Excel, to ensure the production of good, clean data. Spreadsheets allow data stored in rows and columns to be classified and analyzed, and then summarized visually in tables and graphs. Most importantly, spreadsheets allow for easy, quick, and large-scale edits to and manipulation of data.
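The idea of tables linked by a shared value can be sketched in a few lines of Python using the built-in sqlite3 module. This is only an illustration: the table names, column names, and sample records below are hypothetical and do not reflect the schema of any particular CMS.

```python
import sqlite3

# Hypothetical schema and sample rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE makers (maker_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE objects (
        accession_number TEXT PRIMARY KEY,
        title TEXT,
        maker_id INTEGER REFERENCES makers(maker_id)
    );
""")
conn.executemany("INSERT INTO makers VALUES (?, ?)",
                 [(1, "Mint A"), (2, "Mint B")])
conn.executemany("INSERT INTO objects VALUES (?, ?, ?)", [
    ("2001.87.1", "Silver coin", 1),
    ("2001.87.2", "Bronze medal", 2),
    ("2004.6.1", "Gold coin", 1),
])


def objects_with_makers(connection):
    """Join the two tables on their shared maker_id column."""
    query = """
        SELECT o.accession_number, o.title, m.name
        FROM objects AS o
        JOIN makers AS m ON o.maker_id = m.maker_id
        ORDER BY o.accession_number
    """
    return connection.execute(query).fetchall()


rows = objects_with_makers(conn)
for row in rows:
    print(row)
```

Each maker is entered once and referenced by number, so correcting a maker's name in one place corrects it for every linked object record.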
A basic understanding of spreadsheet formulae, pivot tables, and data-entry requirements gives collections managers and their teams a way to catalog large numbers of objects in batches, faster and more accurately than by creating catalog records one at a time. It also makes batch cleaning of old institutional data possible. The ability to quickly clean catalog records saves hours of labor for collections and data management teams. Familiarity with relational databases also breaks down the barrier of often-misunderstood, job-specific language and reduces communication errors between collections and data management departments. As such, basic data and computer science courses should be incorporated into museological education and studies in collections care.
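As a rough sketch of what batch cleaning can look like outside a spreadsheet, the Python below tidies a handful of legacy records in one pass. It assumes, purely for illustration, that an institution writes accession numbers as parts separated by periods (for example 2001.87.1); the field names and sample records are hypothetical, and any real cleanup rule should follow the institution's own numbering policy.

```python
import re


def normalize_accession_number(raw):
    """Join the parts of a number with periods, whatever separator was typed.

    Assumes a hypothetical period-separated convention such as 2001.87.1.
    """
    parts = [p for p in re.split(r"[\s./_-]+", raw.strip()) if p]
    return ".".join(parts)


def clean_records(records):
    """Return new records with tidy accession numbers and titles."""
    cleaned = []
    for record in records:
        cleaned.append({
            "accession_number": normalize_accession_number(
                record["accession_number"]),
            # Collapse repeated spaces and trim the ends of the title.
            "title": " ".join(record["title"].split()),
        })
    return cleaned


# Hypothetical legacy entries with inconsistent separators and spacing.
legacy = [
    {"accession_number": " 2001-87-1 ", "title": "Silver  coin "},
    {"accession_number": "2001 / 87 / 2", "title": "Bronze medal"},
    {"accession_number": "2004.6.1", "title": " Gold coin"},
]
cleaned = clean_records(legacy)
for record in cleaned:
    print(record)
```

The function returns new records rather than overwriting the originals, so the legacy data remains available for comparison before anything is loaded back into a CMS.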
Like all aspects of cultural heritage stewardship, the daily practice of collections care has shifted drastically over the past few months. Cultural heritage institutions have become reliant on online platforms to engage with their communities. Because of institutional closures and social distancing regulations, the focus of collections staff can no longer center on daily preventative care for physical objects. As museums continue to depend on a digital presence to engage researchers and the public, collections management must allocate more time to improving collections data. While staff remain quarantined at home, collections professionals should consider professional development opportunities in data science (such as courses in Excel, in languages like Python or SQL, or in Business Intelligence). Fluency in spreadsheet manipulation will allow a host of collections professionals to edit, clean, and update old object data from home (should they have access to it). The socially distanced museum is an opportunity for collections staff to hyper-focus on cleaning old data, thus bringing catalog information up to date and making it more easily discoverable for digital visitors.
With current social distancing and work-from-home precautions in place throughout New England, institutions with already robust digital collections records are facing less disconnect than those still trying to create online collections access points. Collections management staff should work with their information technology and/or data asset management teams to channel the frustration caused by physical absence from our beloved collections into cleaning digital object catalog information. Doing so will not only better prepare museums for any future catastrophes; it will also better connect our institutions with the communities, researchers, and visitors our collections represent.
In the Museum Computer Network’s March 2020 blog post, “The 8 Essential Things Museums are Providing Right Now”, Lori Byrd-McDevitt pinpoints some successes of our colleagues across the U.S. experimenting with digital access via both traditional and new access points. At the heart of each experience—and all cultural institutions—is connecting with collections and the comfort/lessons/humor/reverence/pleasure they impart to visitors particularly at a time of great distress. As Museum Studies programs continue to produce the next generation of collections and data management professionals, let this moment be studied as an example of successful use of cultural heritage data. By investing in computer literacy now, collections managers can make cultural heritage accessible even (or especially) in difficult times.
[1] Some frequently used CMSs are The Museum System (TMS), KE EMu, PastPerfect, ConservationSpace, and Argus.