Librarian's Guide to Artstor

Metadata Enhancements

Images in Artstor are contributed to the database by institutions and archives around the world. To make these images available in a single database, Artstor enhances the metadata that contributors provide in several ways.

Metadata Enhancements
Artstor enables all collections in the Digital Library to be searched and browsed by object-type classification (e.g., painting or architecture), country/region, and earliest and latest date. This is accomplished by adding information to records rather than altering the source data. The classification terms are applied from an in-house controlled list (painting, sculpture, etc.); the country terms are from the Getty Research Institute's Thesaurus of Geographic Names (TGN); and numeric earliest and latest dates are created for each record. The Artstor Advanced Search and Browse functions depend on these uniform access points to improve the user's ability to search and browse across all the Artstor collections. These enhancements are not visible within the metadata records; they operate behind the scenes within the database to facilitate searching and browsing.
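As a rough illustration of how added access points can sit beside untouched source data, here is a minimal Python sketch. It is not Artstor's implementation: the `Record` and `Enhancements` classes, the `search` function, and the sample records (including the year range chosen for "ca. 1620") are all assumptions made up for this example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Enhancements:
    """Hypothetical uniform access points, stored beside the source record."""
    classification: str   # term from a controlled list, e.g. "painting"
    country: str          # country term, e.g. one drawn from TGN
    earliest_year: int    # numeric bounds created for the record
    latest_year: int


@dataclass
class Record:
    source: dict              # contributor metadata, left unaltered
    enhanced: Enhancements    # search-only fields, not shown to users


def search(records, classification: Optional[str] = None,
           country: Optional[str] = None,
           year_from: Optional[int] = None,
           year_to: Optional[int] = None):
    """Return records whose enhancement fields satisfy every supplied filter."""
    hits = []
    for rec in records:
        e = rec.enhanced
        if classification is not None and e.classification != classification:
            continue
        if country is not None and e.country != country:
            continue
        # Keep the record only if its date span overlaps the requested span.
        if year_from is not None and e.latest_year < year_from:
            continue
        if year_to is not None and e.earliest_year > year_to:
            continue
        hits.append(rec)
    return hits


# Made-up sample records; the free-text dates stay exactly as contributed.
records = [
    Record({"Title": "Still life", "Date": "ca. 1620"},
           Enhancements("painting", "Netherlands", 1615, 1625)),
    Record({"Title": "Town hall", "Date": "19th century"},
           Enhancements("architecture", "France", 1800, 1899)),
]

paintings = search(records, classification="painting",
                   year_from=1600, year_to=1650)
print([r.source["Title"] for r in paintings])
```

The point of the design is that free-text values such as "ca. 1620" are never rewritten; the numeric bounds exist only so that date filters can compare like with like.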

Clustering Images
With more than 2 million assets (and growing), it is increasingly difficult to find all the images of the same work. We often receive duplicates or details of the same work from an individual contributor or from multiple contributors. To help alleviate this problem, Artstor groups, or "clusters," duplicates and details that represent a single work. We place these images and details "behind" the highest-quality image of the whole work that we have available.
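The grouping step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the text does not say how Artstor decides that two images show the same work, so the `work_id` field is hypothetical, as are the `is_detail` flag and the use of pixel count as a stand-in for image quality.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Image:
    image_id: str
    work_id: str      # hypothetical key identifying the depicted work
    is_detail: bool   # True for a detail, False for a view of the whole work
    width: int
    height: int


def cluster_images(images):
    """Map each work_id to (lead image, other images clustered behind it)."""
    groups = defaultdict(list)
    for img in images:
        groups[img.work_id].append(img)

    clusters = {}
    for work_id, members in groups.items():
        # Prefer views of the whole work; fall back to details if none exist.
        whole_views = [m for m in members if not m.is_detail]
        candidates = whole_views or members
        # Assumption: pixel count approximates "highest quality".
        lead = max(candidates, key=lambda m: m.width * m.height)
        behind = [m for m in members if m is not lead]
        clusters[work_id] = (lead, behind)
    return clusters
```

Under these assumptions, a large detail never displaces a smaller image of the whole work as the lead, which matches the behavior described above.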

Associated Images
Drawing upon data from user-curated image groups, Artstor is also able to present researchers with the option of discovering "associated images." Through mathematical analysis, Artstor can determine which images have been "saved" in association with other specific images in groups that have been created by instructors. We assume that images saved repeatedly in such "associated" groups are related in ways that are useful to teachers and scholars. These collaboratively filtered groups conveniently bring together many works associated with the lead images. At times, these juxtapositions in the "long tail" can be surprising, original, and intellectually stimulating.
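The text does not specify the mathematical analysis involved. One simple way to approximate the idea is to count how often two images are saved in the same group, as in the sketch below. The function names, the `min_count` threshold, and the representation of a group as a list of image identifiers are assumptions made for this illustration.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(groups):
    """Count, for each pair of images, how many groups contain both."""
    counts = Counter()
    for group in groups:
        # De-duplicate and sort so each pair is counted once per group,
        # always in the same (a, b) order.
        for a, b in combinations(sorted(set(group)), 2):
            counts[(a, b)] += 1
    return counts


def associated_images(image_id, groups, min_count=2):
    """Return (other_image, count) pairs saved repeatedly with image_id."""
    scored = []
    for (a, b), n in cooccurrence_counts(groups).items():
        if n < min_count:
            continue  # ignore pairings that occurred too rarely
        if a == image_id:
            scored.append((b, n))
        elif b == image_id:
            scored.append((a, n))
    # Most frequently paired first; ties broken by identifier.
    return sorted(scored, key=lambda pair: (-pair[1], pair[0]))


groups = [["a", "b", "c"], ["a", "b"], ["a", "c", "d"], ["b", "d"]]
print(associated_images("a", groups))
```

Requiring a minimum count reflects the assumption stated above that only images saved together repeatedly are likely to be meaningfully related.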

Controlled Vocabularies
Because of the diversity among Artstor collections, standard vocabularies or thesauri are particularly significant for discovery. Artstor uses the Getty Research Institute's Union List of Artist Names (ULAN) to match artists' names with an authoritative creator record. Through these matches, links are established between the source creator name and the ULAN creator record, which allows, for example, a user searching for works by Gerrit van Honthorst to find images that have the name Gherardo della Notte or Gherardo Fiammingo in the source data record. We intend to extend this matching of source data to external vocabularies into other areas of information such as repository names, geographical locations, styles, and periods.
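The effect of linking source names to an authority record can be sketched as follows. This is an illustration only: the in-memory `AUTHORITY` table stands in for ULAN, the identifier `example-0001` is a placeholder rather than a real ULAN ID, and the function names and sample records are hypothetical.

```python
def normalize(name):
    """Fold case and collapse whitespace so trivial differences do not matter."""
    return " ".join(name.casefold().split())


# Stand-in for an authority file; the record ID is a placeholder, not a ULAN ID.
AUTHORITY = {
    "example-0001": {
        "preferred": "Gerrit van Honthorst",
        "variants": ["Gherardo della Notte", "Gherardo Fiammingo"],
    },
}


def build_name_index(authority):
    """Map every known name form to the ID of its authority record."""
    index = {}
    for record_id, entry in authority.items():
        for name in [entry["preferred"], *entry["variants"]]:
            index[normalize(name)] = record_id
    return index


def search_by_creator(query, records, index):
    """Find records whose source creator name links to the queried creator."""
    target = index.get(normalize(query))
    if target is None:
        return []  # the query matches no authority record
    return [r for r in records
            if index.get(normalize(r["creator"])) == target]


records = [
    {"title": "Work A", "creator": "Gherardo della Notte"},
    {"title": "Work B", "creator": "Gherardo Fiammingo"},
    {"title": "Work C", "creator": "Unknown"},
]
index = build_name_index(AUTHORITY)
matches = search_by_creator("Gerrit van Honthorst", records, index)
print([r["title"] for r in matches])
```

The source creator names are left as contributed; only the index knows that the variant forms point to the same creator.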

Metadata Policy and Standards

The Artstor Digital Library aggregates collections from a wide range of sources such as museums, libraries, photo archives, scholars, photographers, artists, and artists' estates. Since each institution and individual can have differing uses of and requirements for metadata, the cataloging and descriptive data we receive vary greatly in the use (or absence) of standardized vocabularies, in the choice of terminology, and in the metadata schema used to organize the data. Even within a single institution, the focus, and thus the cataloging, may vary. A museum conservation department's terminology and focus will differ from that used by the curatorial departments, and a curatorial department that is responsible for archaeological material will collect and store information in ways that differ significantly from, say, a modern painting department. Since the holdings of museums, artists, archives, and other contributors are most often unique, those cataloging methods and controlled vocabularies have often been created for the local environment, and not necessarily to enable sharing those collections with a broad range of educational and scholarly users. While this outlook is changing as the usefulness of sharing descriptive data and using shared standards is recognized, Artstor still receives hundreds of thousands of records from contributors whose descriptive data were created well before the potential value of sharing was recognized and before any generally accepted standards were in use; to this day, the community does not share any one standard and data continues to be heterogeneous. The variation in quality, authority, and consistency that one sees within the Artstor Digital Library is the result, then, of our bringing together contributions from a wide variety of sources.

Artstor, by necessity, considers the data from multiple perspectives. From one vantage point, the data reflects the interests, concerns, and point of view of the contributing institution. In this sense the data serves to describe the works and images in terms selected by the original source. Artstor makes some attempt to preserve, almost in an archival sense, the characteristics of the source data. Yet this attempt must often be modified to bring all data sources into enough harmony that search and discovery across the Digital Library are as effective as possible. In addition, our users expect the descriptive data to be accurate and "correct." Satisfying this view brings with it special challenges in areas where facts are scarce and/or scholarly opinions differ. As information for images, objects, architecture, and other materials changes over time, and as scholarship continues to evolve, to uncover new findings, to spark new debates, or to trigger reassessments of attributions, dates, and other descriptive information, Artstor works with our numerous contributors to update and refresh the metadata records in the Digital Library. We are committed to facilitating the ongoing work of educators and students as well as scholars and researchers (who welcome and often benefit from varying metadata records that might reflect different modes of thinking, historical trends, or phases of scientific research). For these reasons, we attempt to preserve and share as many versions of images and metadata records as are provided to us by our international contributors, while foregrounding the highest resolution image (and its associated metadata record) available. The descriptive data records one sees in Artstor are thus a compromise; the measures described below are intended to enhance discovery and access for all of our educational users—scholars, curators, educators, librarians, and students.