Laying the Groundwork (March 17, 2021)
Sharon M. Leon
March 17, 2021
Since beginning our work on July 1, 2020, the On These Grounds team has been diligently working to design and test a linked data ontology that describes the events experienced by enslaved people who labored at colleges and universities. This work is so interdisciplinary that the effort has taken full advantage of the skills and experience of our group of historians, archivists, librarians, and digital scholars. Now as we approach the end of March, we are on the cusp of the next stage of our work: releasing the alpha event model and recruiting a group of external testing partners.
Our work developing the alpha model has been deeply grounded in our engagement with collections materials. Among our first tasks as a group was to survey the records available to us for description. Each member of the teams at Georgetown, the University of Virginia, and Michigan State University selected a sample document to describe from the available collections. Outlining the key elements of event-related data that were available, we shared our findings and expectations about thorough and responsible description. Those opening conversations laid the groundwork for the creation of a core set of information that we would want to represent regardless of the type of event at hand. The discussions also immediately plunged us into some of the thorny questions about representing uncertainty and inferred information within the seemingly concrete and finite world of structured data.
Next, the teams developed a set of event types that reflected the histories represented within their collections. By examining the unique nature of the holdings at Georgetown and UVA, along with those made available through digitized collections from other universities, the team was able to develop a preliminary set of event types that could account for the variety of recorded experiences in these records. Regardless of the eventual shape of the descriptive model, these event types make up the heart of the work by signaling the wide variety of experiences enslaved people lived through during their forced labor for colleges and universities. While the team initially focused on 14 general kinds of events, we quickly realized that we would need to further specify subtypes within those categories. The result was a fairly robust controlled vocabulary for event types that would fuel the subsequent stages of the model development.
Event types in hand, the team together developed a set of properties that were common across all the types and then added several properties to capture additional kinds of data that were important and shared across at least a few event types. Several principles guided this approach.
- First, we wanted to focus on the most common elements of data, making a model that could accommodate lots of variation in record type.
- Second, we were focused on constructing a model that was usable and not so detailed that it would overwhelm the person doing the description.
- Third, we wanted to create a model that did not foreclose the possibility of adding to our event type vocabulary as we became aware of new possibilities.
The result was a simple descriptive model with 35 properties, five of which were required. These properties covered the basic details of the event (12 properties), the important people involved, including the primary enslaved person (four properties), and a number of additional specific pieces of information that would vary depending on the event type.
As a contrast to this simple model, the team turned to the controlled vocabulary of events to develop a more complex model. This model expanded the general event types into a set of event subclasses, each with a core set of properties and a set of additional properties specific to capturing the contours of the particular kind of event at hand. This approach would make much more fine-grained description of events possible. Of course, the trade-off of employing this complex model would come in the extended processing time and workflow complexity demanded by working with 14 subclasses.
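The difference between the two approaches can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the OTG model itself: every class name, property name, and value below is hypothetical, and which properties are treated as required here is an assumption made for the example.

```python
from dataclasses import dataclass, field
from typing import Optional

# All names below are hypothetical illustrations, not the OTG vocabulary.


@dataclass
class SimpleEvent:
    """Simple approach: one event class; the type is a vocabulary term."""
    event_type: str                  # term from a controlled vocabulary
    primary_person: str              # identifier for the primary enslaved person
    source_document: str             # the record being described
    date: Optional[str] = None       # optional, since records may lack a firm date
    extra: dict[str, str] = field(default_factory=dict)  # type-dependent details


@dataclass
class Event:
    """Complex approach: core properties shared by every event subclass."""
    primary_person: str
    source_document: str
    date: Optional[str] = None


@dataclass
class LaborEvent(Event):
    """Hypothetical subclass carrying properties specific to one event type."""
    task: Optional[str] = None
    location: Optional[str] = None


# The same (invented) record described both ways.
simple = SimpleEvent(
    event_type="labor",
    primary_person="person-001",
    source_document="doc-017",
    extra={"task": "carpentry"},
)
typed = LaborEvent(
    primary_person="person-001",
    source_document="doc-017",
    task="carpentry",
)
```

In the single-class version, type-specific details sit in a loosely structured `extra` mapping, which keeps the form short for the person doing the description. In the subclass version, each detail has a named slot, which supports finer-grained description at the cost of maintaining one class per event type.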
In preparing for the comparative testing of a simple descriptive model with a single event class, and a complex event model with a number of event subclasses, the teams reviewed their collections to select a base of documents that corresponded to as many of the event types as they could find within their holdings. Using these test collections, the team has recorded qualitative feedback about each session of working with the models to determine the adequacy of properties for capturing information, the relative processing time, and the ease of workflow. Now, as we approach the end of March, our comparative testing is coming to a close. We are in the process of analyzing some quantitative data to measure the usage of the particular properties associated with the subclasses in the complex model. All of this testing data will inform our choices as we select, refine, and document the descriptive model.
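One simple way to measure property usage of this kind is to count how often each property is actually filled in across a set of test descriptions. The sketch below assumes each description is a plain dictionary of property names to values; the function name `property_usage` and the sample records are hypothetical and stand in for whatever format the testing data actually takes.

```python
from collections import Counter


def property_usage(records):
    """Return, for each property, the share of records where it is filled in."""
    counts = Counter()
    for record in records:
        for name, value in record.items():
            # Treat None, empty strings, and empty lists as "not filled in".
            if value not in (None, "", []):
                counts[name] += 1
    total = len(records)
    if total == 0:
        return {}
    return {name: counts[name] / total for name in counts}


# Invented sample descriptions, for illustration only.
sample = [
    {"task": "carpentry", "location": None},
    {"task": "cooking", "location": "kitchen"},
]
usage = property_usage(sample)
```

Properties that are never filled in do not appear in the result at all, so comparing the returned keys against the full property list for a subclass shows which properties went unused.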
In the coming weeks, we will release the OTG alpha model and documentation in conjunction with a call for testing partners. Beginning in July, the selected testing partners will spend a year working with their own collections using the OTG Event Model alpha and generating feedback. That feedback will fuel the model refinement and guide the creation of a set of materials designed to support archivists, cataloguers, and historians as they work to document the experience of the enslaved people who labored in the service of their colleges and universities.
We urge you to look for the release and invite you to join us as we embark upon the next stage of our work!