Since my last post undoubtedly left readers with the idea that I have my head in the clouds about the future of bibliographic metadata, I wish to present here some of the reasons why I think this can work. Many of you were probably left thinking: Yea, right. Get together a committee of a gazillion different folks and decide on a new record format that works for everyone. That, of course, would not be possible. But that's not the task at hand. The task at hand is actually about the opposite of that. Here are a few parameters.
#1 What we need to develop is NOT a record format
The task ahead of us is to define an open set of data elements. Open, in this case, means usable and re-usable in a variety of metadata contexts. What wrapper (read: record format) you put around them does not change their meaning. Your chicken soup can be in a can, in a box, or in a bowl, but it is still chicken soup. That's the model we need for metadata. Content, not carrier. Meaning, not record format. Usable in many different situations.
#2 Everyone doesn't have to agree to use the exact same data elements
We only need to know the meaning of the data elements and what relationships exist between different data elements. For example, we need to know that my author and your composer are both persons and are both creators of the resource being described. That's enough for either of us to use the other's data under some circumstances. It isn't hard to find overlapping bits of meaning between different types of bibliographic metadata.
Not all metadata elements will overlap between communities. The cartographic community will have some elements that the music library community will never use, and vice versa. That's fine. That's even good. Each specialist community can expand its metadata to the level of detail that it needs in its area. If the music library finds a need to catalog a map, they can "borrow" what they need from the cartographic folks.
Where data elements are equivalent or are functionally similar, data definitions should include this information. Although defined differently, you can see that there are similarities among these data elements.
pbcoreTitle = a name given to the media item you are catalogingAll of these are types of titles, and have a similar role in the descriptive cataloging of their respective communities: each names the target resource. These elements therefore can be considered members of a set that could be defined as: data elements that name the target resource. Having this relationship defined makes it possible to use this data in different contexts and even to bring these titles together into a unified display. This is no different to the way we create web pages with content from different sources like Flickr, YouTube, and a favorite music artist's web site, like the image here.
RDA:titleProper = A word, character, or group of words and/or characters that names a resource or a work contained in it.
MARC:245 $a = title of a work
dublincore:title = A name given to the resource
In this "My Favorites" case, the titles come from the Internet Movie Database, a library catalog display, the Billboard music site, and Facebook. It doesn't matter where they came from or what the data element was called at that site, what matters is that we know which part is the "name-of-the-thing" that we want to display here.
#3 You don't have to create all new data elements for your resources if appropriate ones already exist
When data elements are defined within the confines of a record, each community has to create an entire data element schema of their own, even if they would be coding some elements that are also used by other communities. Yet, there is no reason for different communities to each define a data element for an element like the ISBN because one will do. When data elements themselves are fully defined apart from any particular record format you can mix and match, borrowing from others as needed. This not only saves some time in the creation of metadata schemas but it also means that those data elements are 100% compatible across the metadata instances that use them.
In addition, if there are elements that you need only rarely for less common materials in your environment, it may be more economical to borrow data elements created by specialist communities when they are needed, saving your community the effort of defining additional elements under your metadata name space.
To do all of this, we need to agree on a few basic rules.
1) We need to define our data elements in a machine-readable and machine-actionable way, preferably using a widely accepted standard.
This requires a data format for data elements that contains the minimum needed to make use of a defined data element. Generally, this minimum information is:
- a name (for human readers)
- an identifier (for machines)
- a human-readable definition
- both human and machine-readable definitions of relationships to other elements (e.g. "equivalent to" "narrower than" "opposite of")
2) We must have the willingness and the right to make our decisions open and available online so others can re-use our metadata elements and/or create relationships to them.
3) We also must have a willingness to hold discussions about areas of mutual interest with other metadata creators and with metadata users. That includes the people we think of today as our "users": writers, scholars, researchers, and social network participants. Open communication is the key. Each of use can teach, and each of us can learn from others. We can cooperate on the building of metadata without getting in each others' way. I'm optimistic about this.