Saturday, November 23, 2019

The Work

The word "work" generally means something brought about by human effort, and at times implies that this effort involves some level of creativity. We talk about "works of art" referring to paintings hanging on walls. The "works" of Beethoven are a large number of musical pieces that we may have heard. The "works" of Shakespeare are plays, in printed form but also performed. In these statements the "work" encompasses the whole of the thing referred to, from the intellectual content to the final presentation.

This is not the same use of the term as is found in the Library Reference Model (LRM). If you are unfamiliar with the LRM, it is the successor to FRBR (which I am assuming you have heard of) and it includes the basic concepts of work, expression, manifestation and item that were first introduced in that previous study. "Work," as used in the LRM, is a concept designed for use in library cataloging data. It is narrower than the common use of the term illustrated in the previous paragraph and is defined thus:
Class: Work
Definition: An abstract notion of an artistic or intellectual creation.
In this definition the term only includes the idea of a non-corporeal conceptual entity, not the totality that would be implied in the phrase "the works of Shakespeare." That totality is described when the work is realized through an LRM-defined "expression" which in turn is produced in an LRM-defined "manifestation" with an LRM-defined "item" as its instance.* These four entities are generally referred to as a group with the acronym WEMI.

Because many in the library world are very familiar with the LRM definition of work, we have to use caution when using the word outside the specific LRM environment. In particular, we must not impose the LRM definition on uses of the word that do not intend that meaning. One should expect the LRM definition of work to be rarely found in any conversation that is not about the library cataloging model for which it was defined. However, it is harder to distinguish uses within the library world, where one might expect the usage to adhere to the LRM.

To show this, I want to propose a particular use case. Let's say that a very large bibliographic database has many records of bibliographic description. The use case is that it would be easier for users to navigate that large database if they could get search results that cluster works rather than long lists of similar or nearly identical bibliographic items. Logically the cluster looks like this:

In data design, it will have a form something like this:
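As a minimal sketch of that kind of design, assume each bibliographic record carries a cluster identifier assigned by some upstream matching process. The names here (`BibRecord`, `work_cluster_id`, `cluster_results`) and the sample records are hypothetical, invented only to illustrate the shape of the data; they do not come from any particular catalog system.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class BibRecord:
    """A full bibliographic description (hypothetical, heavily simplified)."""
    record_id: str
    title: str
    publisher: str
    year: int
    work_cluster_id: str  # assumed to be assigned by an upstream matching process


def cluster_results(records):
    """Group search results by work cluster, keeping first-seen cluster order."""
    clusters = defaultdict(list)
    for rec in records:
        clusters[rec.work_cluster_id].append(rec)
    return dict(clusters)


# Made-up sample data, for illustration only.
results = [
    BibRecord("r1", "Hamlet", "Publisher A", 1998, "wc-001"),
    BibRecord("r2", "Macbeth", "Publisher B", 2005, "wc-002"),
    BibRecord("r3", "Hamlet, Prince of Denmark", "Publisher C", 2012, "wc-001"),
]

clustered = cluster_results(results)
for cluster_id, members in clustered.items():
    print(cluster_id, [m.record_id for m in members])
```

Note that the cluster points at whole bibliographic records, not at anything resembling a separate expression entity. That detail matters for the argument that follows.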

This is a great idea, and it does appear to have a similarity to the LRM definition of work: it is gathering those bibliographic entries that are judged to represent the same intellectual content. However, there are reasons why the LRM-defined work could not be used in this instance.

The first is that there is only one WEMI relationship for work, and that is from LRM work to LRM expression. Clearly the bibliographic records in this large library catalog are not LRM expressions; they are full bibliographic descriptions including, potentially, all of the entities defined in the LRM.

To this you might say: but there is expression data in the bibliographic record, so we can think of this work as linking to the expression data in that record. That leads us to the second reason: the entities of WEMI are defined as being disjoint. That means that no single "thing" can be more than one of those entities; nothing can be simultaneously a work and an expression, or any other combination of WEMI entities. The only link available in the model is from work to expression. So unless we can somehow convince ourselves that the bibliographic record ONLY represents the expression (which it clearly does not, since it has data elements from at least three of the LRM entities), any such link will violate the rule of disjointness.
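The disjointness argument can be made concrete with a small sketch. It is a simplification, not a formal rendering of the LRM: the mapping of record fields to entities below is an illustrative assumption, and the function names are hypothetical. The point is only that a record whose elements describe several entities cannot be treated as an instance of just one of them.

```python
# Hypothetical mapping from record fields to the LRM entity they describe.
# The field names and assignments are illustrative assumptions, not taken
# from the LRM text or from any cataloging standard.
FIELD_ENTITY = {
    "preferred_title": "work",
    "language": "expression",
    "publisher": "manifestation",
    "publication_date": "manifestation",
    "barcode": "item",
}


def entities_described(record):
    """Return the set of WEMI entities that a record's fields describe."""
    return {FIELD_ENTITY[field] for field in record if field in FIELD_ENTITY}


def can_be_typed_as(record, entity):
    """True only if every mapped field in the record describes `entity`.

    This models disjointness in a simplified way: a thing may be typed as
    one entity only when it describes nothing belonging to another entity.
    """
    return entities_described(record) == {entity}


# Made-up sample record mixing data from several entities.
full_record = {
    "preferred_title": "Hamlet",
    "language": "eng",
    "publisher": "Publisher A",
    "publication_date": "1998",
}

print(sorted(entities_described(full_record)))
print(can_be_typed_as(full_record, "expression"))
print(can_be_typed_as({"language": "eng"}, "expression"))
```

Under these assumptions the full record describes a work, an expression, and a manifestation all at once, so it fails the check for being an expression, while a fragment holding only expression data passes.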

Therefore, the work in our library system can have much in common with the conceptual definition of the LRM work, but it is not the same work entity as is defined in that model.

This brings me back to my earlier blog post with a proposal for a generalized definition of WEMI-like entities for created works. The WEMI concepts are useful in practice, but the LRM has some constraints that prevent some desirable uses of those entities. Providing unconstrained entities would expand the utility of the WEMI concepts both within the library community, as evidenced by the use case here, and in the non-library communities that I highlight in that previous blog post and in a slide presentation.

To be clear, "unconstrained" refers not only to removing the disjointness between entities, but also to allowing the creation of links between the WEMI entities and non-WEMI entities, something that is not anticipated in the LRM. The work cluster of bibliographic records would need a general relationship. Perhaps, as in the case of VIAF, the records could be linked through a shared cluster identifier, with an entity type identifying the cluster as representing an unconstrained work.

* The other terms are defined in the LRM as:

Class: Expression
Definition: A realization of a single work usually in a physical form.

Class: Manifestation
Definition: The physical embodiment of one or more expressions.

Class: Item
Definition: An exemplar of a single manifestation.