Thursday, July 16, 2009

Yee: Questions 3-5

[This entire thread has been combined on the futurelib wiki. Please add your comments and ideas there.]

Question 3
Is RDF capable of dealing with works that are identified using their creators?
Yee goes on to say:
We need to treat author as both an entity in its own right and as a property of a work.... Is RDF capable of supporting the indexing necessary to allow a user to search using any variant of the author's name and any variant of the title of a work... etc.
I'm not entirely sure of the point of these questions, but they appear to me to be mainly about applications and system design, not RDF, which is the same advice she has gotten from others. Let me say that as I understand RDF, it is particularly suited to allowing entities like author to be used in a variety of relationships. So a person entity can be the author of one book, the illustrator of another, and the translator of yet another. But there's something here about "identifying a work using the creator" and I think that is entirely a question of how we decide to identify works, and is unrelated to the capabilities of RDF.
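To make the point about one entity serving in several relationships concrete, here is a minimal sketch in plain Python that treats RDF statements as (subject, predicate, object) tuples. The `ex:` identifiers and property names are made up for illustration; they are not drawn from any real vocabulary.

```python
# Hypothetical identifiers and property names, for illustration only.
PERSON = "ex:person/1"

triples = [
    ("ex:work/A", "ex:author", PERSON),
    ("ex:work/B", "ex:illustrator", PERSON),
    ("ex:work/C", "ex:translator", PERSON),
    # The same person is also an entity in its own right, with its own properties.
    (PERSON, "ex:name", "Example, Pat"),
]


def roles_of(person, statements):
    """Return sorted (role, work) pairs where the person is the object of a statement."""
    return sorted((p, s) for s, p, o in statements if o == person)


def properties_of(entity, statements):
    """Return sorted (property, value) pairs where the entity is the subject."""
    return sorted((p, o) for s, p, o in statements if s == entity)
```

Calling `roles_of(PERSON, triples)` lists the three roles, while `properties_of(PERSON, triples)` returns the person's own name statement: the person is both a value of a work's property and a describable thing.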

The identification of all of the FRBR Group 1 entities raises many interesting questions. The fact is that we do not have a real identifier for any of them, with the possible exception of the barcodes that libraries place on items. But Works, Expressions and Manifestations are lacking in true identifiers. As Jonathan Rochkind has pointed out, we use identifiers like OCLC numbers and LCCNs as pseudo-identifiers for manifestations because most of the time they work pretty well. Many systems rely heavily on ISBNs, which work reasonably well for modern published books and have the advantage of being printed on the books themselves, thus making a connection between the physical object and the metadata. Other than that, though, we're not very well set as far as identifiers go.

Yee talks about the use of the main author + title (or uniform title) as a work identifier, but even those are not a true identifier for the Work, at least not in the sense of a URI. As long as we rely on display forms we won't have an identifier that we can share with anyone whose author or title display may vary from ours (and even within the AACR2 community, there are differences in choices about names and a great gap in the actual use of uniform titles). It should be possible to create an authority-type record for name/title pairs that would include the variants from different practices, and assign a single identifier for it. But we have to stop thinking that we can create identifiers out of display forms -- that's not going to allow us to share our data outside of a tightly controlled cataloging tradition.
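One way to picture such an authority-type record is as a table that maps every known name/title variant to a single opaque identifier. The sketch below assumes a made-up identifier (`ex:work/0001`), an illustrative set of variants, and a deliberately crude normalization step; a real system would need far more careful matching rules.

```python
import unicodedata


def normalize(text):
    """Fold case, strip accents and punctuation so display variants compare equal."""
    decomposed = unicodedata.normalize("NFKD", text)
    no_marks = "".join(c for c in decomposed if not unicodedata.combining(c))
    cleaned = "".join(c if c.isalnum() else " " for c in no_marks.casefold())
    return " ".join(cleaned.split())


# Hypothetical authority record: one opaque identifier, many display forms.
AUTHORITY = {
    "ex:work/0001": [
        ("Twain, Mark", "Adventures of Huckleberry Finn"),
        ("Twain, Mark, 1835-1910", "The adventures of Huckleberry Finn"),
        ("Clemens, Samuel Langhorne", "Huckleberry Finn"),
    ],
}

# Lookup index from normalized (name, title) pairs to the shared identifier.
INDEX = {
    (normalize(name), normalize(title)): work
    for work, variants in AUTHORITY.items()
    for name, title in variants
}


def work_id(name, title):
    """Return the identifier for a name/title variant, or None if it is unknown."""
    return INDEX.get((normalize(name), normalize(title)))
```

The identifier itself carries no display text, so two communities with different name and title practices can both point at `ex:work/0001` while showing their own preferred forms.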

What I also read here is a frustration that our current systems do not produce a linear display that is analogous to the display in the card catalog (and is one of the goals of our cataloging practices). I'll pose my own question here, which is: can we create a system design that imitates the linear card catalog and at the same time provides us with the Catalog/Web 2.0 features that some members of our community desire? If not, how do we resolve these apparently conflicting requirements? (BTW, Beth Jefferson of Bibliocommons gave a talk at ALA in which she said that in their usability research, users invariably disliked -- or even hated -- the linear alphabetic display that so many librarians find necessary. I believe that statistics show that the browse function in current catalogs is seldom used. I suspect that most use is by library staff.)

Question 4
Do all possible inverse relationships need to be expressed explicitly, or can they be inferred?
If they are truly reciprocal, they can be inferred. Doing so requires rules (the reciprocal of "parent of" is "child of"; the reciprocal of "is author of" is "has author"). How applications handle this internally is a separate matter: they may create the inverse relationships in local storage, or they may traverse the relationships in either direction using rules on the fly. But I see no need to create the inverse relationships in one's metadata standard.
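Both application strategies can be sketched in a few lines of plain Python. The property names and `ex:` identifiers below are hypothetical, and the rule table stands in for whatever mechanism a real application would use to declare inverses.

```python
# Hypothetical rule table: each property is paired with its reciprocal.
INVERSE_OF = {"parentOf": "childOf", "authorOf": "hasAuthor"}
INVERSE_OF.update({v: k for k, v in list(INVERSE_OF.items())})


def with_inverses(statements):
    """Strategy 1: materialize the inverse statements in local storage."""
    expanded = set(statements)
    for s, p, o in statements:
        if p in INVERSE_OF:
            expanded.add((o, INVERSE_OF[p], s))
    return expanded


def objects(statements, subject, predicate):
    """Strategy 2: answer a query on the fly, reading stored statements in either direction."""
    direct = {o for s, p, o in statements if s == subject and p == predicate}
    inverse = INVERSE_OF.get(predicate)
    reversed_hits = {s for s, p, o in statements if o == subject and p == inverse}
    return direct | reversed_hits


# Only one direction is recorded in the shared metadata.
stored = {("ex:person/1", "authorOf", "ex:work/A")}
```

With only the `authorOf` statement stored, `objects(stored, "ex:work/A", "hasAuthor")` still finds the person, so the shared data never needs to carry both directions.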

Question 5
Can RDF solve the problems we are having now because of the lack of transitivity or inheritance in the data models that underlie current ILSes, or will RDF merely perpetuate these problems?
I answer this (first post, my #3) when I talk about the inconsistencies in authority data that make it very hard to make the appropriate inferences about relationships between data elements. It is possible that we could use RDF as the basis of our data and create these same ambiguities, but I hope that we will use the opportunity of moving to a new set of rules and a new data format to correctly restructure our data so that it does have the functionality we want.
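As a rough illustration of the kind of inheritance the question has in mind, the sketch below lets a manifestation pick up a property recorded only on its work by walking explicit links upward. The entity identifiers, the link table, and the property names are all assumptions made for the example; it works only because each link is unambiguous, which is exactly what inconsistent authority data fails to guarantee.

```python
# Hypothetical links from each entity to the entity it is derived from.
PARENT = {
    "ex:manif/1": "ex:expr/1",  # manifestation -> expression
    "ex:expr/1": "ex:work/A",   # expression -> work
}

# Properties recorded directly on each entity (illustrative values).
OWN = {
    "ex:work/A": {"author": "ex:person/1"},
    "ex:manif/1": {"publisher": "ex:org/9"},
}


def inherited(entity, prop):
    """Return the nearest value of prop on the entity or on any of its ancestors."""
    seen = set()
    while entity is not None and entity not in seen:
        seen.add(entity)  # guard against accidental cycles in the links
        if prop in OWN.get(entity, {}):
            return OWN[entity][prop]
        entity = PARENT.get(entity)
    return None
```

Here `inherited("ex:manif/1", "author")` resolves to the work's author even though nothing about authorship is stored on the manifestation.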


Eric said...

I've commented on question 4 at:

Brief summary: Inverse properties are neither required nor are they good practice, and Martha Yee is a programmer.

Karen Coyle said...

Eric's post linked