Thursday, September 21, 2006

Description and Access?

The revision of the library cataloging rules that is underway is being called "Resource Description and Access" or RDA. Although it is undoubtedly an unpopular viewpoint, I would like to suggest that description and access are two very different functions and that they should not be covered by a single set of rules, nor should they necessarily be performed by a single metadata record.

The pairing of description and access is functionality based on card catalog technology. A main purpose of the 19th and 20th century card catalog was access. Indeed, great discussion took place in the late 19th century about the provision of a public access point for library users: access through authors, titles, and subjects. The descriptive element, the main body of the card, was essentially a bibliographic surrogate helping users make their decision on whether to go over to the shelf to look for the book. Before easy reproduction of cards, that is, before LoC began selling card sets early in the 20th century, access cards did not carry the full description of the book. Instead, all catalog cards except the main entry card had a brief entry that would allow the user to find the main entry card which had the full description.

The combination of description and access is a habit that has carried over from the card catalog and has left a legacy that to many of us is so natural that we have trouble seeing it for what it is. For example, it is because of this combination that we create artificial "headings" that cause us to display author names in the famed "last name first" order. The heading is designed for access in a system where the means of finding items is through a linear alphabetical order, which even in library systems is no longer the predominant finding method. These headings set library systems apart from popular information systems such as Amazon, Barnes & Noble, and Google Books. As a matter of fact, you can find examples of library catalogs that attempt a popular display by displaying the title and statement of responsibility as the main display, hiding the now odd-looking headings. What these headings say to anyone who is tech-savvy is that libraries are hindered by an obsolete technology. Libraries still create these contorted headings when markup of data can make display and ordering of data flexible and user-friendly.
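To make that last point concrete, here is a minimal Python sketch of what I mean by markup doing the work of the heading. The `PersonName` class, its `given` and `family` fields, and the sample authors are all assumptions invented for this illustration; they are not taken from RDA or from any existing cataloging format, and real names (single names, compound surnames, corporate bodies) would need more care than this.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PersonName:
    """A name stored as marked-up parts rather than as one inverted string.

    Hypothetical structure for illustration only.
    """
    given: str
    family: str

    def display(self) -> str:
        # Natural order, as a reader would expect to see it on a web page.
        return f"{self.given} {self.family}"

    def inverted(self) -> str:
        # The traditional heading form, generated only when it is wanted.
        return f"{self.family}, {self.given}"

    def sort_key(self) -> tuple:
        # Alphabetical filing by family name, then given name, ignoring case.
        return (self.family.casefold(), self.given.casefold())


def alphabetical(names):
    """Return the names in filing order without changing how they display."""
    return sorted(names, key=PersonName.sort_key)


# Sample data, made up for this example.
authors = [
    PersonName("Virginia", "Woolf"),
    PersonName("Jane", "Austen"),
    PersonName("Charles", "Dickens"),
]

for name in alphabetical(authors):
    print(name.display())
```

The list comes out in family-name order, yet each name is shown in natural order, and `inverted()` is still there for any display that really does call for "last name first". The stored data never has to commit to one form.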

Not only does our use of arcane headings set libraries apart from more popular information resources, our concepts of "description" and "access" are not serving our users. The description provided by libraries might serve to identify the work bibliographically (something that matters to libraries for collection development purposes but is not of great interest to library users); however, it doesn't describe the work to users in a way that can help them make a selection. We need at least reviews, thumbnails of images, sample chapters, and even local commentary ("Required reading for Professor Smith's class in European History"). And as for access, we know that the library-assigned subject headings are woefully inadequate discovery tools.

RDA claims that its purpose in the area of description "should enable the user to: a) identify the resource described...." Yet today we are in dire need of machine-to-machine identification, which RDA does not address. Increasingly our catalogs are interacting with other sources of discovery, such as web sites, search engines, and courseware. "Identification" that must be interpreted by a human being is going to be less and less useful as we go forward into an increasingly digital and networked information environment.

We are also greatly in need of an ability to share our data with systems that are not based on library cataloging. Each rule that varies from what would be common practice moves libraries further from the information world that our users occupy in their daily life. It is somewhat ironic that many pages of rules instruct catalogers on the choice of the "title proper," which is then marred by the addition of the statement of responsibility, a bit of library arcana that no one else considers to be part of the title of the work. And who else would create a title heading "I [heart symbol] New York"?

All this to say that the next generation library catalog cannot succeed if it is to be based on a set of rules that still carry artifacts from the days of the physical card catalog. It's time to get over the concepts of description and access that were developed in the 19th century. Let's move on, for goodness' sake.

10 comments:

Mia said...

Right, but not all of the description conventions can be ascribed to card catalog arcana. Last name, First Name being one. The addition of the comma symbol lightens the cognitive overload for me, the human. But this convention is not synonymous with the outdated "main entry". The sequence of the letters in our alphabet (a through z) is an over-arching constraint. Each alphabetic letter added to the length of a string needs to distinguish itself from neighboring strings; nearly-identical alpha strings need to sort in that sequence (if there is some other alpha sequence, I'd like to know about it!). Humans orient themselves within that sequence (otherwise, one cannot make sense of the symbols). I don't have a problem with Describing a string of author letter-sequences as "last name, first name". I'm not sure how I will readily distinguish LastName or Firstname containers when looking at the raw elements. Well, all this just to say that humans need a relatively recognizable (and teachable) method to recognize, capture, and transcribe in as unambiguous a way as possible, while they are in the process of describing the document.

Karen Coyle said...

See, I think the last name, first name convention is ALL about the card catalog. It's there because of the need to file cards in order and to use that order to retrieve catalog entries. I would bet that there are few times when alphabetical order is the useful order for a search result. It has no relationship to keyword searching, and is only needed when retrieval is linear.

Mia said...

What about clusters which have a preponderance of name entries that are retrieved (other than bibliographic authors)? How would one cluster to see similar/related entities in the neighboring space? Fictitious example: list of patent holders; washing machine inventors; etc. One doesn't need to cluster (agreed) if there are only a handful of things to retrieve. Perhaps I'm missing the obvious.

Karen Coyle said...

Mia, you don't need to create a heading in last name, first name order to sort things alphabetically. Markup can provide everything you need for sorting by last name. It's the fact that rather than markup libraries use an inverted heading that looks so old-fashioned. With markup, headings can be in natural order for most user display but in inverted order - if you want - when alphabetical order is needed. But note that we keep the headings in inverted order even when they don't need to be. It's like we are so stuck on alphabetic order that we can't imagine anything else. Certain big players on the web have entirely dispensed with alphabetical order, and I find that much of the time I don't miss it, although it would be good to have it under some circumstances. But in libraries we are having a hard time embracing any other kind of order, so that when users do a subject search, they often get back records in alphabetical order by main entry -- which is probably totally irrelevant to their search.

Yep, I could go on and on about this one! ;-)

allegro said...

What else is the "comma space" sequence but markup? The simplest possible option for this case, way superior to anything you may invent in XML, and the quickest for data entry. It does allow for easy reversal if what you desire is really the name displayed in "natural" order. But do you really? Displayed as a record heading, the reversed name makes for faster recognition than the natural name. Many authors are well-known by their last name only, so it makes much sense to display it in the first position to make it the first thing to meet the eye. Some of the card conventions were extremely economical and well thought-out, and we do well to think thrice before discarding them.

Anonymous said...

Amen.

Anonymous said...

Good post, thanks. I think the idea of "machine readable" access is the most important concept currently facing us in cataloging, and its importance isn't always understood.

Our catalog(s) still do need to provide for discovery though, right? You almost seem to be implying they don't?

Jonathan

Karen Coyle said...

Allegro, you are absolutely right that the comma is a form of markup. Oddly, though, we don't seem to use it as such. We treat the string as a simple string, always displaying it as it was input. I dispute that the last name first form "makes sense" to our users, but we should at least do some studies to find out. However, the fact is that the popular book-related services (Amazon, Google, etc.) do not use this form so our users are becoming accustomed to -- and seem to have no problem with -- a different display of author names. I really think that we're having trouble moving into the post-alphabetic world of keyword-based retrieval. And if we have to have an order, I'm not sure that alphabetical order is the most useful, as we've seen with Google's ranking.

Karen Coyle said...

Jonathan, I think we need a better understanding of how users discover (note recent study in Europe, discussed in Lorcan Dempsey's blog). Then we should design the catalog for the type of discovery that it can best provide. We have to see the catalog in its larger context. Personally, I would like to see the library catalog handle author & title discovery (known item) and a classified subject browse -- a kind of virtual shelf. That is something that is not provided elsewhere. The real problem with the library catalog, though, is that it's just a fraction of the actual information resources available to people. So it is artificially restricted, and that limits its usefulness as a discovery mechanism -- unless you define "discovery" as discovering what the library owns. So it seems that actual discovery should be very broad (WorldCat?) and then there's another function that we could call "what can I find on the shelf here?"

Stephen said...

I agree that description and access need to be kept separate, provided reliable links are maintained to get quickly from one to the other and back. I agree that many kinds of order are useful, and that library catalogs have been too committed to a non-evaluative model. We need to be able to rank catalog records along various dimensions of user interest. But sometimes the user WANTS a more neutral order; and when that's the case, alphabetical order is simple and recognizable for most users. Even keyword indexes, when you pry the system open and get a look at them, are in alpha order, because searching an ordered list is much more efficient (once the list is compiled) than searching a random one. That applies to both people and machines. If mark-up permits a system to shuffle one set of elements into "last_name, first_name" for a list context and "first_name last_name" for a record display context, fine; but we don't want to slip into the mistake of thinking that this will permit us to get by with one name form for description and access. It won't, and we want to keep those separate.