Tuesday, February 16, 2016

More is more

Here's something that drives me nuts:
Library catalog entry for Origin of Species

Library catalog entry for Origin of Species

These are two library catalog displays for Charles Darwin's "The origin of species". One shows a publication date of 2015, the other a date of 2003. Believe me, neither of them anywhere lets the catalog user know that these are editions of a book first published in 1859. Nor do they anywhere explain that this book can be considered the founding text for the science of evolutionary biology. Imagine a user coming to the catalog with no prior knowledge of Darwin (*): they might logically conclude that this is the work of a current scientist, or even a synthesis of arguments around the issue of evolution. From the second book above one could conclude that Darwin hangs with Richard Dawkins; maybe they have offices near each other in the same university.

This may seem absurd, but it is no more absurd than the paucity of information that we offer to users of our catalogs. The description of these books might be suitable to an inventory of the Amazon.com warehouse, but it's hardly what I would consider to be a knowledge organization service. The emphasis in cataloging on description of the physical item may serve librarians and a few highly knowledgeable users, but the fact that publications are not put into a knowledge context makes the catalog a dry list of uninformative items for many users. There are, however, cataloging practices that do not consider describing the physical item the primary purpose of the catalog. One only needs to look at archival finding aids to see how much more we could tell users about the collections we hold. Another area of more enlightened cataloging takes place in the non-book world.

The BIBFRAME AV Modeling Study was commissioned by the Library of Congress to look at BIBFRAME from the point of view of libraries and archives whose main holdings are not bound volumes. The difference between book cataloging and the collections covered by the study is much more than a difference in the physical form of the library's holdings. What the study revealed to me was that, at least in some cases, the curators of the audio-visual materials have a different concept of the catalog's value to the user. I'll give a few examples.

The Online Audiovisual Catalogers have a concept of primary expression, which is something like first edition for print materials. The primary expression becomes the representative of what FRBR would call the work. In the Darwin example, above, there would be a primary expression that is the first edition of Darwin's work. The AV paper says "...the approach...supports users' needs to understand important aspects of the original, such as whether the original release version was color or black and white." (p.13) In our Darwin case, including information about the primary expression would place the work historically where it belongs.
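To make the idea concrete, here is a minimal sketch of how a catalog display might use a primary expression to date the work. It assumes a hypothetical record structure; the class and field names below are invented for illustration and do not come from the OLAC, FRBR, or BIBFRAME models.

```python
from dataclasses import dataclass


@dataclass
class Expression:
    # Hypothetical fields, for illustration only.
    title: str
    year: int


@dataclass
class Work:
    author: str
    primary: Expression  # stands in for the "primary expression"


def display_line(work: Work, edition: Expression) -> str:
    """Show the edition's date and, when it differs, the original date too."""
    line = f"{work.author}. {edition.title} ({edition.year})"
    if edition.year != work.primary.year:
        line += f". First published {work.primary.year}"
    return line


origin = Work("Darwin, Charles", Expression("On the origin of species", 1859))
print(display_line(origin, Expression("The origin of species", 2015)))
```

With something like this, the 2015 edition would no longer look like a new book: its display line would carry the 1859 date along with it.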

Another aspect of the AV cataloging practice that is included in the report is their recognition that there are many primary creator roles. AV catalogers recognize a wider variety of creation than standards like FRBR and RDA allow. With a film, for example, the number of creators is both large and varied: director, editor, writer, music composer, etc. The book-based standards have a division between creators and "collaborators" that not all agree with, in particular when it comes to translators and illustrators. Although some translations are relatively mundane, others could surely be elevated to a level of being creative works of their own, such as translations of poetry.

The determination of primary creative roles and roles of collaboration is not one that can be made across the board; not all translators should necessarily be considered creators, and not all sound incorporated into a film deserves to get top billing. The AV study recognizes that different collections have different needs for description of materials. This brings out the tension in the library and archives community between data sharing and local needs. We have to allow communities to create their own data variations and still embrace their data for linking and sharing. If, instead, we go forward with an inflexible data model, we will lose access to valuable collections within our own community.

(*) You, dear reader, may live in a country where the ideas of Charles Darwin are openly discussed in the classroom, but in some of the United States there are or have been in the recent past restrictions on imparting that information to school children.


Brandon Nordin said...

Would also suggest that the de facto cataloguing model is Amazon, with a mix of self generated, publisher and third party, plus user supplied reviews of wide scope, quality and quantity, backed with both structured metadata and immediate browse (see inside the book) capability. (Not to mention the ability to create annotated (wish)lists within the catalog for future (and potentially public) reference.) Surely this would be most end users' cultural expectations of the type of info an improved library catalog would/should provide.

Anonymous said...

Catalogers now have tools to record data – data sharing and linking to data created by other communities is possible. The LCNAF record representing the Aeneid of Virgil (http://id.loc.gov/authorities/names/n81013510) provides a wealth of data and links out to VIAF. But you’re right that catalogers need to add the data to the authority record and the relationship data (http://id.loc.gov/authorities/names/n81105854) to catalog records to at least give users a chance to “place the work historically where it belongs.”
Thank you, Karen!

Karen Coyle said...

Brandon, Amazon definitely is a model of what users now expect. That said, I see Wikipedia as another model because it creates links between entries and provides a great deal of context.

And, speaking of Wikipedia, in response to Anon - there is information in the authorities records but it is hidden away. The difficulty is that the information is mostly in the form of textual notes, and the notes themselves are not very granularly coded. So these two notes get the same tag:

- found: Encyc. Brit. (Aeneid)
- found: Britannica online, August 18, 2014 (Aeneid, Latin epic poem written from about 30 to 19 bce by the Roman poet Virgil. Composed in hexameters, about 60 lines of which were left unfinished at his death, the Aeneid incorporates the various legends of Aeneas and makes him the founder of Roman greatness. The work is organized into 12 books that relate the story of the legendary founding of Lavinium (parent town of Alba Longa and of Rome).)

Wikipedia puts information in "infoboxes" where each bit of data gets its own coding. All of this goes into Wikidata, and for the Aeneid you get https://www.wikidata.org/wiki/Q60220. Coded there is author, characters, original date, country of origin, links to editions, original language... all with their own identifiers. So we need to create something similar from the library data. In fact, there are efforts in Wikipedia and Wikidata to get better at encoding bibliographic items, including citations (of which Wikipedia has a plethora), and it would be great for the library community, in general, to be more involved in this. See: https://www.wikidata.org/wiki/Wikidata:WikiProject_Books
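The difference between a textual note and coded data can be sketched roughly as follows. The note is abridged from the one quoted above; the dictionary keys are illustrative labels of my own, not actual Wikidata property identifiers, and the structure is an assumption rather than how either system stores its data.

```python
# Free-text note: the author and language are buried in prose.
note = ("found: Britannica online, August 18, 2014 (Aeneid, Latin epic poem "
        "written from about 30 to 19 bce by the Roman poet Virgil.)")

# Coded statements: each value sits under its own key.
# The keys are illustrative labels, not real Wikidata property IDs.
coded = {
    "author": "Virgil",
    "original_language": "Latin",
}


def lookup(record: dict, field: str):
    """Return a coded value directly; no text parsing is needed."""
    return record.get(field)


print(lookup(coded, "author"))
# The name does occur in the note, but nothing marks it as the author.
print("Virgil" in note)
```

A program can answer "who is the author?" from the coded record with a single lookup. From the note it can only find that the string "Virgil" occurs somewhere in the text.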

Yuimi Hlasten said...

As a cataloger, I always enter 775 fields for different editions, with original publisher and pub date.
But neither our ILS nor WorldCat Local shows 775 fields, so users can't see any hierarchical information I entered. Plus even if they did show them, they would display as "reproduction of (manifestation)" "also issued as" "equivalent (manifestation)", etc., and probably none of them would make sense to users.
Uniform titles are not required as long as the title remains the same. So yes, basically you would think this is a new book.
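
As a rough sketch of what a friendlier display of a 775 field could look like, assume the field arrives as a list of (subfield code, value) pairs. The sample data and the function name are hypothetical. The sketch also assumes the 775 points at the original edition, which is not true of every 775.

```python
import re

# A hypothetical 775 field as (subfield code, value) pairs.
# MARC 21 subfield codes: i = relationship information, t = title,
# d = place, publisher, and date of publication.
field_775 = [
    ("i", "Reproduction of (manifestation):"),
    ("t", "On the origin of species"),
    ("d", "London : John Murray, 1859"),
]


def friendly_775(subfields):
    """Turn a 775 linking entry into a plain-language note, ignoring subfield i."""
    values = dict(subfields)
    imprint = values.get("d", "")
    # Look for a four-digit year between 1500 and 2099 in the imprint.
    year = re.search(r"\b(1[5-9]\d\d|20\d\d)\b", imprint)
    if year:
        return f"Originally published in {year.group(1)} ({imprint})"
    return f"Earlier edition: {values.get('t', 'title not recorded')}"


print(friendly_775(field_775))
```

The relationship label in subfield i is dropped in favor of wording a user is more likely to understand.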