Monday, July 25, 2011

RDA in XML - why not give it a shot?

Example of RDA in XML / Example2 of RDA in XML

There's a lot of talk about what we will do with RDA as data - what format we will use, how it will look to users, etc etc etc. In fact, the options are legion. The key point is that we don't have to decide on just ONE WAY to carry and store RDA data elements, as long as we follow a few rules.

As an experiment, I have coded a very simple bibliographic record using two different possible ways to encode RDA in XML. For the XML data elements I use the RDA elements from the Open Metadata Registry. These elements are defined in OWL, and therefore are compatible with semantic web applications. Their use in XML (and by that I mean non-RDF XML) may be a bit questionable, yet at the same time XML may be a good transition format from our current data to a full RDF-based implementation. I created two XML files: one in which I used text values, much as one would in MARC, and one in which I used URIs for values that have been encoded as vocabularies. Neither has a schema because creating a schema for RDA is a huge undertaking. If there is interest in this method, however, it might be worth... undertaking.
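To give a rough feel for the difference between the two approaches, here is a minimal sketch in Python that builds the same tiny record both ways. It is not the content of the actual example files: the namespace URI, the element names (preferredTitleForTheWork, languageOfExpression), the "resource" attribute and the language URI are all hypothetical placeholders chosen for illustration, not the real Open Metadata Registry identifiers.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace, used here only as a stand-in for the
# RDA element namespace in the Open Metadata Registry.
RDA_NS = "http://example.org/rda/elements/"
ET.register_namespace("rda", RDA_NS)


def build_record(title, language, language_uri=None):
    """Build a tiny record; element names are illustrative assumptions."""
    record = ET.Element("record")

    title_el = ET.SubElement(record, f"{{{RDA_NS}}}preferredTitleForTheWork")
    title_el.text = title

    lang_el = ET.SubElement(record, f"{{{RDA_NS}}}languageOfExpression")
    if language_uri is None:
        # Variant 1: a plain text value, much as one would enter in MARC.
        lang_el.text = language
    else:
        # Variant 2: point to a vocabulary term by URI instead of text.
        lang_el.set("resource", language_uri)
    return record


# The sample values are made up; the language URI is a placeholder.
text_record = build_record("Hamlet", "French")
uri_record = build_record(
    "Hamlet", "French", "http://example.org/vocab/language/fre"
)

print(ET.tostring(text_record, encoding="unicode"))
print(ET.tostring(uri_record, encoding="unicode"))
```

In the first variant the language is only a string that a program has to interpret; in the second it is an identifier that can be looked up or matched exactly, which is what makes it a plausible stepping stone toward RDF.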

The resulting files don't fit well in a blog post, so I created a page with a side-by-side comparison. Please have a look. Feel free to comment or send me suggestions, corrections, or other ideas on how to do this better.

6 comments:

  1. To answer some of your wonderings:

    French is the language of expression, and an element. Hamlet is the preferred title of the work. So to really flex them to their full utility, they need to be in different fields.
    Also, you have date of publication with the c1987, which indicates the use of the copyright statement to provide the date. That is something else that needs to be split into two separate fields: date of publication and date of copyright.

    I find it interesting to see this in action. As a cataloger, I would also be curious to see what an actual cataloging interface would look like, since I am quite sure we wouldn't be dealing with raw code.

  2. biba - Thanks. So do you think that in RDA there wouldn't be a separate "field" for "Hamlet. French"? That's what I can't figure out. Hamlet is obviously the title of the work, and French is the language of the Expression, so we have a heading made up of one element from FRBR Work and one from FRBR Expression, yet I don't know of a way in FRBR that would allow you to combine bits from different entities. It's very odd.

    To get some idea of what an interface might look like, go to an Open Library edit page like this one:
    http://tinyurl.com/3d2puj9
    and click on "turn on librarian mode". It's not RDA, but I think it would look like a form, and hopefully would have lots of places you could click for help, or to get lists, and it would do auto-fill and find things like similar author names. What makes it hard to design is that there are so darned many fields! So we have to find a clever way to allow the cataloger to add new fields without it being too clunky.

    Actually, I would find it very interesting if some catalogers would use the Open Library form to do some original input and give feedback on it. Ditto using the forms in LibraryThing. Does it work for you, and could it be made robust enough for the level of detail that cataloging requires?

  3. I find it interesting that there is no element for "authorized access point for the work" or for "authorized access point for the expression". If such elements existed, I think that would solve your problem with what to do with French. So the AAP for the work would be Omescu, Ion. Hamlet and the AAP for the expression would be Omescu, Ion. Hamlet. French. I would also expect each of these elements to be treated individually. So Hamlet would be in the work wrapper, as would Omescu, Ion. French would be in the expression wrapper.

    IMHO, AAPs really function in RDA as elaborate natural language labels. If you want to identify the work, expression, etc., you should just use a URI. Unless you want to use the AAPs to create browse lists ordered by author, they seem sort of pointless to me.

  4. The Open Library interface is very interesting to me. I too have been wondering what a cataloging interface would look like using XML. One of the biggest complaints I've heard about marking up records with XML is that it's too cumbersome to read and correct. I have used products like XMetal for building finding aids using EAD. XMetal has a function that can turn the XML into a more visually friendly viewing mode, where tags are color coded, which makes it easier to view and edit the data within each field. Adding new fields is relatively easy, and they automatically get color coded appropriately when you add them in. Perhaps something like that could work in a new kind of interface. I should go and try the Open Library interface out more. However, I still have my RDA training wheels on and I need much more practice.

  5. I'm not certain that that particular record is a good example, since if it is a French translation of Othello, it is done wrong in the original record. Is Omescu the author or the translator? Is this really Shakespeare's Othello or an adaptation and/or translation of the original? I may have to request it just to see what is really wrong since not only is it in my library, but we apparently did the original cataloging.

    I think the autofill idea is definitely the way it will have to be when it actually comes to cataloging in RDA, at least for some fields. And other fields will need a lot more granularity than MARC provides. (I'm thinking wistfully of an authority record format that actually allows me to choose separate fields for firstname, nickname, surname(s), date of birth, date of death, dates of activity, etc.)

    I'll see if I can't experiment a little inside Open Library. It wouldn't be difficult to do some parallel cataloging in two systems; it would allow for some direct comparison. LibraryThing -- I played with it once, but didn't find its interface easy to use. It may have improved since then -- I'll look again.

  6. If this record looks "wrong" I should do another one so that the "wrongness" doesn't get in the way. I'll try to find something else that also is a translation so we have the same situation with the uniform title/access point.

