Thursday, March 26, 2009

LC discovers infinity

If you were at ALA Midwinter in Denver (January, 2009) you may have been in one of the meetings where the Library of Congress announced its intention to atone for the fiasco. In case you missed that, Ed Summers of LC created an online version of the Library of Congress Subject Heading authority records, re-organized as a SKOS vocabulary and available for linking on the open Web. After the site had been available for about six months (beginning in May of 2008), Ed was asked by his employer to take it down on December 18, 2008. This was in spite of the fact that the data had been out there long enough to have a number of users, and that the removal broke existing systems that had developed around the data.

[Note: the service has since been re-born, hosted by Talis.]

The outcry in the community was strong, including a reply to Ed's blog post by Sir Web himself, Tim Berners-Lee. The Library of Congress must have been suitably embarrassed.

Thus the announcement at Midwinter that LC not only understands the value of linked open access to LCSH, but also recognizes that all of the vocabularies managed by LC -- from the name authorities to the lists of document types, languages, locations, etc. -- need to be openly available in a format suitable for inclusion in Web services. LC has created a web site to host these vocabularies. On that site they say:
Initially, within 6 to 8 weeks, the Library of Congress will release its first offering: the Library of Congress Subject Headings. This will be an almost verbatim re-release of the system and content once found at the popular prototype service.
They also say:
We aim to make resources available on this site within 6-8 weeks. Check this site regularly for more updates as we continue to develop this service!
The page is dated 1/22/09. My calculations show that 9 weeks have passed. OK, that's only one week over their stated deadline. But nothing on the page has changed. No resources have been made available. An "almost verbatim" re-release should not be too hard, given that Ed had written the code and has made it publicly available.

But even today, the promised service is 6-8 weeks away. It may stay that way for a long time. Maybe even forever.

Why does this matter? It matters because the availability of these vocabularies is essential for the library world to move forward. Some of us have been asking LC to put the vocabularies online in a machine-actionable format for a very long time. The Dublin Core community worked with LC to create a machine-actionable and URI-identified version of the MARC role terms as early as 2005. You can't find this linked from any of the MARC documentation. Some of us brought up the topic ad nauseam at MARBI meetings, but to no avail. Now LC seems to have "gotten it" conceptually but they have yet to show us that they can deliver.

I may seem to be undeservedly impatient on this score, but it's not that we have been waiting for this for 9 weeks: we've been waiting for years. And quite honestly, this is not rocket science, nor does LC lack guidance for how to manage this data. In fact, they could use the NSDL Metadata Registry, or, if they insist on hosting this themselves, the Registry's source code is available. Quite frankly, if LC does not prove to us soon that it can perform this necessary function, I feel that we are quite justified in going forward without them, registering the vocabularies where they can be used and managed by anyone who needs them, and going forward with a transformation of library data that will meet 21st century needs.


Jonathan said...

Wow, when I read that comment the first time it didn't click for me who timbl was.

Getting Tim or W3C to write a letter to LC asking when the 6-8 weeks will be up certainly couldn't hurt.

Anonymous said...

Perhaps the answer ultimately becomes

Diane said...

Great post, Karen--you've managed to pinpoint perfectly the frustration we're all feeling as we sit here waiting for LC to move on their promises. It's certainly true that LC could use the NSDL Registry software (as other national libraries have) or our service, with their domain name, so that anything they register could be moved to a bespoke site when they get their own registry going. We'd be happy to help in any way we can.

Part of my frustration with the current situation is that we've no real idea what's going on. Are the issues technical, political, or a combination? Who is in a position at LC to cut through the barriers on this?

This kind of thing really drags down the rest of the community, since nobody can really plan for the future with this sort of great silence from those who should be providing leadership.

Anirvan said...

Karen, thanks for continuing to push on this.