Coyle's InFormation: 02/01/2011

Sunday, February 06, 2011

SkyRiver Replies

Following up on the early stages of what will probably be an interminable legal case (it's easy to understand why one should avoid going to court whenever possible), SkyRiver has replied to OCLC's Motion to Dismiss.[1] [2] This is the first document I have seen that, to me, clearly lays out SkyRiver's basic contentions. Note that the major part of this document is the usual lawyerly recitation of cases supporting one statement or the other, and I have no idea what the legal arguments mean or whether they are convincing. But here are SkyRiver's primary facts as this document lays them out:

1. OCLC has monopolies in the US academic library market

"OCLC is monopolizing three product or service markets—bibliographic data of libraries’ holdings; cataloging service; and interlibrary lending service (ILL). OCLC is attempting to monopolize a fourth service market—integrated library systems (ILS)." p. 1

2. OCLC has used those monopoly positions to prevent competition

"Since at least 1987, OCLC has demanded that its member libraries agree to terms of membership that prohibit sharing the metadata of their own library holdings contributed to OCLC’s bibliographic database known as WorldCat with any for-profit firms for commercial use and require member libraries to use OCLC’s services. OCLC has imposed these membership terms to prevent the development of competing bibliographic databases, cataloging services or ILL services by erecting barriers to entry in these three markets. OCLC is also using its monopoly power in these three markets in its attempt to monopolize the ILS market." p.1

3. OCLC has targeted SkyRiver's business by using punitive pricing for libraries that use SkyRiver's cataloging services

"OCLC’s conduct has injured SkyRiver by deterring libraries from using its service, and has injured libraries that are using SkyRiver to reduce costs by preventing those libraries from uploading their new records into WorldCat at the price charged to everyone except SkyRiver users." p. 2

Beyond that, the arguments become more complex. In particular, there is the issue of the 20+ years that OCLC has been building up WorldCat under a policy that has prohibited (according to the response, p. 4) libraries from sharing their cataloging data with for-profit entities. With no other non-profit entity providing cataloging services to US academic libraries, the records are essentially locked up in WorldCat and no one else can enter the market.

This brings me to a point that I got wrong in a previous post, which is that SkyRiver is asking for access to the WorldCat database. The argument there, if I read it correctly, is that WorldCat is the only major source of academic library holdings that can be used for an effective ILL service. WorldCat is the result of monopoly practices. To allow for competition, WorldCat (e.g. bibliographic data and holdings) should be made available for a reasonable price to competing ILL providers. While this seems jarring at first, the more I think about it the more sense it makes.

What the response does not say explicitly, and perhaps it would be irrelevant in a legal case, is that one could look on WorldCat as a shared community resource, not the property of OCLC. In fact, OCLC uses this kind of argument in its record use policy, but somehow arrives at the conclusion that WorldCat should not be used to foster non-OCLC library services. It seems easy to make the opposite argument, which would be that WorldCat could be the basis for a wide range of services that would benefit libraries, even if they do not come from OCLC. Imagine if OCLC were to set non-discriminatory pricing for use of WorldCat and anyone could make use of the WorldCat data. There could be a "share-alike" clause that would require those users to return pertinent information to the bibliographic collective. WorldCat would grow, and the range of products and services available to libraries would grow. This seems like a GOOD THING.

I realize it may not be easy to do the analysis that would lead to pricing that both fosters sharing and makes it possible even for small businesses* to arise in the library market. It should be possible, given today's technology, to do this efficiently but we know very little about the cost structure of WorldCat. It is clear that there are many activities relating to the care and management of that database, all intertwined with OCLC services and valuable research projects, as well as linked deeply into tens of thousands of library systems around the world. Should the court require OCLC to open WorldCat for use, we need to see a transition that is non-destructive to the library ecology.

* The reason I emphasize small businesses is that I believe that smaller, more nimble vendors could exist to serve the needs of specialized and smaller libraries which are not OCLC members at this time. I see the potential to widen the community of sharing, even to include more non-library institutions and businesses. Another GOOD THING, IMO.

Tuesday, February 01, 2011

Knowledge Organization in Norway

Last week I attended Kunnskapsorganisasjonsdagene 2011 in Oslo. (Knowledge Organization 2011 conference.) The topics ranged around linked data, the FRs (FRBR, FRAD, and FRSAD), and RDA. I will try to give some flavor of the event, as I experienced it. That last caveat is because only three of the presentations were in English, the rest in Norwegian, and how much I understood really depended on whether there were slides with a lot of diagrams. I was somewhat in the position of the dog in this cartoon:

with "Ginger" being replaced by "RDA", "MARC", and "Karen Coyle."

I was the first speaker of day 1, and presented on the topic of RDA and linked data. The next talk was from the Pode project, a research project bringing together FRBR and RDF concepts and linking data to DBpedia, VIAF, and Dewey in RDF. I got the impression that while experimental, the results are sophisticated, particularly because of the mix of data sources the project is working with. The afternoon had an introduction to (and, from the moments of laughter, some commentary on) RDA by Unni Knutsen. There appears to be an equal amount of interest and skepticism about RDA. I am not sure that AACR had this same effect outside of the Anglo-American library community, and would be very interested to hear more about the impact of Anglo-American cataloging rules, especially whether this impact is greatly increased due to the degree of international sharing of bibliographic data.

Maja Žumer, of the University of Ljubljana, Slovenia, a member of the FRSAD working group, gave the best explanation of the meaning behind FRSAD's "thema" and "nomen" that I have yet heard. It is beginning to make sense. Maja is the co-author of a study on FRBR and library user mental models that was published in the Journal of Documentation in two parts. (Preprints [1] [2]) I will link to her slides when they are made available. A key take-away is that FRBR, FRAD and FRSAD have taken very different approaches that will now need to be reconciled. FRBR presents a closed universe of bibliographic data, with only FRBR entities allowed to be subjects of bibliographic resources. FRSAD essentially opens that up to anything in the known universe. Among other things, this creates a possibility to link non-bibliographic concepts to described bibliographic entities. Or, at least, that's how I read it.

I was asked to do a short wrap-up of the first day, and as I usually do I turned to the audience for their ideas. Since we realized we are short on answers and long on questions, we decided to gather some of the burning questions. Here are the ones I wrote down:

If not RDA, what else is there?
Are things on hold waiting for RDA? Are people and vendors waiting to see what will happen?
Why wasn't RDA simplified?
How long will we pay for it?
Will communities other than those in the JSC use it?
Can others join JSC to make this a truly international code?
Should we just forget about this library-specific stuff and use Dublin Core?

I suspect that there are many others wondering these same things.

The next day there were more interesting talks. One was entitled: Må MARC dø? by Magnus Enger of Libriotech. The title means: Must MARC die? The first slide was one that needs little translation. It said simply:

JA!

Tom Scott of the BBC gave a visually stunning talk about the data he manages around the nature and wildlife programming. He explained the reasons for pulling data from a variety of sources, including Wikipedia. (See this page -- and note that it encourages readers to improve the Wikipedia entry if they feel it is incorrect or insufficient.)

In another excellent talk, which I hope will come out in an English translation, Kim Tallerås and David Massey did a step-by-step walkthrough of moving from MARC-encoded data into a fully linked data format, complete with URIs. There was another talk focusing on the Norwegian WebDewey from the national library, with examples of converting that data to RDF.
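
To give a flavor of what "moving from MARC to linked data, complete with URIs" can mean, here is a minimal sketch in Python. It is not taken from the talk and is far simpler than anything presented there. The record layout (a dict keyed by MARC tag and subfield), the example.org namespace, and the function names are all assumptions made for illustration; only the Dublin Core and FOAF property URIs are real vocabularies.

```python
from urllib.parse import quote

# Hypothetical namespace for the URIs we mint; a real project would use its own.
BASE = "http://example.org/"
# Real, published vocabularies used for the properties.
DCT = "http://purl.org/dc/terms/"
FOAF = "http://xmlns.com/foaf/0.1/"


def literal(value):
    """Format a string as a plain N-Triples literal, escaping \\ and quotes."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def marc_to_triples(record):
    """Turn a simplified MARC-like record into (subject, predicate, object) triples.

    Assumed input shape: {"id": str, "fields": {(tag, subfield): value}}.
    """
    work = f"<{BASE}record/{quote(record['id'])}>"
    triples = []

    title = record["fields"].get(("245", "a"))  # 245 $a: title
    if title:
        triples.append((work, f"<{DCT}title>", literal(title)))

    author = record["fields"].get(("100", "a"))  # 100 $a: main entry, personal name
    if author:
        # The author becomes a resource with its own URI, not just a text string.
        person = f"<{BASE}person/{quote(author)}>"
        triples.append((work, f"<{DCT}creator>", person))
        triples.append((person, f"<{FOAF}name>", literal(author)))

    return triples


def to_ntriples(triples):
    """Serialize the triples one per line in N-Triples syntax."""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in triples)


sample = {
    "id": "rec001",
    "fields": {("245", "a"): "Moby Dick", ("100", "a"): "Melville, Herman"},
}

print(to_ntriples(marc_to_triples(sample)))
```

The point of the sketch is the shift in the author field: instead of a text string inside the record, the person gets a URI that other data can link to. A real conversion would link to an existing identifier (for example from VIAF) rather than minting one from the name string, which is done here only to keep the example self-contained.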

About that time I ran out of steam, but I will post a link here when the presentations are up online. In spite of the language barrier, much content is accessible from these talks.

As is often the case, I was very impressed by the quality of the experimentation being done by people who really want to see library data transformed and made web-able. I think we are at the start of a new and highly fruitful phase for libraries.