Yet there seems to be a real barrier when it comes to libraries being open with their own catalog data. This seems rather odd because most libraries' catalogs are available for open access on the web. But try asking a library for some or all of that data and you suddenly hit a wall. Libraries don't like to say "no," so there's a lot of hemming, hawing, and "we'll think about that." But an out-and-out "yes" is rare.
I speak about this based on my experience with the Open Library, a project of the Internet Archive. The OL wants to create a humongous bibliographic database (right now only of book records) on the Web, using a wiki-like front-end that would allow anyone to edit the bibliographic data. To me the most interesting aspect of the project is that it would bring bibliographic entries to the web's surface; they could be the subject of links from other documents, and potentially could begin the creation of a bibliographic web linking books to each other. But in spite of putting out a call for bibliographic records and making personal and direct pleas to a number of libraries, the OL has received only a lukewarm response.
To be sure, any data submitted to the Internet Archive becomes publicly available. And at some time in the future it may be possible for people to download individual bibliographic records for their own use. I know that there is some speculation that OCLC "owns" the data and that the OCLC license may not allow this level of re-distribution. I also know that some records in library databases are covered by vendor licenses (other than OCLC). Presumably those could be excluded from the data set. But it still surprises me how un-open libraries are with their own data, given how much they encourage others to be open with theirs.
During the comment period for the Future of Bibliographic Control report, the Open Knowledge Foundation posted a call for library data openness on the OKF wiki. Many dozens signed their names to the OKF's call for open licensing of bibliographic data, including important people like Larry Lessig and Tim O'Reilly. The arguments in OKF's document seemed pretty clear to me:
Bibliographic records are a key part of our shared cultural heritage. They too should therefore be made available to the public for access and re-use without restriction. Not only will this allow libraries to share records more efficiently and improve quality more rapidly through better, easier feedback, but will also make possible more advanced online sites for book-lovers, easier analysis by social scientists, interesting visualizations and summary statistics by journalists and others, as well as many other possibilities we cannot predict in advance.
None of this was included in the final report.
Libraries complain that they don't get the kind of attention that Web resources like Google and OL get. They complain about the lack of transparency of the commercial data vendors: that Google won't say how many books it has online, nor will it reveal how it is trying to rank book retrievals. Libraries could be doing this experimentation themselves, and in the open, if their data were on the Web. They could be visible, out there, allowing incredible innovation to happen based on the hundreds of years of collecting materials and creating relatively consistent metadata for those materials. Their reluctance to let their data out of the databases just baffles me, and isn't in concert with their stated goals of open access.