Thursday, February 21, 2008

Girls and the Internet

Today's New York Times has an article about the dominance of young females on the Internet. According to the article, 35% of girls (age 12-17) blog, compared to 20% of boys. And 32% of girls create or work on their own Web pages, compared to 22% of boys. Girls also greatly outnumber boys in having social networking sites. The article gives examples of girls who create CSS code and advise others on creating it, and who design icons and animated icons. Some of these young women are making money from their sites (at least some money; no figures are given).

So where was this article placed in the paper? In the business section with other "Internet entrepreneurs"? In the technology section? No, it was in Fashion & Style, right under an article on wedding dresses.

*sigh*

Tuesday, February 19, 2008

Libraries -- Open

I think of libraries as the quintessential open institutions, at least in the U.S. Most libraries are physically open to the public (even those in private institutions), and many serve as community spaces focused on learning, exploring, and simply being. They also promote the open use of what we might call "cultural heritage resources." Libraries fight for open access, and they fight censorship all the way to the Supreme Court.

Yet there seems to be a real barrier when it comes to libraries being open with their own catalog data. This seems rather odd because most libraries' catalogs are available for open access on the web. But try asking a library for some or all of that data and you suddenly hit a wall. Libraries don't like to say "no," so there's a lot of hemming, hawing, and "we'll think about that." But an out-and-out "yes" is rare.

I speak about this based on my experience with the Open Library, a project of the Internet Archive. The OL wants to create a humongous bibliographic database (right now only of book records) on the Web, using a wiki-like front-end that would allow anyone to edit the bibliographic data. To me the most interesting aspect of the project is that it would bring bibliographic entries to the web's surface; they could be the subject of links from other documents, and potentially could begin the creation of a bibliographic web linking books to each other. But in spite of putting out a call for bibliographic records and making personal and direct pleas to a number of libraries, the OL has received only a lukewarm response.

To be sure, any data submitted to the Internet Archive becomes publicly available. And at some time in the future it may be possible for people to download individual bibliographic records for their own use. I know that there is some speculation that OCLC "owns" the data and that the OCLC license may not allow this level of re-distribution. I also know that some records in library databases are covered by vendor licenses (other than OCLC). Presumably those could be excluded from the data set. But it still surprises me how un-open libraries are with their own data, given how much they encourage others to be open with theirs.

During the comment period for the Future of Bibliographic Control report, the Open Knowledge Foundation posted a call for library data openness on the OKF wiki. Many dozens signed their names to the OKF's call for open licensing of bibliographic data, including important people like Larry Lessig and Tim O'Reilly. The arguments in OKF's document seemed pretty clear to me:

"Bibliographic records are a key part of our shared cultural heritage. They too should therefore be made available to the public for access and re-use without restriction. Not only will this allow libraries to share records more efficiently and improve quality more rapidly through better, easier feedback, but will also make possible more advanced online sites for book-lovers, easier analysis by social scientists, interesting visualizations and summary statistics by journalists and others, as well as many other possibilities we cannot predict in advance."

None of this was included in the final report.

Libraries complain that they don't get the kind of attention that Web resources like Google and OL get. They complain about the lack of transparency of the commercial data vendors; that Google won't say how many books it has online nor will it reveal its work on attempts to rank book retrievals. Libraries could be doing this experimentation themselves, and in the open, if their data were on the Web. They could be visible, out there, allowing incredible innovation to happen based on the hundreds of years of collecting materials and creating relatively consistent metadata for those materials. Their reluctance to let their data out of the databases just baffles me, and isn't in concert with their stated goals of open access.

Sunday, February 03, 2008

The ILS minus the catalog

Most of the action today in library user services centers on separating the user interface from the integrated library system (ILS). This seems odd, perhaps, since only two decades ago the integration of all of the functions of library systems was seen as a real step forward. Until then, one system had handled acquisitions, another circulation, and another cataloging. Many functions, such as serials check-in and bindery management, were not managed through automation. This situation had a number of problems: different data about the same book were stored in multiple databases or in card files, leading to inconsistencies throughout the system; the data had to be keyed or copied multiple times; system-wide updates were nearly impossible. The "integration" of the integrated library system was the creation of a single database for bibliographic and management data, where all of the information about an item would be stored once and only once. This also was the first time that the full bibliographic record was linked to the library management functions. Independent systems like acquisitions and circulation systems primarily used brief records only. At best, these brief records contained an identifier (such as the item's barcode or call number) that could connect it to other records in other systems. Sometimes even that wasn't possible.
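For readers who think better in code, here is a minimal sketch of the "stored once and only once" idea. It is not how any actual ILS is built; the class names, fields, identifiers, and sample record are all hypothetical. Each management function keeps only an identifier that points at the single shared bibliographic record.

```python
from dataclasses import dataclass, field


@dataclass
class BibRecord:
    """Full bibliographic record, stored once in the shared database."""
    record_id: str
    title: str
    author: str
    call_number: str


@dataclass
class IntegratedLibrarySystem:
    """One database shared by every module (all names are hypothetical)."""
    bib_records: dict = field(default_factory=dict)  # record_id -> BibRecord
    orders: dict = field(default_factory=dict)       # order_id -> record_id
    loans: dict = field(default_factory=dict)        # barcode -> record_id

    def add_record(self, record):
        self.bib_records[record.record_id] = record

    def place_order(self, order_id, record_id):
        # Acquisitions keeps only a pointer, not its own copy of the data.
        self.orders[order_id] = record_id

    def check_out(self, barcode, record_id):
        # Circulation likewise keeps only a pointer.
        self.loans[barcode] = record_id

    def title_for_order(self, order_id):
        return self.bib_records[self.orders[order_id]].title

    def title_for_loan(self, barcode):
        return self.bib_records[self.loans[barcode]].title


ils = IntegratedLibrarySystem()
ils.add_record(BibRecord("b1", "Moby Dick", "Melville, Herman", "PS2384 .M6"))
ils.place_order("o1", "b1")
ils.check_out("31234000012345", "b1")

# A single correction to the shared record is seen by every module.
ils.bib_records["b1"].title = "Moby-Dick, or, The Whale"
```

In the pre-integration world, the acquisitions and circulation systems would each have held their own brief copy of the title, and the correction would have had to be keyed into each one separately.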

In theory, this database integration is a dandy way to organize your data and the activities that use your data. In reality, the user interface suffered in this design. Not that anyone purposely shorted the user interface, but in a world of scarcity, there are things that just have to get done, and then there are other things. In the "have to" category, libraries have to make purchases, manage accounts, receive and check in serials, perform interlibrary loans, and check items out to borrowers. These are clear, quantitative, auditable library functions, and ones that library administrators focus on. These are the functions that can have dollar amounts attached to them in terms of staff time. These functions are the inside view of the library, the library being a library.

User success, on the other hand, is qualitative, hard to define, and does not have a direct effect on the library's bottom line. We count the countables, like numbers of bibliographic records, items circulated, and online database accesses, but there appears to be no penalty for a lousy user interface and no premium for the creation of a good one. If at any point there is a conflict between quantitative library management and qualitative user service, my gut feeling is that the latter loses out.

Users have everything to gain from the separation of the user interface from the library management system. Libraries, however, are in a bit of a bind. The new "user interface on top of the ILS" adds features for users but it doesn't result in any less work in the ILS. Libraries are still hanging all of their management functions off of full bibliographic records in the catalog (which the users no longer see). Librarians still see the data creation functions in the early management steps of acquisitions and receipt as a direct line to the standard bibliographic record that in the end will appear on the users' screens. They are still storing the full bibliographic records in a local database, although these records are siphoned off nightly to the "real" user interface.

Much of the objection to using more EDI (electronic data interchange) functions with our vendors is that their data doesn't conform to library cataloging. Yet our library management systems are getting further from the user interface. We may need to rethink the library management workflow as well as the basis of our cataloging activity. What could we achieve if we moved cataloging and catalogs out of our individual library databases to the network level? Could this provide the basis for increased sharing of the cataloging effort? Are there other efficiencies that could be gained in the "back room" functions of purchasing and managing the library inventory?