Monday, March 19, 2007

Unintended consequences

I was working at the University of California in the early 1980's when the UC union catalog, MELVYL, was developed. Shortly after MELVYL became available as one of the early online public access catalogs, we obtained a copy of NLM's Medline database, consisting of articles and books in a wide variety of fields related to medicine. This was the first article database that we were making available. At that time, the only people who had access to Medline were medical researchers in the four or five medical centers at the university, and they accessed it via the Dialog search service. Dialog charged by the minute and was relatively pricey. Few members of the university community had access to Dialog's databases in any subject area because of the cost.

I don't know how many minutes or hours of searching were done monthly on Medline before we added the database to the university's library system, but within a few years the number of searches on Medline was rivaling the number of searches in the entire union catalog of the nine university campuses. It was heavily used even on those campuses that had no medical school. Had everyone developed a sudden interest in the bio-medical sciences? Perhaps, but not to that degree. I think that we had created a monster of "availability." As the only freely available online database of articles, Medline became the one everyone searched.

(If anyone is looking for a master's thesis topic, try looking at the citations in dissertations granted at the University of California for the period 1985-1995, and compare them to those of the previous decade. Count how many of the cited articles come from journals that are indexed in NLM's database. I have a feeling that it will be possible to see the "Medline-ization" of the knowledge produced by that generation of scholars, from architecture to zoology.)

When we make materials available, or when we make them more available than they have been in the past, we aren't just providing more service -- we are actually changing what knowledge will be created. We tend to pay attention to what we are making available, but not to think about how differing availability affects the work of our users. We are very aware that many of us are searching online and not looking beyond the first two screens, which produces an idiosyncratic view of the information universe. But we don't see when libraries create a similar situation by making certain materials more available than others (for example scanning all of their out-of-copyright works and making them available freely as digital texts, while the in-copyright books remain on the shelves of the library).

There's a discussion going on at the NGC4LIB list about the meaning of "collection." What is a library collection today? Is it just what the library owns? Is it what the library owns and also what it licenses? Does it include some carefully selected Internet resources? Some have offered that the collection is whatever the library users can access through the library's interface. I am beginning to think that access is a tricky concept and it is inevitably tied up with the realities of a library collection. Users will view the library's collection through the principle of least effort. In the user view, ease of access trumps all -- it trumps quality, it trumps collection, it trumps organization. So we can't just look at what we have -- we have to look at what the user will perceive as what we have, and that perception will necessarily be tempered by effort and attention. To our users, what the library has will be what is easiest to locate and fastest to arrive.

In other words, our collection is not a quantity of materials. The collection is a set of services built around a widely divergent set of resources. To the user, the services are the library, especially because any one user will see only a tiny portion of what the library has to offer. The actual collection -- those thousands or millions of library-owned items -- is not what the user sees. The user sees the first two screens of any search.

Hopefully, they are not in main-entry alphabetical order.


Anonymous said...

Hope you don't mind me directing people here to the same article I recommended on ngc4lib. Think it's still pretty much as good a starting point on 'what do collections mean now' NOW as it was 7 years ago. To me, that's a depressing indication of how little we've done, in thought or in action, since then.

Lee, H-L (2000). What is a collection? Journal of the American Society for Information Science, 51(12), Oct 2000, p. 1106-13

"Unfortunately, the gaps in a library's OPAC cause a major hurdle for
users. They frustrate users by making part of the collection
inaccessible from the main entry point into the collection: the OPAC. It
also burdens users by forcing them to switch among a number of different
information retrieval systems (IRSs) to find all materials in a
library's collection. Although it is desirable from the user's
perspective to access all information items through an integrated IRS,
this is not the case at present in many American libraries. In system
design, information professionals - librarians in particular - need to
take this consideration seriously. In other words, an integrated
retrieval system should be an indispensable element of a well-developed

Michael McCully suggests you can find it free on the web via Google Scholar.

Jonathan Rochkind

Roy Tennant said...

The article above isn't actually all that easy to find via Google Scholar, but it's there (it looks like by mistake). See