Thursday, February 08, 2007

Catalog use data - A collective effort?

In all of the discussion around the next generation catalog and what library users really want, I've had a hunger for some real catalog use data. My first instinct was to think: "Someone needs to get a big grant to gather up statistics and analyze them." Then I remembered: we're in the "let's all get together and do it ourselves" era. I also realized that I do not want or need the perfect, scientifically sound study of catalog use; I want some data, reasonably reliable, but also growing and being updated over time. I want something to think about, not something that pretends to give me all of the answers. So...

Can we (the collective we, not the royal we) create a simple way for libraries to contribute their catalog use data to a common storehouse? In particular, can we make it easy for a library to set up a profile and upload its data?

Here's the kind of data I have in mind:

  • What search types are available? Which of these search types is a default? How often is each search type used?
  • If there is an advanced search page, how often is it used?
  • If there are sort options, what is the default and how often is each one used?
  • What is the default display, and how often is each display option used?
  • How many records are displayed on average?
  • If there are facets, how often is each facet type selected?
  • What system are you using? (Vendor, brand)
  • What type of library is it? (public, academic, special)

In terms of making it easy, there could be pre-defined sets for known systems that tend to look a lot alike, like III. There could also be a list of search types, perhaps the ones defined for Z39.50.
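
To make the list above a bit more concrete, here is a rough sketch of what one library's uploaded report might look like. It is only an illustration: the class names, field names, vendor name, and all of the counts are made up for the example, and a real format would need to be worked out by the people contributing data.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Dict, List


@dataclass
class CatalogProfile:
    """Static facts about one catalog. All field names are hypothetical."""
    vendor: str               # system vendor or brand
    library_type: str         # "public", "academic", or "special"
    search_types: List[str]   # search types the catalog offers
    default_search: str       # which search type is the default


@dataclass
class UsageReport:
    """Usage counts for one reporting period, keyed by option name."""
    profile: CatalogProfile
    period: str
    search_counts: Dict[str, int] = field(default_factory=dict)
    sort_counts: Dict[str, int] = field(default_factory=dict)
    facet_counts: Dict[str, int] = field(default_factory=dict)

    def search_shares(self) -> Dict[str, float]:
        """Return each search type's share of all searches, in percent."""
        total = sum(self.search_counts.values())
        if total == 0:
            return {}
        return {
            name: round(100 * count / total, 1)
            for name, count in self.search_counts.items()
        }

    def to_json(self) -> str:
        """Serialize the whole report, profile included, for upload."""
        return json.dumps(asdict(self), indent=2)


# Invented numbers, purely to show the shape of a report.
profile = CatalogProfile(
    vendor="ExampleVendor",
    library_type="academic",
    search_types=["keyword", "title", "author"],
    default_search="keyword",
)
report = UsageReport(
    profile=profile,
    period="2007-01",
    search_counts={"keyword": 570, "title": 300, "author": 130},
)

print(report.search_shares())
print(report.to_json())
```

Keeping the profile separate from the counts means a library would describe its system once and then upload a new set of counts each period, which fits the idea of data that keeps growing over time.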

And not everyone would have to participate, just those who are interested in the future of the catalog. No coercion, no requirements. I'd be happy to see the data from a couple of dozen catalogs. Could we do it? (I volunteer to write specs and documentation, which are in my skill set.)


Anonymous said...


Are you aware of the two-year-old Library Normative Data Project that's sponsored by Florida State U and SirsiDynix? It has what you've suggested and much, much more. We have asked for participation from all of the ILS vendors but without much traction. Library schools get it for no charge.

Let me know,

Stephen Abram

Karen Coyle said...

Thanks. What I can see here (not being a member) is data that looks like the data collected by some states -- circulation, expenditures, etc. Is there any information about details of catalog use of the sort I posted here?

Emily said...

We could pull this type of data out fairly easily for our faceted catalog (Endeca). Some of those things we do already track on a monthly basis, like how often each sort option is used, how often the facets are used, etc. The data just isn't carefully aggregated and made publicly available. :)

I would be interested in seeing some type of combined data set that might help us understand user behavior. Of course, just getting data doesn't guarantee we'll be able to understand what it implies about users or what we should do in light of it.

-emily lynema
NCSU Libraries

Karen Coyle said...

Emily, you are absolutely right that having the data doesn't tell us WHY users do what they do and what it means. What astonishes me, however, is that we are building catalogs without even knowing WHAT they do. The example that I gave on the NGC4LIB list, that no matter how many items were on a screen our users looked at 2.5 screens on average, is enough to let us know that there's something to be studied there. At least we know that we don't know.

tony mcmullen said...

Karen -- This is my first time commenting on your blog although I've enjoyed lurking for quite some time. I'm very interested in the type of data you describe and do run reports from time to time to glean this information from our database (Voyager). From a recent report, I learned the following:

For the months of Sept.-Dec. 2006, about 8.5% of all searches came from student dormitories. Of these searches:

· 57% were of the "keyword with relevance" variety

· only 2.5% used the "Advanced" interface

· only 3.6% used the "Boolean" search option (NOTE: others attempted to incorporate Boolean operators into their keyword queries; I did not tabulate these searches (yet)).

Karen Coyle said...

Thanks, Tony. I've received some "off list" comments as well, so I'm going to see if I can merge the information into some useful form. If/when I do, I'll probably post it to the futurelib wiki. From that, maybe we can figure out an easy reporting mechanism so that we can get a broader view of user behavior.