Tuesday, January 10, 2023

KO is KO'd

A library is intended to be a place of organized knowledge. Knowledge organization (KO) takes place in two areas: the shelf and the catalog. In this post I want to address KO in the catalog.


KO in the catalog makes use of "headings". Headings are catalog entry points, such as the title of a work or the name of an author. Library catalogs also assign topical headings to their holdings.

The "knowledge organization" of the title and author headings consists of alternate versions of those. Alternate forms can refer from an unused form (Cornwell, David John Moore) to the used one (Le Carré, John). They can also refer from one form that a searcher may use (Twain, Mark) to a related name that is also to be found in the catalog (Snodgrass, Quintus Curtius).

Subject headings are a bit more complex because they also have the taxonomic relationships of broader and narrower concepts. So a broader term (Cooking) can link to a narrower term (Desserts) in the same topic area. Subject headings also have alternate terms and related terms.

The way that this KO is intended to work is that each heading and reference is entered into the catalog in alphabetical order where the user will encounter them during a search.

Cornwell, David John Moore
    see: Le Carré, John

Twain, Mark
    see also: Snodgrass, Quintus Curtius

Cooking
    see narrower: Desserts
    see narrower: Frying
    see narrower: Menus

It may seem obvious but it is still important to note that this entire system is designed around an alphabetical browse of strings of text. The user was alerted to the alternate terms and the topical structure during the browse of cards in the card catalog, where the alternate and taxonomic entries were recorded for the user to see. Any "switching" from one term to another was done by the user herself who had to walk over to another catalog drawer and look up the term, if she so chose. The KO that existed in the catalog was evident to the user.
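
To make the mechanics concrete, here is a minimal sketch of a browse list in which headings and their references are interfiled alphabetically, so that the reference is simply there when the user arrives at that point in the list. The data structure, the function names (browse, render) and the window size are assumptions for illustration, not a description of any actual catalog system; the headings are the examples from this post.

```python
import bisect

# Hypothetical catalog entries: (heading, reference notes filed under it).
ENTRIES = [
    ("Cooking", ["see narrower: Desserts",
                 "see narrower: Frying",
                 "see narrower: Menus"]),
    ("Cornwell, David John Moore", ["see: Le Carré, John"]),
    ("Le Carré, John", []),
    ("Snodgrass, Quintus Curtius", []),
    ("Twain, Mark", ["see also: Snodgrass, Quintus Curtius"]),
]


def browse(entries, start, window=3):
    """Return up to `window` entries beginning at the first heading >= start."""
    ordered = sorted(entries, key=lambda entry: entry[0].casefold())
    keys = [heading.casefold() for heading, _ in ordered]
    position = bisect.bisect_left(keys, start.casefold())
    return ordered[position:position + window]


def render(entries):
    """Lay out headings with their references indented beneath them."""
    lines = []
    for heading, notes in entries:
        lines.append(heading)
        lines.extend("    " + note for note in notes)
    return "\n".join(lines)


# The user opens the drawer at "Corn" and sees the reference in passing.
drawer = browse(ENTRIES, "Corn", window=2)
print(render(drawer))
```

Nothing is switched on the user's behalf here. The reference is displayed alongside the headings, and following it is the user's choice.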


A database creates the ability to search rather than browse. A database search plucks precise elements from storage in response to a query and delivers them to the questioner. The "random access" of that technology has all but eliminated the need to find information through alphabetical order. Before the database there was no retrieval in the sense that we mean today, retrieval where a user is given a finite set of results without intermediate steps on their part. Instead, yesterday's catalog users moved around in an unlimited storehouse of relevant and non-relevant materials from which they had to make choices.

In the database environment, the user does not see the KO that may be provided. Even if the system does some term-switching from unused to used terms, the searcher is given the result of a process that is not transparent. Someone searching on "Cornwell, David" will receive results for the name "John Le Carré" but no explanation of why that is the case. Less likely is that a search on "Twain, Mark" will lead the searcher to the works that Twain wrote under the additional alias of "Snodgrass, Quintus Curtius" or that the search on "Cooking" will inform the user that there is a narrower heading for "Menus." A precise retrieval provides no context for the result, and context is what knowledge organization is all about.

Answering a question is not a conversation. The card catalog engaged the user in at least a modicum of conversation as it suggested entry headings other than the ones being browsed. It is even plausible that some learning took place as the user was guided from one place in the list to another. None of that is intended or provided with the database search.

KW is especially not KO

The loss of KO is exacerbated with keyword searching. While one might be able to link a reference to a single-word topic or to a particular phrase, such as "cookery" to "cooking," individual words that can appear anywhere in a heading are even further removed from any informational context. A word like "solar" ("solar oscillations", "solar cars", "orbiting solar observatories") or "management" ("wildlife management", "time management", "library catalog management") is virtually useless on its own, and the items retrieved will be from significantly different topic areas.
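
A small sketch may make the contrast visible. It assumes a toy list of headings (the ones quoted above) and two hypothetical functions: keyword_search, which matches a word anywhere in a heading, and left_anchored, which matches only from the start of the heading, as an alphabetical browse would. Neither is meant to represent how any particular catalog system is implemented.

```python
# Toy list of headings, taken from the examples in the paragraph above.
HEADINGS = [
    "Solar oscillations",
    "Solar cars",
    "Orbiting solar observatories",
    "Wildlife management",
    "Time management",
    "Library catalog management",
]


def keyword_search(headings, word):
    """Return every heading that contains `word` as a whole word, anywhere."""
    target = word.casefold()
    return [h for h in headings if target in h.casefold().split()]


def left_anchored(headings, prefix):
    """Return only the headings that begin with `prefix`, alphabetically."""
    start = prefix.casefold()
    return sorted(h for h in headings if h.casefold().startswith(start))


solar_hits = keyword_search(HEADINGS, "solar")
management_hits = keyword_search(HEADINGS, "management")
solar_browse = left_anchored(HEADINGS, "solar")

print(solar_hits)
print(management_hits)
print(solar_browse)
```

The keyword search on "management" returns wildlife, time and library catalogs as if they were one result, and nothing in that result says they belong to different topic areas.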

Keyword searching is very popular because, as one computer science student once told me, "I always get something." The controversy today over mis-information is around the fact that "something" is a context-free deliverable. In libraries, keyword searching helps users retrieve items with complex headings, but the resulting resources may be so different from one another that the retrieved set resembles a random selection from the catalog. Note, too, that even the sophisticated search engines are unable to inform their users that broader and narrower topics exist, nor can they translate from words to topics. Words are tools to express knowledge, but keywords are only fragments of knowledge.

21st Century Goals

I would like to suggest a goal for 21st century librarians, and that is a return to knowledge organization. I don't know how it can be done, but it is essential to provide this as a service to library users who are poorly served by the contextless searches in today's library catalogs. Accomplishing this with computer and database technology will probably not involve the card catalog's technique of heading assignment. Users might enter the library through a topic map of some type, perhaps. I really don't know. I do know that educating users will be a big hurdle; the facility of typing a few words and getting "something" will be hard to overcome in a world where quick bits of information are not only the norm but all that some generations have ever known. A knowledge system has to be demonstrably better, and that's a tall order.


Shawne Miksa said...

We still train new catalogers/metadata specialists--or perhaps I should just say, "I still teach this in my cataloging course"-- to create the context of which you speak. The most immediate barrier to that context actually being used is the inability of the information retrieval system to show the user that context on the screen in some way other than an alphabetical list of access points (names, subjects, titles, etc.). The unimaginativeness of user interface design has always been puzzling. What percentage of the information we transcribe and encode is actually made use of by the system? This question has always been on my mind, starting with my very first conversations/arguments with the tech people of a major bibliographic vendor. I wanted certain MARC fields to be labeled correctly and also to be searchable, but the IT people 'pshawed' the idea and refused. While I am not 100% up on current interfaces, it sounds like perhaps we have the same problem. Lots of contextualized data but no good ways to show it to the searcher. I remember some catalog systems years ago that played around with visual displays--Aqua-something?--but haven't heard much since.

It was recently suggested to me that university students/users, at the very least, don't understand how to do complicated searching--that they aren't experts and therefore to have something like 'authority control' is not useful. I'm still reeling from the suggestion. Have we come to the point where we don't even think about training the user via bibliographic instruction, etc. in order to help them develop their searching skills? Are we dumbing down our systems even more because we make big assumptions about how users search for and understand the information they retrieve?

I agree that educating users is a hurdle, but it's not impossible. People can be quite adaptive. It's not too late to reassert some productive information behaviors.


Karen Coyle said...

Shawne, if there is any fault among the users it is that they have been raised on keyword searching online. However, searching on Wikipedia is left-anchored and I haven't heard that anyone finds that to be unnatural. Admittedly, Wikipedia titles are less complex than our subject headings, but maybe that's a clue for us. I think that the criticism of FAST (that it leads to false drops) is valid, but I see no reason why subject headings could not be presented with a permutation of subheadings, where:

Boats and boating--Alabama--Harris, Lake--Maps

would appear in the user interface as:

Boats and boating--Alabama--Harris, Lake--Maps
Alabama--Harris, Lake--Maps--Boats and boating
Harris, Lake--Maps--Boats and boating--Alabama
Maps--Boats and boating--Alabama--Harris, Lake

Like FAST it may be necessary to make some decisions that are not purely rotation of parts of the subject heading, and I hope that would be minor.
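
The pure-rotation part of this idea can be sketched in a few lines. This is only an illustration under stated assumptions: the function names (rotations, build_browse_index) are hypothetical, the "--" separator is the one shown in the example above, and the sketch makes none of the non-rotation decisions mentioned in the previous paragraph.

```python
def rotations(heading, separator="--"):
    """Return every rotation of a subject heading's parts, original first."""
    parts = heading.split(separator)
    return [separator.join(parts[i:] + parts[:i]) for i in range(len(parts))]


def build_browse_index(headings, separator="--"):
    """Pair each rotated form with its original heading, sorted for browsing."""
    index = []
    for heading in headings:
        for rotated in rotations(heading, separator):
            index.append((rotated, heading))
    return sorted(index, key=lambda pair: pair[0].casefold())


original = "Boats and boating--Alabama--Harris, Lake--Maps"
rotated_forms = rotations(original)
browse_index = build_browse_index([original])

for display_form, source_heading in browse_index:
    print(display_form)
```

Each rotated form keeps a pointer back to the original heading, so a left-anchored search on "Harris, Lake" or "Maps" can still lead to the same records. The cost is visible too: the index holds one entry per part of every heading, and all of them have to be regenerated when a heading changes.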

This is not a 'no brainer' - it will complicate systems, especially in the area of update of records, and it will either add processing time or storage, or both. However, if we continue to fail our users we are putting in jeopardy the future of libraries as anything but warehouses.