The particular problem that I have in
mind is the disconnect between library data and library systems in
relation to the category of metadata that libraries call "headings."
Headings are the strings in the library data that represent those
entities that would be entry points in a linear catalog like a card
catalog.
It pains me whenever I observe
cataloger discussions on the proper formation of headings for
the items being cataloged. The pain point is that I know that
the value of those headings is completely lost in the library systems
of today, and that countless hours of skilled
cataloger time are therefore being wasted.
The Heading
Both book and card catalogs were
catalogs of headings. The catalog entry was a heading followed by
one or more bibliographic entries. Unfortunately, the headings serve
multiple purposes. This is generally not good data practice, but it stems from the need for parsimony when library data was analog, as in book and card catalogs. The purposes are:
- A heading is a unique character string for the "thing" – the person, the corporate body, the family – essentially an identifier.
Tolkien, J. R. R. (John Ronald Reuel), 1892-1973
- It supports the selection of the entity in the catalog from among the choices that are presented (although in some cases the effectiveness of this is questionable)
- It is an access point, intended to be the means of finding, within the catalog, those items held by the library that meet the need of the user.
- It provides the sort order for the catalog entries (which is why you see inverted forms like "Tolkien, J. R. R.")
United States. Department of State. Bureau for Refugee Programs
United States. Department of State. Bureau of Administration
United States. Department of State. Bureau of Administration and Security
United States. Department of State. Bureau of African Affairs
- That sort order, and those inverted headings, also have a purpose of collocation of entries by some measure of "likeness"
Tolkien, J. R. R. (John Ronald Reuel), 1892-1973
Tolkien Society
Tolkien Trust
The last three functions, providing a sort order, access, and collocation, have been lost in the online
catalog. The reasons for this are many, but the main explanation is
that keyword searching has replaced alphabetical browse as a way to
locate items in a library catalog.
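The difference between the two kinds of access can be sketched in a few lines of Python. This is only an illustration using the example headings above: the function names `browse` and `keyword` are made up for the sketch, and plain case-insensitive string order stands in for real library filing rules, which treat punctuation and spacing differently.

```python
import re
from bisect import bisect_left

# Headings copied from the examples above, kept in one sorted list.
# ASSUMPTION: plain case-insensitive string order; real filing rules differ.
HEADINGS = sorted(
    [
        "Tolkien, J. R. R. (John Ronald Reuel), 1892-1973",
        "Tolkien Society",
        "Tolkien Trust",
        "United States. Department of State. Bureau for Refugee Programs",
        "United States. Department of State. Bureau of Administration",
        "United States. Department of State. Bureau of Administration and Security",
        "United States. Department of State. Bureau of African Affairs",
    ],
    key=str.casefold,
)
KEYS = [h.casefold() for h in HEADINGS]


def browse(prefix):
    """Left-anchored browse: the contiguous run of headings starting with prefix."""
    p = prefix.casefold()
    start = bisect_left(KEYS, p)  # jump to the first possible match
    run = []
    for heading in HEADINGS[start:]:
        if not heading.casefold().startswith(p):
            break  # the list is sorted, so no later heading can match
        run.append(heading)
    return run


def keyword(term):
    """Keyword search: every heading containing the word, wherever it occurs."""
    t = term.casefold()
    return [h for h in HEADINGS if t in re.findall(r"\w+", h.casefold())]


# Browse returns neighbors in the sort order; keyword ignores the order entirely.
print(browse("United States. Department of State. Bureau of"))
print(keyword("administration"))
```

The browse depends on the left-anchored, sortable form of the string. The keyword search works just as well on any string that happens to contain the word, which is why the careful construction of the heading adds nothing to it.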
The upshot is that many hours are spent
during the cataloging process to formulate a left-anchored,
alphabetically orderable heading that has no function in
library catalogs other than as fodder for a context-less keyword
search.
Once a keyword search is done the
resulting items are retrieved without any correlation to headings. It
may not even be clear which headings one would use to create a useful
order. The set of retrieved bibliographic resources from a single
keyword search may not provide a coherent knowledge graph. Here's an
illustration using the keyword "darwin":
Gardiner, Anne.
Melding of two spirits : from the "Yiminga" of the Tiwi to the "Yiminga" of Christianity / by Anne Gardiner ; art work by
Darwin : State Library of the Northern Territory, 1993.
Christianity--Australia--Northern Territory.
Tiwi (Australian people)--Religion.
Northern Territory--Religion.

Crabb, William Darwin.
Lyrics of the golden west. By W. D. Crabb.
San Francisco, The Whitaker & Ray company, 1898
West (U.S.)--Poetry.

Darwin, Charles, 1809-1882.
Origin of species by means of natural selection; or, The preservation of favored races in the struggle for life and The descent of man and selection in relation to sex, by Charles Darwin.
New York, The Modern library [1936]
Evolution (Biology)
Natural selection.
Heredity.
Human evolution.

Bear, Greg, 1951-
Darwin's radio / Greg Bear.
New York : Ballantine Books, 2003.
Women molecular biologists--Fiction.
DNA viruses--Fiction.
No matter what you would choose as a
heading on which to order these, it will not produce a sensible
collocation that would give users some context to understand the
meaning of this particular set of items – and that is because there
is no meaning to this set of items, just a coincidence of things
named "Darwin."
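To make the coincidence concrete, here is a small Python sketch that runs the keyword "darwin" against abbreviated versions of those four records and reports which field produced the match. The field names (`author`, `title`, `published`, `subjects`) and the helper `matching_fields` are assumptions made for the sketch, not the structure of any real catalog record, and the titles are shortened.

```python
import re

# The four records from the "darwin" illustration, abbreviated.
# ASSUMPTION: the field names are invented for this sketch.
RECORDS = [
    {
        "author": "Gardiner, Anne",
        "title": "Melding of two spirits",
        "published": "Darwin : State Library of the Northern Territory, 1993",
        "subjects": ["Tiwi (Australian people)--Religion", "Northern Territory--Religion"],
    },
    {
        "author": "Crabb, William Darwin",
        "title": "Lyrics of the golden west",
        "published": "San Francisco, The Whitaker & Ray company, 1898",
        "subjects": ["West (U.S.)--Poetry"],
    },
    {
        "author": "Darwin, Charles, 1809-1882",
        "title": "Origin of species by means of natural selection",
        "published": "New York, The Modern library [1936]",
        "subjects": ["Evolution (Biology)", "Natural selection", "Heredity"],
    },
    {
        "author": "Bear, Greg, 1951-",
        "title": "Darwin's radio",
        "published": "New York : Ballantine Books, 2003",
        "subjects": ["Women molecular biologists--Fiction", "DNA viruses--Fiction"],
    },
]


def words(text):
    """Split text into lowercase alphabetic words."""
    return set(re.findall(r"[a-z]+", text.casefold()))


def matching_fields(record, term):
    """Return the names of the fields in which the keyword occurs."""
    hits = []
    for field, value in record.items():
        text = " ".join(value) if isinstance(value, list) else value
        if term.casefold() in words(text):
            hits.append(field)
    return hits


for record in RECORDS:
    print(record["author"], "->", matching_fields(record, "darwin"))
```

The keyword matches a place of publication, a middle name, an author's surname, and a word in a title. Only one of the four matches has anything to do with a heading for a person named Darwin.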
Headings that have been chosen to be
controlled strings should offer a more predictable user search
experience than free text searching, but headings do not necessarily
provide collocation. As an example, Wikipedia uses the names of its
pages as headings, and there are some rules (or at least preferred
practices) to make the headings sensible. A search in Wikipedia is a
left-to-right search on a heading string, presented as a
drop-down list of a handful of headings that match the search string.
Included in the headings in the
drop-down are "see"-type terms that, when selected, take the
user directly to the entry for the preferred term. If there is no single
preferred term, Wikipedia directs users to disambiguation pages to
help them select among similar headings.
The Wikipedia pages, however, only provide accidental collocation, not the more comprehensive collocation that libraries aim to attain. That library-designed collocation, however, is also the source of the inversion of headings, which makes those strings unnatural and unintuitive for users. Although the library headings are admirably rule-based, they often use rules that will not be known to many users of the catalog, such as the difference in name headings with prepositions based on the language of the author. To search on these names, one therefore needs to know the language of the author and the rule that is applied to that language, something that we can safely assume is not common knowledge among catalog users.
De la Cruz, Melissa
Cervantes Saavedra, Miguel de
I may be the only patron of my small library branch who knows to look for the mysteries by Icelandic author Arnaldur Indriðason under "A" not "I".
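The dependence on language can be shown with a toy Python sketch built around the two preposition examples above. It assumes the first example reflects an English-language rule (prefix kept in front of the surname) and the second a Spanish-language one (preposition moved to the end). The table `PREFIX_PLACEMENT` and the function `name_heading` are hypothetical and deliberately oversimplified; actual cataloging rules have many more cases per language.

```python
# Toy rule table. ASSUMPTION: a deliberate oversimplification covering only
# the two examples above; it is not a statement of the actual cataloging rules.
PREFIX_PLACEMENT = {
    "English": "before surname",
    "Spanish": "after given name",
}


def name_heading(given, prefix, surname, language):
    """Build an inverted name heading, placing the prefix according to language."""
    if PREFIX_PLACEMENT[language] == "before surname":
        return f"{prefix} {surname}, {given}"
    return f"{surname}, {given} {prefix}"


print(name_heading("Melissa", "De la", "Cruz", "English"))
print(name_heading("Miguel", "de", "Cervantes Saavedra", "Spanish"))
```

The function cannot produce a heading without the `language` argument, which is exactly the piece of knowledge a catalog user is unlikely to bring to the search.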
What Is To Be Done?
There isn't an easy answer, and perhaps not
even a hard one. As long as humans use words to describe their
queries we will have the problem that words and concepts, and words
and relationships between concepts, do not neatly coincide.
I see a few techniques that might be
used if we wish to save collocation by heading. One would be to allow
keyword searching but for the system to use that to suggest headings
that then can be used to view collocated works. Some systems do allow
users to retrieve headings by keyword, but headings, which are very
terse, are often not self-explanatory without the items they describe. A
browse of headings alone is much less helpful than the association of
the heading with the bibliographic data it describes. Remember that
headings were developed for the card catalog where they were printed
on the same card that carried the bibliographic description.
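One way to picture the first technique is the following Python sketch: a keyword retrieves headings, and each suggested heading is shown together with the titles filed under it, so that the terse string is not left to explain itself. The miniature `CATALOG` reuses abbreviated data from the "darwin" records, and the names `BY_HEADING` and `suggest` are invented for the illustration. It is a sketch under those assumptions, not a description of how any existing system works.

```python
import re
from collections import defaultdict

# Hypothetical miniature catalog: (title, controlled headings) pairs,
# abbreviated from the "darwin" records shown earlier.
CATALOG = [
    ("Melding of two spirits",
     ["Gardiner, Anne", "Tiwi (Australian people)--Religion", "Northern Territory--Religion"]),
    ("Lyrics of the golden west",
     ["Crabb, William Darwin", "West (U.S.)--Poetry"]),
    ("Origin of species by means of natural selection",
     ["Darwin, Charles, 1809-1882", "Evolution (Biology)", "Natural selection"]),
    ("Darwin's radio",
     ["Bear, Greg, 1951-", "DNA viruses--Fiction"]),
]

# Map each heading to the titles filed under it.
BY_HEADING = defaultdict(list)
for title, headings in CATALOG:
    for heading in headings:
        BY_HEADING[heading].append(title)


def suggest(term):
    """Return (heading, titles) pairs for headings containing the keyword."""
    t = term.casefold()
    return [
        (heading, BY_HEADING[heading])
        for heading in sorted(BY_HEADING, key=str.casefold)
        if t in re.findall(r"[a-z]+", heading.casefold())
    ]


for heading, titles in suggest("darwin"):
    print(heading)
    for title in titles:
        print("    " + title)
```

In this sketch "Darwin's radio" is not suggested for "darwin", because the word occurs only in its title and not in any of its headings. Whether that is the behavior a user would want is part of the open question.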
Another possible area of investigation
would be to look to the classified catalog, a technique that has
existed alongside alphabetical catalogs for centuries. The Decimal
Classification of Dewey was a classified approach to knowledge with a
language-based index (his "Relativ Index") to the classes.
(It is odd that the current practice in US libraries is to have one
classification for items on shelves and an unrelated heading system,
LCSH, for subject access.)
The classification provides the
intellectual collocation that the headings themselves do not provide.
The difficulty with this is that the classification collocates
topically but, at least in its current form, does not collocate the
name headings in the catalog that identify people and organizations
as entities.
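The shape of a classified catalog with a language-based index can be sketched in Python as follows. Everything in it is an assumption made for illustration: the class notation ("B", "B1", "B11") is invented and is not real Dewey notation, all but one of the titles are placeholders, and using a string prefix to stand for "is a subclass of" is a shortcut that a real scheme would not rely on.

```python
# Hypothetical class notation invented for this sketch; NOT real Dewey numbers.
SCHEDULE = {
    "B": "Biology",
    "B1": "Evolution",
    "B11": "Natural selection",
    "B2": "Heredity",
    "R": "Religion",
}

# Language-based index: everyday terms point into the classification.
INDEX = {
    "biology": "B",
    "evolution": "B1",
    "natural selection": "B11",
    "heredity": "B2",
    "religion": "R",
}

# (class, title) pairs. Only the first title comes from the text above;
# the others are placeholders.
SHELF = [
    ("B11", "Origin of species by means of natural selection"),
    ("B1", "(placeholder) A general work on evolution"),
    ("B2", "(placeholder) A work on heredity"),
    ("R", "(placeholder) A work on religion"),
]


def lookup(term):
    """Go from a word to its class, then list that class and its subclasses in order."""
    cls = INDEX[term.casefold()]
    return [(c, SCHEDULE[c], title) for c, title in sorted(SHELF) if c.startswith(cls)]


for entry in lookup("evolution"):
    print(entry)
```

The words serve only as a way in. The order and the neighbors come from the classification, which is why a search on "evolution" also brings up the work classed under natural selection. Nothing in this structure gives names of people or organizations a place in the order, which is the difficulty noted above.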
Conclusion (sort of)
Controlled headings as access points
for library catalogs could provide better service than keyword search
alone. How to make use of headings is a difficult question. The first
issue is how to exploit the precision of headings while still
allowing users to search on any terms that they have in mind. Keyword
search is, from the user's point of view, frictionless: users don't
have to think, "What string would the library have used for
this?"
Collocation of items by topical
sameness or other relationships (e.g. "named for",
"subordinate to") is possibly the best service that
libraries could provide, although it is very hard to do this through
the mechanism of language strings. Dewey's original idea of a
classified order with a language-based index is still a good one,
although classifications are hard to maintain and hard to assign.
If challenged to state what I think the
library catalog should be, my answer would be that it should provide
a useful order that illustrates one or more intellectual contexts
that will help the user enter and navigate what the library has to
offer. Unfortunately I can't say today how we could do that. Could we
think about that together?
Readings
Dewey, Melvil. Decimal classification
and relativ index for libraries, clippings, notes, etc. Edition 7.
Lake Placid Club, NY., Forest Press, 1911.
https://archive.org/details/decimalclassifi00dewegoog
Shera, Jesse H., Margaret E. Egan, and Jeannette M. Lynn. The Classified Catalog: Basic Principles and Practices. Chicago, Ill.: American Library Association, 1956.