Friday, September 14, 2012

Rich snippets

At the recent Dublin Core annual meeting I heard Dan Brickley talk about Google's use of schema.org for rich snippets. Schema.org is commonly thought of as a tool for "search engine optimization" (SEO), which to most people means "how to get onto the first page of Google search results." But the microdata in web sites can also be used to make the snippets shown more useful by incorporating more information from the web page. The examples above, from the Google rich snippets page, show features like ratings as well as links to actual content within the web page.

Now that WorldCat has schema.org markup, my first thought was: what kind of rich snippet would be good for library data? There is a rich snippet testing tool where you can plug in a URL and see 1) the snippet and 2) what microdata is visible to Google. You can plug in a WorldCat permalink and see what the rich snippet result is:

http://www.worldcat.org/oclc/874206  (opens in separate window)

There is no rich snippet displayed here, which tells us that Google hasn't yet developed a rich snippet model for our kind of data. But you can see, in great detail, all of the coded data that is available. (The red warnings indicate that there is data in the OCLC microdata that isn't part of schema.org. OCLC is talking to the schema.org developers to incorporate new elements, some of which show up as warnings here.)
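To get a feel for what "the coded data that is available" means, here is a minimal sketch of pulling microdata properties out of a page with Python's standard library. The sample markup, the `MicrodataParser` class and the `extract_microdata` function are my own illustrative assumptions; the HTML is not copied from a WorldCat record, and this is not how Google's testing tool works internally. It only handles flat, un-nested items.

```python
from html.parser import HTMLParser

# Hypothetical sample markup (an assumption for illustration only,
# not taken from an actual WorldCat page).
SAMPLE_HTML = """
<div itemscope itemtype="http://schema.org/Book">
  <h1 itemprop="name">An Example Title</h1>
  <span itemprop="author">Example Author</span>
  <span itemprop="publisher">Example Press</span>
  <meta itemprop="isbn" content="0000000000">
</div>
"""


class MicrodataParser(HTMLParser):
    """Collect (itemprop, value) pairs from flat microdata markup."""

    def __init__(self):
        super().__init__()
        self.item_type = None   # value of the first itemtype seen
        self.properties = []    # list of (property name, value) tuples
        self._pending = None    # itemprop still waiting for its text

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        # Record the type of the first item we encounter.
        if "itemscope" in attrs and self.item_type is None:
            self.item_type = attrs.get("itemtype")
        prop = attrs.get("itemprop")
        if prop is None:
            return
        if "content" in attrs:
            # e.g. <meta itemprop="isbn" content="...">
            self.properties.append((prop, attrs["content"]))
        else:
            # The value is the element's text, which arrives in handle_data.
            self._pending = prop

    def handle_data(self, data):
        text = data.strip()
        if self._pending and text:
            self.properties.append((self._pending, text))
            self._pending = None


def extract_microdata(html):
    """Return (item type, list of properties) for a flat microdata item."""
    parser = MicrodataParser()
    parser.feed(html)
    parser.close()
    return parser.item_type, parser.properties


item_type, properties = extract_microdata(SAMPLE_HTML)
print(item_type)
for name, value in properties:
    print(f"{name}: {value}")
```

A real page would have nested items and many more properties, so a dedicated microdata library would be a better choice for anything beyond a quick look.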

I began to think about how I would like this data used. It could be used to format a more bibliographic-like display, adding author, publisher, pagination. The ISBN could of course link to key online bookstores. (That would also bring in revenue for Google, so might be a popular choice for the search engine.) But what about libraries? How could rich snippets help libraries and library users?

The snippet could lead back to WorldCat where the user could find a nearby library, but... wait! Google often knows your approximate location, and WorldCat knows whether libraries in your area have the book. AND the library catalog often has information about availability. I don't know how this data would interact with the WorldCat tool, but here's what I would like to see in the snippet:


This definitely goes beyond what "rich snippet" means today, but is not inconsistent with retrievals that pull data from multiple online sales outlets. In the sales model, Google's assumption is that the searcher wants to obtain (in FRBR-speak) the item, and therefore various outlets that could provide it are listed. This same logic could apply to libraries, of course. Libraries are a local source of many of the same things that are sold online, so the obtain logic fits.
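As a rough sketch of that "obtain" logic, the following Python treats nearby libraries and online stores as the same kind of thing: sources for the item. Everything here is hypothetical: the `Source` record, the `obtain_lines` function, the 25 km radius and the sample data are assumptions of mine for illustration, not anything Google or WorldCat provides.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Source:
    """A place where the item could be obtained (hypothetical record)."""
    name: str
    kind: str                     # "library" or "store"
    distance_km: Optional[float]  # None for online-only sources
    available: bool               # on the shelf (library) or in stock (store)


def obtain_lines(sources, max_km=25.0):
    """Build display lines: nearby libraries first (closest first), then stores."""
    libraries = sorted(
        (
            s for s in sources
            if s.kind == "library"
            and s.distance_km is not None
            and s.distance_km <= max_km
        ),
        key=lambda s: s.distance_km,
    )
    stores = [s for s in sources if s.kind == "store"]

    lines = []
    for s in libraries:
        status = "available" if s.available else "checked out"
        lines.append(f"{s.name} ({s.distance_km:.1f} km): {status}")
    for s in stores:
        lines.append(f"{s.name}: buy online")
    return lines


# Made-up sample data, purely for illustration.
SAMPLE_SOURCES = [
    Source("Store A", "store", None, True),
    Source("Far Library", "library", 80.0, True),
    Source("Branch Library", "library", 3.2, False),
    Source("Main Library", "library", 1.0, True),
]

for line in obtain_lines(SAMPLE_SOURCES):
    print(line)
```

The point of the single `Source` type is just that, from the searcher's side, a library copy and a copy for sale answer the same question.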

This analysis of mine obviously ignores the question of whether Google has any economic incentive to show library holdings, especially since they would be seen as competing with sales. I'm just dreaming here, doing the "what if" thing without the practical limitations.





