Monday, May 21, 2012

Google goes semantic

In a long-awaited move [1], Google has announced that its search will now be "semantic." They don't actually mean "semantic" in the sense of the semantic web, although there are similarities. While what Google is doing may not formally follow the W3C standards for the semantic web, there is no doubt that they are performing acts of "data linking" that make use of the concepts of linked data. The W3C standards for linked data are designed for openness, so that data from disparate communities can come together. Google has no obligation to play well with others and, as we saw with the development of, is in a position to make its own rules, many of which are known only within the giant Google-verse. They call their technology a "knowledge graph" and talk about "things not strings." I've used this same phrase myself in numerous presentations on linked data.

Google has always been about using links between things on the web to determine its brand of "relevance" of a web resource to a search query. By using existing linked data, via large stores of links like Dbpedia, Wikipedia, Freebase, and presumably others, Google can now expand its offerings from a single list of results to additional information about the topic that might be the intended topic of the searcher. I say "might be" without any irony; whether in a web search engine or a library catalog, the communication between the searcher's mind and the device that provides results is always only approximate. What the additional data provides is not only more context but a more ample explanation of the topics that have been retrieved. No longer do users have to guess from snippets the meaning of the results in the result set, but they can see a Wikipedia-like entry that not only gives them more information, but it contains links to other sources of information of the topic.
"Knowledge Graph" result

"Knowledge graph" detail

At a meeting of the Northern California Technical Services Group in Berkeley last Friday, I said to the group:

Imagine that you have an 18-year-old user who finds a novel on your library's shelf by Oliver North. The user looks up the author in  your catalog and sees that this person has written a few other books, but oddly always with a "co-author." Is someone so inept worth reading? Now imagine that your catalog also presents the user with the context: Ollie North, Iran Contra, and related persons. Suddenly the user sees where North fits into US history, has a chance to find out what an interesting character he is, and the books take on a whole new meaning.

That was before I saw this Google result.

We treat library users as if they are all-knowing; as if they know each author in our catalog, as if the title of the book and the number of pages is sufficient for them to decide if it is a good read or has the information they need. This is so obviously false that I am at a loss to explain how we continue to work under this illusion.

[1] Google purchased the only linked-dated search system, Freebase, in July of 2010, thus tipping their hand that they were moving in that direction. Not only did they acquire Freebase and the skills of its employees, they eliminated a potential rival (although it may be silly to consider that anyone could really be a rival to Google).

No comments: