Saturday, October 18, 2008

The Semantics of Semantic

I've always had a hard time with the Semantic Web because it didn't appear to me to be semantic at all. Thus I started calling it the Syntactic Web since it seemed to be mainly about structure, more like diagramming sentences than having a conversation. I now think I understand why that is.

If you are like me, you assume the term "semantic" is about meaning, and in particular the meaning in words and language. That's what you'll find in the dictionary. It is only recently that I learned that there is another use of the term "semantic," and that is in an area of mathematics called "formal semantics". Basically, formal semantics is about formal languages, such as those of mathematics and programming. You can perform operations, often called "inferences," based on the rules of a formal language. This is an example:

A ☠ B
B ☠ C
therefore, A ☠ C

Given a rule that says the relationship "☠" carries over from one pair to the next (that is, that it is transitive), the conclusion is true even though A, B and C are black boxes, as is the relationship "☠" itself. The statement after "therefore" can be calculated without ever considering that A, B or C have any meaning beyond being the symbols A, B and C.
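To make that concrete, here is a minimal sketch in Python of how a machine could compute the "therefore" line. It assumes the only rule in play is that "☠" is transitive; the function name transitive_closure is made up for this illustration, and the symbols are plain strings that the code never interprets.

```python
def transitive_closure(facts):
    """Apply one assumed rule until nothing new appears:
    if (x, y) and (y, z) are facts, then (x, z) is a fact too."""
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for (x, y) in list(derived):
            for (y2, z) in list(derived):
                if y == y2 and (x, z) not in derived:
                    derived.add((x, z))
                    changed = True
    return derived


# "A ☠ B" and "B ☠ C", written as pairs of opaque symbols.
facts = {("A", "B"), ("B", "C")}
closure = transitive_closure(facts)

# The conclusion "A ☠ C" is derived purely by symbol matching.
print(("A", "C") in closure)
```

Nothing in the code knows what A, B, C or "☠" stand for. Swap in any other strings and the same conclusion falls out, which is exactly the sense in which this kind of "meaning" is mechanical.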

This is a whole different meaning of "meaning" compared to the meaning of words in the human language sense. This is the kind of meaning that works with machines and algorithms and is therefore quite suited to automation. And it is this meaning of semantic that is meant for the Semantic Web.

Well, no wonder those of us who aren't mathematicians (or more specifically, involved in the use of formal languages) have been confused by the Semantic Web! It isn't semantic at all in the human language sense; it is mainly about structure and syntax. It's a shame that its developers used the confusing term "semantic" in its name. Although not incorrect, it is definitely a minority view of the meaning (that is, the semantics) of the term, and leads to confusion.

With this new knowledge, I would characterize the semantic web as a basic structure within which one could insert human-meaningful data, and a set of rules that make that structure operational in a computing environment. It greatly resembles the structure of simple human utterances, like "Moby Dick is the title of this book," although the semantic web would express this something like:

URI:abcd (has relation) URI:1234 (with) URI:xxzz

While Semantic Web devotees may be enthusiastic about that statement, most of us are going to get more out of: "Moby Dick is the title of this book." In other words, we only connect with the statement when it has human-understandable meaning; the formal language alone just doesn't do it. It's rather like the difference between the architectural drawings and the actual building: architects can understand what the drawings represent, but most of us will have to wait until the building is completed and we can walk through it in order to experience what the drawings mean.
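The gap between the two forms can be sketched in a few lines of Python. This is only an illustration: the three identifiers are the placeholders from the statement above, the reading of them as thing, relation and value is my assumption, and the labels table and the render function are invented for the example rather than taken from any Semantic Web standard.

```python
# The machine-facing statement: three opaque identifiers.
# Assumed reading: (thing described, relation, value).
triple = ("URI:abcd", "URI:1234", "URI:xxzz")

# Hypothetical human-readable labels, made up for this illustration.
labels = {
    "URI:abcd": "this book",
    "URI:1234": "has the title",
    "URI:xxzz": "Moby Dick",
}


def render(statement, names):
    """Replace each identifier with its label when one is known."""
    return " ".join(names.get(term, term) for term in statement)


print(render(triple, labels))  # this book has the title Moby Dick
print(render(triple, {}))      # URI:abcd URI:1234 URI:xxzz
```

The structure is identical in both printed lines; only the second table of names turns it into something a person can read.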

Now I have to ask myself what to do with this knowledge. I don't think it makes sense to require everyone to be an architect in order to walk through a building, and I know for sure that all of us can speak in sentences even if we aren't experts in linguistics. So it should be possible to interact with, nay even be an active creator of, the semantic web without being conversant with the field of formal semantics. At the moment I don't know how that's going to work, but I do know that if it doesn't work that way the Semantic Web isn't going very far. It just has to become wysiwim - what you see is what I mean. I suggest something Pipes-like might do the trick.