Saturday, October 18, 2008

The Semantics of Semantic

I've always had a hard time with the Semantic Web because it didn't appear to me to be semantic at all. Thus I started calling it the Syntactic Web since it seemed to be mainly about structure, more like diagramming sentences than having a conversation. I now think I understand why that is.

If you are like me, you assume the term "semantic" is about meaning, and in particular the meaning in words and language. That's what you'll find in the dictionary. It is only recently that I learned that there is another use of the term "semantic" and that is in an area of mathematics called "formal semantics". Basically, formal semantics is about formal languages, such as those of mathematics and programming. You can perform operations, often called "inferences," based on the rules of formal languages. This is an example:

A ☠ B
B ☠ C
therefore, A ☠ C

Under a suitable set of rules (for example, one that says "☠" carries over from one pair to the next), the conclusion holds even though A, B and C are black boxes, as is the relationship "☠". The statement after "therefore" can be calculated without ever considering that A, B or C have any meaning beyond being the symbols A, B and C.
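To make the "black box" point concrete, here is a rough sketch in Python of how a machine could arrive at the line after "therefore". It assumes the rule in play is transitivity (if X ☠ Y and Y ☠ Z, then X ☠ Z); that rule and the function name `infer_transitive` are my own choices for illustration, not something taken from any Semantic Web standard.

```python
def infer_transitive(facts):
    """Return every pair derivable from `facts`, ASSUMING the relation is transitive.

    Each fact is a pair of opaque symbols; the code never looks inside them.
    """
    closure = set(facts)
    changed = True
    while changed:
        changed = False
        for (a, b) in list(closure):
            for (c, d) in list(closure):
                # Chain X-Y with Y-Z to get X-Z.
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure


# The two premises from the example: A ☠ B and B ☠ C.
facts = {("A", "B"), ("B", "C")}
derived = infer_transitive(facts)
print(("A", "C") in derived)  # True
```

Nothing in the code knows what A, B, C or "☠" stand for. Swap in any other symbols and the same conclusion falls out.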

This is a whole different meaning of "meaning" compared to the meaning of words in the human language sense. This is the kind of meaning that works with machines and algorithms and is therefore quite suited to automation. And it is this meaning of semantic that is meant for the Semantic Web.

Well, no wonder those of us who aren't mathematicians (or more specifically, involved in the use of formal languages) have been confused by the Semantic Web! It isn't semantic at all in the human language sense; it is mainly about structure and syntax. It's a shame that its developers used the confusing term "semantic" in its name. Although not incorrect, it is definitely a minority view of the meaning (that is, the semantics) of the term, and leads to confusion.

With this new knowledge, I would characterize the semantic web as a basic structure within which one could insert human-meaningful data, and a set of rules that make that structure operational in a computing environment. It greatly resembles the structure of simple human utterances, like "Moby Dick is the title of this book," although the semantic web would express this something like:

URI:abcd (has relation) URI:1234 (with) URI:xxzz

While Semantic Web devotees may be enthusiastic about that statement, most of us are going to get more out of: "Moby Dick is the title of this book." In other words, we only connect with the statement when it has human-understandable meaning; the formal language alone just doesn't do it. It's rather like the difference between the architectural drawings and the actual building: architects can understand what the drawings represent, but most of us will have to wait until the building is completed and we can walk through it in order to experience what the drawings mean.
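A small Python sketch may help show the gap between the two forms. The three URIs are the made-up ones from the statement above; the `Triple` type, the `render` function, and the label table that maps each URI to English words are all hypothetical, invented here for illustration. In particular, the mapping of URI:abcd to "this book", URI:1234 to "has the title" and URI:xxzz to "Moby Dick" is my assumption about how the statement lines up with the sentence.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One statement: three opaque identifiers in a fixed order."""
    subject: str
    predicate: str
    obj: str


# The statement as the machine sees it.
statement = Triple("URI:abcd", "URI:1234", "URI:xxzz")

# Hypothetical human-readable labels (an assumption, not real data).
labels = {
    "URI:abcd": "this book",
    "URI:1234": "has the title",
    "URI:xxzz": "Moby Dick",
}


def render(triple, labels):
    """Replace each identifier with its label when one is known."""
    return " ".join(labels.get(part, part) for part in triple)


print(render(statement, {}))      # URI:abcd URI:1234 URI:xxzz
print(render(statement, labels))  # this book has the title Moby Dick
```

The structure is identical in both printed lines. Only the second one means anything to a human reader, and that meaning comes entirely from the labels, not from the triple itself.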

Now I have to ask myself what to do with this knowledge. I don't think it makes sense to require everyone to be an architect in order to walk through a building, and I know for sure that all of us can speak in sentences even if we aren't experts in linguistics. So it should be possible to interact with, nay even be an active creator of, the semantic web without being conversant with the field of formal semantics. At the moment, though, I don't know how that's going to work, but I do know that if it doesn't work that way the Semantic Web isn't going very far. It just has to become wysiwim - what you see is what I mean. I suggest something Pipes-like might do the trick.


Mark Andrews said...

The Semantic Web is not first about making the 'web more usable for humans. Its first job is to make the 'web more usable for this thing:

about which LibraryLand and VendorLand are clueless. And we're worried about Google....

Diane said...

Thanks, Karen--this is a cogent explanation of a disconnect that I've never managed to articulate. It explains a lot, doesn't it?


Alexander Johannesen said...

A slight extension to this "confusion" is that the Semantic Web, when you use all layers of their stack, *leads* to semantics as you and I know it. The first layer is indeed this formal language and syntax thing, but then as you pour on the special underused RDF and OWL stuff (OWL in itself has three layers of ontologies, just to make you cringe) you create applications that use inference to gain the semantics of that data.

Both uses of "semantics" end up in the same place, but the "Semantic Web" version has a few underlying layers that formalize it first, and yes, cause confusion and anger and frustration and ... :)

Sue Woodson said...

Oh, I get it now. Thanks for clarifying. It had always seemed odd to me before.

This version of 'semantics' only seems to deal with sentences that refer--sentences like ‘the box is red’ and ‘Sally hit John.’

I wonder what it would do with indirect language like ‘Can you pass me the salt?’ or a sentence that, when spoken, is an action ‘I bet you $5 I can jump higher than you can.’ Check out this site for an example of language that goes far beyond mere reference: (Yes, I’m from Baltimore)

You might argue that, since we’re only trying to describe items with things like subject headings and tags we only need to use referential language. But when our patrons look for material in DataVats of text, this is a topic that might come up again.

Karen Coyle said...

Sue, great comments, thanks. I hear the SW folks talk about making "statements" about things, so that's the focus. It's basically descriptive language. Then you can ask questions about those statements, like "what's the biggest lake in the world?" So it's not a substitute for human language, but an artificial language with a particular scope. I think it seems natural to folks who spend a lot of time with mathematics, computer programming, or other artificial languages, but for anyone whose concept of language is Ezra Pound, it feels very strange.

Cathy Legg said...

I think it's more that the original vision was to make the SW ordinary-language-semantic, but all they have managed to achieve (if that) is to make it formal-semantic. I discuss the issue you raise of URIs to 'namespaces' only being unique indices, and indices doing little to define meaning, in a paper "Ontologies and the Semantic Web" (ARIST, 2007), if you're interested.

And to Mark Andrews, as someone who has worked with Cyc I wouldn't say that it can understand the Web at this stage by any means!

Nice blog - thoughtful!
Cathy Legg

Anonymous said...

"When I use a word," Humpty Dumpty said in rather a scornful tone, "it means just what I choose it to mean -- neither more nor less."

Semantics in a nutshell; scholarship is knowing who did the choosing; and perhaps librarianship where to find it?

Beware of "ontology" too ... qq