The typical situation is that you have a database with your data. Searches go against that database, the results are extracted, a program formats these results into a web page, and the page is sent to the screen. Let's say that your database has data about authors, titles and dates. These are stored in your database in a way that you know which is which. A search is done, and let's say that the results of the search are:
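author: Williams, R.
title: History of the industrial sewing machine
date: 1996
This is where you are in your data flow: the search has run, and the database has handed back results whose fields are still labeled.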
The next thing that happens (and remember, I'm speaking very generally) is that the results are fed into a program that formats them into HTML, probably within a template that has all your headers, footers, sidebars and branding, and sends the page to the browser. The flow now ends with a display on the user's screen that looks like:
Williams, R. History of the industrial sewing machine. 1996.
Without any fancy formatting, the HTML for this is:
<p>Williams, R. History of the industrial sewing machine. 1996.</p>
Now we can see the problem that schema.org is designed to fix. You started with an author, a title and a date, but what you are showing to the world is an undifferentiated string of characters. You have lost all the information about what these represent. To a machine, this is just another of the many bazillions of paragraphs on the web. Even if you format your data like this:
<p>Author: Williams, R.</p>
<p>Title: History of the industrial sewing machine</p>
What a machine sees is:
<p>blah: blah</p>
What we want is for the program that is formatting the HTML to also include some metadata from schema.org that retains the meaning of the data you are putting on the screen. So rather than just adding HTML formatting, it will also add markup from schema.org. Schema.org has metadata elements for many different types of data. Using our example, let's say that this is a book, and here's how you could mark that up in schema.org:
<div vocab="http://schema.org/" typeof="Book">
<span property="author">Williams, R.</span> <span property="name">History of the industrial sewing machine</span>. <span property="datePublished">1996</span>.
</div>
Again, this is a very simple example, but when we test this code in the Google Rich Snippet tool, we can see that even this very simple example has added rich information that a search engine can make use of:
The review as seen in a browser (includes schema.org markup)
The review as seen by a tool that reads the structured schema.org data.
From these you can see a couple of things. The first is that the schema.org markup does not change how your pages look to a user viewing your data in a browser. The second is that hidden behind that simple page is a wealth of rich information that was not visible before.
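To make the contrast concrete, here is a rough Python sketch of the whole round trip. Everything in it is an assumption for illustration: the record layout, the function names render_plain, render_schema and extract_properties, and the very naive extractor, which only collects the text of elements that carry a property attribute. It is not a real RDFa parser and not how any search engine actually reads a page.

```python
from html import escape
from html.parser import HTMLParser

# Hypothetical search result; the field names are illustrative only.
record = {
    "author": "Williams, R.",
    "title": "History of the industrial sewing machine",
    "date": "1996",
}


def render_plain(rec):
    """Format the record as one undifferentiated paragraph."""
    text = f"{rec['author']} {rec['title']}. {rec['date']}."
    return f"<p>{escape(text)}</p>"


def render_schema(rec):
    """Format the same record, labeling each piece with a schema.org property."""
    return (
        '<div vocab="http://schema.org/" typeof="Book">'
        f'<span property="author">{escape(rec["author"])}</span> '
        f'<span property="name">{escape(rec["title"])}</span>. '
        f'<span property="datePublished">{escape(rec["date"])}</span>.'
        "</div>"
    )


class PropertyExtractor(HTMLParser):
    """Collect the text of every element that has a `property` attribute."""

    def __init__(self):
        super().__init__()
        self.found = {}
        self._current = None  # property name of the element being read

    def handle_starttag(self, tag, attrs):
        prop = dict(attrs).get("property")
        if prop:
            self._current = prop
            self.found[prop] = ""

    def handle_data(self, data):
        # Text outside a labeled element is ignored: it is just "blah".
        if self._current:
            self.found[self._current] += data

    def handle_endtag(self, tag):
        self._current = None


def extract_properties(html):
    """Return a dict of property name to text for the given HTML string."""
    parser = PropertyExtractor()
    parser.feed(html)
    parser.close()
    return parser.found


if __name__ == "__main__":
    print(extract_properties(render_plain(record)))   # nothing to recover
    print(extract_properties(render_schema(record)))  # author, name, datePublished
```

Both renderings would look the same to a person in a browser, but only the second gives a program anything to hold on to: the plain paragraph yields an empty result, while the marked-up version gives back the author, name and date that you started with.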
Now you are probably wondering: well, what's that going to do for me? Who will use it? At the moment, the users of this data are the search engines, and they use the data to display all of that additional information that you see under a link:
In this snippet, the information about stars, ratings, type of film and audience comes from schema.org markup on the page.
Because the data is there, many of us think that other users and uses will evolve. The flip side, of course, is that if the information isn't there, those as-yet-undeveloped possibilities cannot happen.