I received a notice today about a conference being held by O'Reilly on digital publishing. The conference has some tutorial sessions on using XML to create digital books, but I fear that these will not include the work being done to create an XML standard for ebooks. Once again, this is proof that the East Coast (where "traditional" publishing takes place) and the West Coast (where technology happens) are very far apart.
The International Digital Publishing Forum (IDPF, once known as the Open eBook Forum) has recently announced a beta version of its e-book coding standard. I've been watching, and sometimes participating in, this group for a while, and I really think they deserve our attention and support.
To begin with, the IDPF publication structure standard (still termed OEBPS, for Open eBook Publication Structure) is designed to be used by publishers in the preparation of files that will be sent to the technology companies that transform the raw files into actual ebooks. As you know, there are dozens of e-book formats (PDF, Microsoft Reader, Mobipocket, Palm Reader, etc.). The publishers need to create a single file that can be transformed into all of those formats, and the OEBPS standard is designed to meet that need. It is also designed to be an ebook format in its own right, and the upcoming Adobe ebook reader, "Adobe Digital Editions," based on Adobe's Flash technology, will be able to display books in the OEBPS format.
The standard will seem overly simple to many people. It is that way on purpose. The original OEBPS standard used HTML, based on the assumption that even the publishers, who are notoriously lacking in technology chops, would have someone on board who knows HTML. The second version of the standard, the one out for comment, uses XHTML and CSS. I think this is brilliant. It means that 1) anyone can create a book and 2) anyone can display it, even in a simple browser. The KISS principle is essential for industry acceptance of the standard.
Another key thing to mention is that the OEBPS has been greatly influenced by members of the accessibility community who participate in the IDPF. The Digital Talking Book standard, which was first developed by the DAISY consortium and is now NISO standard Z39.86, uses an earlier version of the OEBPS as its book structure. This is the format that allows synchronization between a text and a reading of the text by a human reader, making it ideal for sighted and non-sighted readers alike (read it in bed, then continue listening in the car).
There is a DTD for the publication structure, although I am currently unable to get it to validate and behave. I have a question out to the authors of the DTD and will post here when I get an answer. Meanwhile, you can comment out the offending entity definition and play with the DTD.
A companion to the ebook standard is the Open Packaging Format (OPF). The easiest way to understand this is to take a look at it. Download Thoughts.epub.
Now open it in WinZip -- yes, it is a simple zipped file. In it you can find the raw XHTML of the publication; an OPF file that is the manifest for the package and contains Dublin Core metadata for the item; a file that contains the mimetype; any images or other files that are required by the document; and an XML document that defines the overall container. Note that this is a very simple publication. The examples in the documentation show how you would create a document with multiple chapters, cover art, and illustrations. The documentation also covers the areas of encryption and keys, for files that will be transmitted in protected formats. There is a nifty tutorial that steps you through the creation of an OCF (OEBPS Container Format) file using WinZip.
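If you would rather poke at the package with a script than with a zip tool, here is a rough Python sketch of the same walk-through. It is only an illustration: it builds a tiny stand-in package in memory instead of reading Thoughts.epub, and the file names (content.opf, chapter1.xhtml), the book title, and the helper functions are all made up for the example. The mimetype string and the XML namespaces are the ones I understand the published specifications to use; treat them as assumptions and check them against the beta documents.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Assumed namespaces; verify against the specification you are working from.
CONTAINER_NS = "urn:oasis:names:tc:opendocument:xmlns:container"
DC_NS = "http://purl.org/dc/elements/1.1/"

# Hypothetical file contents, invented purely for illustration.
CONTAINER_XML = """<?xml version="1.0"?>
<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
  <rootfiles>
    <rootfile full-path="content.opf" media-type="application/oebps-package+xml"/>
  </rootfiles>
</container>
"""

OPF_XML = """<?xml version="1.0"?>
<package xmlns="http://www.idpf.org/2007/opf" version="2.0" unique-identifier="bookid">
  <metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
    <dc:title>Sample Book</dc:title>
    <dc:identifier id="bookid">sample-0001</dc:identifier>
    <dc:language>en</dc:language>
  </metadata>
  <manifest>
    <item id="ch1" href="chapter1.xhtml" media-type="application/xhtml+xml"/>
  </manifest>
  <spine>
    <itemref idref="ch1"/>
  </spine>
</package>
"""

CHAPTER_XHTML = """<?xml version="1.0"?>
<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>Chapter 1</title></head>
  <body><p>Hello, reader.</p></body>
</html>
"""


def build_sample_epub() -> bytes:
    """Zip up a minimal package: mimetype first and stored uncompressed."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("mimetype", "application/epub+zip",
                    compress_type=zipfile.ZIP_STORED)
        for name, text in [
            ("META-INF/container.xml", CONTAINER_XML),
            ("content.opf", OPF_XML),
            ("chapter1.xhtml", CHAPTER_XHTML),
        ]:
            zf.writestr(name, text, compress_type=zipfile.ZIP_DEFLATED)
    return buf.getvalue()


def describe_epub(data: bytes) -> dict:
    """List the package contents and follow container.xml to the OPF file."""
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        names = zf.namelist()
        mimetype = zf.read("mimetype").decode("ascii").strip()
        # The container document points at the OPF package file.
        container = ET.fromstring(zf.read("META-INF/container.xml"))
        rootfile = container.find(f".//{{{CONTAINER_NS}}}rootfile")
        opf_path = rootfile.get("full-path")
        package = ET.fromstring(zf.read(opf_path))
        # Dublin Core title from the OPF metadata.
        title = package.findtext(f".//{{{DC_NS}}}title")
        # Collect manifest hrefs without hard-coding the OPF namespace.
        manifest = [el.get("href") for el in package.iter()
                    if el.tag.rsplit("}", 1)[-1] == "item"]
    return {"names": names, "mimetype": mimetype, "opf_path": opf_path,
            "title": title, "manifest": manifest}


if __name__ == "__main__":
    for key, value in describe_epub(build_sample_epub()).items():
        print(f"{key}: {value}")
```

To try it on a real file, read the bytes of the downloaded package and pass them to describe_epub() in place of build_sample_epub(). A real package may organize its files differently, so expect to adjust the sketch.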
If you have comments, suggestions, questions, or whatever, go to the discussion area of the IDPF web site and say your piece. And let me know if you have any thoughts on these standards, especially as to how they might be applicable to digital libraries.