Tuesday, February 16, 2016

More is more

Here's something that drives me nuts:
Library catalog entry for Origin of Species

Library catalog entry for Origin of Species

These are two library catalog displays for Charles Darwin's "The origin of species". One shows a publication date of 2015, the other a date of 2003. Believe me: neither of them anywhere lets the catalog user know that these are editions of a book first published in 1859. Nor does either explain that this book can be considered the founding text of the science of evolutionary biology. Imagine a user coming to the catalog with no prior knowledge of Darwin (*) - they might logically conclude that this is the work of a current scientist, or even a synthesis of arguments around the issue of evolution. From the second book above one could conclude that Darwin hangs out with Richard Dawkins; maybe they have offices near each other at the same university.

This may seem absurd, but it is no more absurd than the paucity of information that we offer to users of our catalogs. The description of these books might be suitable for an inventory of the Amazon.com warehouse, but it is hardly what I would consider a knowledge organization service. The emphasis in cataloging on description of the physical item may serve librarians and a few highly knowledgeable users, but because publications are not put into a knowledge context, the catalog becomes a dry list of uninformative items for many users. There are, however, cataloging practices that do not treat description of the physical item as the primary purpose of the catalog. One need only look at archival finding aids to see how much more we could tell users about the collections we hold. Another area of more enlightened cataloging takes place in the non-book world.

The BIBFRAME AV Modeling Study was commissioned by the Library of Congress to look at BIBFRAME from the point of view of libraries and archives whose main holdings are not bound volumes. The difference between book cataloging and the collections covered by the study is much more than a difference in the physical form of the library's holdings. What the study revealed to me was that, at least in some cases, the curators of the audio-visual materials have a different concept of the catalog's value to the user. I'll give a few examples.

The Online Audiovisual Catalogers have a concept of primary expression, which is something like first edition for print materials. The primary expression becomes the representative of what FRBR would call the work. In the Darwin example, above, there would be a primary expression that is the first edition of Darwin's work. The AV paper says "...the approach...supports users' needs to understand important aspects of the original, such as whether the original release version was color or black and white." (p.13) In our Darwin case, including information about the primary expression would place the work historically where it belongs.

Another aspect of the AV cataloging practice that is included in the report is the recognition that there are many primary creator roles. AV catalogers recognize a wider variety of creation than standards like FRBR and RDA allow. With a film, for example, the number of creators is both large and varied: director, editor, writer, music composer, etc. The book-based standards have a division between creators and "collaborators" that not all agree with, in particular when it comes to translators and illustrators. Although some translations are relatively mundane, others, such as translations of poetry, could surely be considered creative works in their own right.

The determination of primary creative roles and roles of collaboration is not one that can be made across the board; not all translators should necessarily be considered creators, and not all sound incorporated into a film deserves top billing. The AV study recognizes that different collections have different needs for description of materials. This brings out the tension in the library and archives community between data sharing and local needs. We have to allow communities to create their own data variations and still embrace their data for linking and sharing. If, instead, we go forward with an inflexible data model, we will lose access to valuable collections within our own community.

(*) You, dear reader, may live in a country where the ideas of Charles Darwin are openly discussed in the classroom, but in some of the United States there are or have been in the recent past restrictions on imparting that information to school children.

Sunday, January 17, 2016

Sub-types in FRBR

One of the issues that plagues FRBR is the rigidity of the definitions of work, expression, and manifestation, and the "one size fits all" nature of these categories. We've seen comments (see from p. 22) from folks in the non-book community that the definitions of these entities are overly "bookish" and that some non-book materials may need a different definition of some of them. One solution to this problem would be to move from the entity-relation model, which does tend to be strict and inflexible, to an object-oriented model. In an object-oriented (OO) model one creates general types with more specific subtypes, which allows the model both to extend as needed and to accommodate specifics that apply to only some members of the overall type or class. Subtypes inherit the characteristics of the super-type, whereas there is no possibility of inheritance in the E-R model. By allowing inheritance, you avoid both redundancy in your data and the rigidity of E-R and the relational model that it supports.

This may sound radical, but the fact is that FRBR does define some subtypes. They don't appear in the three high-level diagrams, so it isn't surprising that many people aren't aware of them. They are present, however, in the attributes. Here is the list of attributes for the FRBR work:
title of the work
form of work
date of the work
other distinguishing characteristic
intended termination
intended audience
context for the work
medium of performance (musical work)
numeric designation (musical work)
key (musical work)
coordinates (cartographic work)
equinox (cartographic work)
The attributes with a parenthetical qualifier are those that belong to subtypes of work. There are two: musical work and cartographic work. I would also suggest that "intended termination" could be considered an attribute of a "continuing resource" subtype, but this is subtle and possibly debatable.

Other subtypes in FRBR are:
Expression: serial, musical notation, recorded sound, cartographic object, remote sensing image, graphic or projected image
Manifestation: printed book, hand-printed book, serial, sound recording, image, microform, visual projection, electronic resource, remote access electronic resource
These are the subtypes that are present in FRBR today, but because sub-typing probably was not fully explored, there are likely to be others.

Object-oriented design was a response to the need to be able to extend a data model without breaking what is there. Adding a subtype should not interfere with the top-level type nor with other subtypes. It's a tricky act of design, but when executed well it allows you to satisfy the special needs that arise in the community while maintaining compatibility of the data.

Since we seem to respond well to pictures, let me provide this idea in pictures, keeping in mind that these are simple examples just to get the idea across.

The above picture models what is in FRBR today, although using the inheritance capability of OO rather than the E-R model where inheritance is not possible. Both musical work and cartographic work have all of the attributes of work, plus their own special attributes.

If it becomes necessary to add other attributes that are specific to a single type, then another sub-type is added. This new subtype does not interfere with any code that is making use of the elements of the super-type "work". It also does not alter what the music and maps librarians must be concerned with, since they are in their own "boxes." As an example, the audio-visual community did an analysis of BIBFRAME and concluded, among other things, that the placement of duration, sound content and color content in the BIBFRAME Instance entity would not serve their needs; instead, they need those elements at the work level.*
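To make the idea concrete, here is a minimal sketch in Python. It is purely illustrative: the class and attribute names are simplified from the FRBR attribute list above, and the audiovisual subtype uses the elements (duration, sound content) that the AV community asked to have at the work level.

```python
# Sketch of FRBR work subtypes using OO inheritance.
# Attribute names are simplified for illustration; this is not a real schema.

class Work:
    """Super-type: attributes shared by all FRBR works."""
    def __init__(self, title, form=None, date=None):
        self.title = title
        self.form = form
        self.date = date

class MusicalWork(Work):
    """Subtype: inherits everything from Work, adds music-specific attributes."""
    def __init__(self, title, key=None, medium_of_performance=None, **kw):
        super().__init__(title, **kw)
        self.key = key
        self.medium_of_performance = medium_of_performance

class CartographicWork(Work):
    """Subtype: adds cartographic attributes without touching Work."""
    def __init__(self, title, coordinates=None, equinox=None, **kw):
        super().__init__(title, **kw)
        self.coordinates = coordinates
        self.equinox = equinox

# A new subtype can be added later without altering Work, MusicalWork,
# or any code that uses them. (Hypothetical attribute names, based on the
# AV community's request to have these at the work level.)
class AudiovisualWork(Work):
    def __init__(self, title, duration=None, sound_content=None, **kw):
        super().__init__(title, **kw)
        self.duration = duration
        self.sound_content = sound_content

symphony = MusicalWork("Symphony No. 5", key="C minor", date="1808")
print(symphony.title, symphony.key)  # inherited attribute plus subtype attribute
```

Any code written against the super-type "work" continues to function when a subtype is added, which is exactly the flexibility that the E-R model of FRBR lacks.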

This just shows work, and I don't know how, or whether, it could or should be applied to the entire WEMI thread. It's possible that an analysis of this nature would lead to a different view of the bibliographic entities. However, using types and sub-types, or classes and sub-classes (which would be the common solution in RDF), would be far superior to the E-R model of FRBR. If you've read my writings on FRBR you may know that I consider FRBR to be locked into an out-of-date technology, one that was already on the wane by 1990. Object-oriented modeling, which has long replaced E-R modeling, is now being eclipsed by RDF, but there would be no harm in making the step to OO, at least in our thinking, so that we can break out of what I think is a model so rigid that it is doomed to fail.

*This is an over-simplification of what the A-V community suggested, modified for my purposes here. However, what they do suggest would be served by a more flexible inheritance model than the model currently used in BIBFRAME.

Tuesday, January 12, 2016

Floor wax, or dessert topping?

As promised, my book, FRBR: Before and After, is now available in PDF as open access with a CC-BY license. I'd like to set up some way that we can turn this into a discussion, so if you have a favorite hanging out place, make a suggestion.

Also, the talk I gave at SWIB2015 is now viewable on YouTube: Mistakes Have Been Made. That is a much shorter (30 minutes) and less subdued explanation of what I see as the problems with FRBR. If that grabs you, some chapters of the book will give you more detail, and the bibliography should keep anyone busy for a good long time.

Let me be clear that I am not criticizing FRBR as a conceptual model of the bibliographic universe. If this view helps catalogers clarify their approach to cataloging problems, then I'm all for it. If it helps delineate areas that the cataloging rules must cover, then I'm all for it. What I object to is the implication that this mental model equals a data format. Oddly, both the original FRBR document and the recent IFLA model bringing all of the FRs together are very ambiguous on this. I've been told, in no uncertain terms, by one of the authors of the latter document that it is not a data model, it's a conceptual model. Yet the document itself says:
The intention is to produce a model definition document that presents the model concisely and clearly, principally with formatted tables and diagrams, so that the definitions can be readily transferred to the IFLA FRBR namespace for use with linked open data applications. 
 And we have the statement by Barbara Tillett (one of the developers of FRBR) that FRBR is a conceptual model only:
"FRBR is not a data model. FRBR is not a metadata scheme. FRBR is not a system design structure. It is a conceptual model of the bibliographic universe." Barbara Tillett. FRBR and Cataloging for the Future. 2005
This feels like a variation on the old Saturday Night Live routine: "It's a floor wax. No! It's a dessert topping!" The joke being that it cannot be both. And that's how I feel about FRBR -- it's either a conceptual model, or a data model. And if it's a data model, it's an entity-relation model suitable for, say, relational databases. Or, as David C. Hay says in his 2006 book "Data Model Patterns: A Metadata Map":
Suppose you are one of those old-fashioned people who still models with entity classes and relationships.
It's not that entities and relations are useless, it's just that this particular style of data modeling, and the technology that it feeds into, has been superseded at least twice since the FRBR task group was formed: by object-oriented design, and by semantic web design. If FRBR is a conceptual model, this doesn't matter. If it's a data model -- if it is intended to be made actionable in some 21st century technology -- then a whole new analysis is going to be needed. Step one, though, is getting clear which it is: floor wax, or dessert topping.

Saturday, January 02, 2016

Elkins Park

Many of you will have heard the name "Elkins Park" for the first time this week as the jurisdiction of the Bill Cosby indictment. It is now, undoubtedly, the thing for which Elkins Park is best known. However, it has a connection to books and libraries that those of us involved with books and libraries should celebrate as a counter to this newly acquired notoriety.

The Elkins family was one of the 19th century's big names in Pennsylvania. William Lukens Elkins was one of the first "oil barons" whose company was the first to produce gasoline, just in time for the industrial and transportation revolution that would use untold gallons of the stuff. His business partner was Peter Widener, and the two families were intertwined through generations, their Philadelphia mansions built across the street from each other.

Eleanor Elkins, daughter of WL Elkins, married George Widener, son of Peter Widener, essentially marrying the two families. Eleanor and George had a son, Harry. Unfortunately, they were rich enough to book passage on the maiden voyage of the Titanic in 1912. George and Harry perished; Eleanor Elkins Widener survived.

Harry Widener had been an avid book collector, sharing this interest with his best friend, William McIntire Elkins. Harry had graduated from Harvard, as had his friend WM Elkins, and his will instructed his mother to donate his collection to Harvard, "to be known as the Harry Elkins Widener Collection." His mother took it one step further and funded the creation of a new library that would house not only her son's collection but also the entire Harvard library collection. Yes, I am talking about the now famous Widener Library.

His friend, William McIntire Elkins, lived until 1947, and during his lifetime amassed a huge rare book collection. He was particularly interested in Dickensiana, which included not only first serially published editions of Dickens' works, but also Dickens' desk, candlesticks, ink well, etc. His will left his entire collection to the Free Library of Philadelphia. As it turned out, the bequest was not only his book collection but the entire room, which was moved into the library, looking much as it did during Elkins' life.

Elkins Park is named, of course, for the Elkins family whose massive estate held an impressive array of mansions and grounds. Much of the estate has been divided up and sold off, but portions remain.

So that's the story of Elkins Park and how it fits into libraries and the rarified world of rare books. But I have another small bit to add to the story. In 1947, at the time of his death, John and Eleanor King, my grandparents, were working for (as he was known in my family) "old man Elkins" -- that is William McIntire Elkins. My grandfather was gardener and chauffeur for Elkins, and my grandmother was (as she called it) the executive of the household. When the library was transferred to the Free Library of Philadelphia, photographs were taken out the windows so that the "view" could be reproduced. Those photographs show the grounds that my grandfather cared for, and the room itself was undoubtedly very familiar to my grandmother (although she would never admit to having done any dusting with her own hand). A small amount bequeathed to them in Elkins' will allowed them to own their own property for the first time, just a few acres, but enough to live on, with sheep and chickens and a single steer. I have early memories of that farm, and a few family photos. Little did any of us know at the time that I would reconnect all of this because of books.

Tuesday, November 03, 2015

The Standards Committee Hell

I haven't been on a lot of standards committees, but each one has defined a major era in my life. I have spent countless hours on them, because a standards committee requires hundreds of hours of reading emails and discussing minutiae (sometimes the meaning of "*", other times the placement of commas). The one universal in standards creation is that nearly everyone comes to the work with a preconceived idea of what the outcome should be, long before hearing (but not listening to) the brilliant and necessary ideas of fellow members of the committee. Most of these standards-progressing people are so sure that their sky is the truest blue that they hardly recognize the need to give passing attention to what others have to say.

In one committee I was on, the alpha geek appeared the first day with a 30-page document in hand, put it on the table, and said: "There. It's done. We can all go home now." He was smiling, but it wasn't a "ha ha" smile, it was a "gotcha" smile. That committee lasted over two years, two long, painful years in which we never quite climbed out of the chasm that we were thrown into on that first day. Over that two-year period we chipped away at the original document, transformed a few of its more arcane paragraphs into something almost readable, and eventually presented the world with a one hundred page document that was even worse than what we had started with. Thus is the way of standards.
"...it is so perfect in fact that the underlying model can be applied to any - absolutely any - technology in the universe."
A particular downfall of standards committees is what I will call "the perfect model." I can only describe it with an analogy. Let's say that you are designing a car (by committee, of course), and one member of the group is an engineer with a particular passion for motors. In fact, he (yes, so far I've only run into "he's" of this nature) has this dream of the perfect internal combustion engine. Existing engines have made too many compromises -- for efficiency and economy and whatever other corners manufacturers have desired to cut. But now there is the opportunity to create the standard, the standard that everyone will follow and that will make every internal combustion engine the perfect, beautiful engine. The person (let's call him PersonB, reserving PersonA for oneself, or perhaps the chair of the committee, or, depending on the standards body, for the founder of the standards body and inspiration for all things technological) has developed a new four-stroke engine, which he modestly names with an acronym that includes his name. We'll call this the FE (famous engineer) 4x2 engine. The theory of the FE4x2 is as finely honed as the tolerances between the pistons and their housing; it is so perfect in fact that the underlying model can be applied to any - absolutely any - technology in the universe. Because of the near-divine nature of this model, the use of common terminology cannot describe its powers. Perhaps it would be preferable to not name the model and its features at all, leaving it, like Yahweh, to be alluded to but never spoken. However, standards bodies must describe their standards in documents, and even sell these to potential creators of the standard product, so names for the model and its components must be chosen. To inspire in all the importance of the model, terms are chosen to be as devoid of meaning as possible, yet so complex that they produce awe in the reader. Note that confusion is often mistaken for awe by the uneducated.

 Our committee now has described the perfect engine using the universal model, but the standards organization survives on hawking specifications to enterprising souls who will actually create and attempt to sell products that can be certified by the August Authoritative Standards Organization. This means that the thing the standard describes has to be packaged for use. Because the model is perfect, the package surely cannot be mundane. You don't put this engine in something resembling a Sears and Roebuck toaster oven. No, the package must have class, style, and a certain difficulty of use that makes the owner of the final product really think hard about what each knob is for. In fact, it would be ideal if every user would need to attend a series of seminars on the workings of this Perfect Thing. There's a good market for consultants to run these seminars, especially those members of the community who haven't got the skill to actually manufacture the product themselves. Those who can't do, as the saying goes, teach.

The final package needs also to justify the price that will be charged by purveyors of this product. It needs to be complex but classy. It has to waft on the wind of the past while promising an unspecified but surely improved future. The car committee needs to design a chassis that is worthy of the Perfect Engine. Committee members would love for it to be designed around a yet-to-be developed material, one that just screams Tomorrow! Again, though, there is that need to sell the idea to actual manufacturers, so the committee adds to the standard a chassis made of tried-and-true materials that must be tortured into a shape that could be, but probably will not be, what the not-yet-real future technology allows.
"But what about the children?"
Whatever you do, do not be the person on the committee who asks: But what about the driver? How comfortable will it be? Will it be safe? Can children ride in it? (Answer: no, anyone who cares about the Perfect Engine will obviously have the sense to eschew children, who will only distract the adult's attention from the admiration of the Perfect Engine.) And never, ever point out that the design does not include doors for entering the vehicle. It's perfect, okay, just leave it at that. This is how we get a standard, and the industry around a standard; an industry that exists because the standard is so deeply just and true and right that no one can figure out how to use it, yet, because it is a standard from the August Authoritative Standards Organization, the rightity and trueness of the standard simply cannot be questioned. Because it is, after all, a standard, and standards exist to be obeyed.
"I've got mine!"
Another downfall of a standards committee is when the committee has one or more members of the "I've got mine" type. These are folks who already have a product of the genre the standard is meant to address, and their participation in the committee is to assure that their product's design becomes the standard. There are lots of variations on this situation. A committee with only one "I've got mine" becomes a simple test of wills between the have and the have-nots. A committee with more than one "I've got mine" becomes a battleground. The have-nots on this committee might as well just go home because their views of what is needed are so irrelevant to the process that they can have the same effect on the outcome of the standards work by not being there. Who wins the battle depends on many things, of course, but I'd usually advise that you bet on the largest, richest "I've got mine." It is especially helpful if the "I've got mine" holds patents in the area and can therefore declare (true story) "If you create it, we'll destroy you with patent claims."

Like the engineer of the perfect model, the "I've got mine" has an idée fixe. In this case, though, the idée may not be perfect or complete or even usable. But it exists, and "I've got mine" does not want to change. Therefore every idea that is not already in the product of "I've got mine" meets with great resistance. At various points in the discussion, "I've got mine" threatens to take his ball and go home. For reasons that have never been clear to me, the committee takes this threat seriously and caves in to "I've got mine" even though most members of the committee actually understand that the committee would be more successful without this person.
"...even though they repeat often the mantra "We can always blow it up and start over" they never, never start over."
This then takes me to downfall number 3: once standards committees dig themselves into a hole, once they have started down a path that is quite clearly not going to result in success, and even though they often repeat the mantra "We can always blow it up and start over," they never, never start over. The standard that comes out always looks like the non-standard that went in on day one, regardless of how dysfunctional and mistaken that is. This is one of the reasons why there are standards on the books that were developed through great effort, whose person-hours would add up to hundreds of thousands or even millions of dollars, and yet have not been adopted. Common sense allows people outside of the bubble of the standards committee process to admit that the thing just isn't going to work. No way. That's the best possible outcome; the worst possible outcome is that through an excess of obedience in a community with a hive mind the standard is adopted and therefore screws everything up for that community for decades, until a new standards committee is launched.
"...we can have a new standard, but nothing can really change."
If you think that committee will solve the problem, then I suggest you go back to the top of this essay and begin reading all over again. Because by now you should be anticipating downfall number 4: we can have a new standard, but nothing can really change. The end result of applying the new standard has to be exactly the same as the result obtained from the old standard. The committee can therefore declare a great success, and everyone can give a sigh of relief that they can go on doing everything the same way they always did, perhaps with slightly different terminology and a bunch of new acronyms.

Now off I go to read some more emails, asking myself:  "Is this the time to ask: what about the children?"

Friday, October 30, 2015

Libraries, Books, and Elitism

"So is the library, storehouse and lender of books, as anachronistic as the record store, the telephone booth, and the Playboy centerfold? Perversely, the most popular service at some libraries has become free Internet access. People wait in line for terminals that will let them play solitaire and Minecraft, and librarians provide coffee. Other patrons stay in their cars outside just to use the Wi-Fi. No one can be happy with a situation that reduces the library to a Starbucks wannabe."
James Gleick, "What Libraries (Still) Can Do" NYRDaily October 26, 2015

This is one of the worst examples of snobbery and elitism in relation to libraries that I have seen in a long time. It is also horribly uninformed. Let me disassemble this a bit.

First, the library as a place to gather is not new. Libraries in ancient Greece were designed as large open spaces with cubbies for scrolls around the inside wall. Very little of the space was taken up with that era's version of the book. They existed both as storehouses for the written word and as places where scholars would come together to discuss ideas. Today, when students are asked what they want from their library, one of the highest ranked services is study space. There is nothing wrong with studying in a library; in fact, as anyone with a home office knows, having a physical space where you do your studying and thinking helps one focus the mind and be productive.

Next, the dismissive and erroneous statement that people use "terminals" (when have you last heard computers called that?) to play solitaire and Minecraft completely ignores the fact that many of our information sources today are available only through online access, including information sources available to most users only through the library. If you want to look up journal articles you need the library's online access. In addition, many social services are available online. The US government and most state governments no longer provide libraries with hard copies of documents, but make them available online. From IRS tax preparation help to information about state law and city zoning ordinances, you absolutely must have Internet access. Internet access is no longer optional for civic life. I can't imagine that anyone is waiting in line at a library for a one-hour slot to build their Minecraft world, but if they are, then I'm fine with that. It's no less "library-like" than using the library to read People magazine or check out a romance novel. (Gleick is probably against those, too.)

Gleick doesn't seem to know (nor, perhaps, does Palfrey, whose book he is reviewing) that libraries already have limits on ebook lending.
And a library that could lend any e-book, without restriction, en masse, would be the perfect fatal competitor to bookstores and authors hoping to sell e-books for money. Something will have to give. Palfrey suggests that Congress could create “a compulsory license system to cover digital lending through libraries,” allowing for payment of fair royalties to the authors. Many countries, including most of Europe, have public lending right programs for this purpose.
This completely misses the point. Libraries already lend e-books, with restrictions, and they pay for them in the same way that they pay for paper books -- by paying for each copy that they lend. Suggesting a compulsory license is not a solution, and the public lending right that is common in Europe applies to hard copy books as well as e-books. The difference is that the payment for lending in those countries does not come out of library budgets but is often paid out of a central fund supporting the arts. Given that the US has a very low level of government funding for the arts, and that libraries are not funded through a single government mechanism, a public lending payment would be extremely difficult to develop in this country. There is the very real risk that it would take money out of already stretched library budgets and would further disadvantage those library systems that are struggling the most to overcome poor local funding.

I don't at all mind folks having an opinion about libraries, about what they like and what they want. But I would hope that a researcher like Gleick would do at least as much research about libraries as he does about other subjects he expounds on. They - we - deserve the same attention to truth.

Tuesday, October 13, 2015

SHACL - Shapes Constraint Language

If you've delved into RDF or other technologies of the Semantic Web you may have found yourself baffled at times by its tendency to produce data that is open to interpretation. This is, of course, a feature, not a bug. RDF has at the basis of its design something called the "Open World Assumption". The OWA acts more like real life than controlled data stores because it allows the answer to many questions to be neither true nor false, but "we may not have all of the information." This makes it very hard to do the kind of data control and validity checking that is the norm in databases and in data exchange.
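A toy example may help show the difference. The tiny "graph" and the query functions below are invented purely for illustration; real RDF stores are queried with SPARQL, but the true/false/unknown distinction is the same.

```python
# Toy illustration of the Open World Assumption: under OWA, the absence
# of a statement does not make it false; it is simply unknown.

graph = {
    ("Darwin", "wrote", "Origin of Species"),
    ("Origin of Species", "firstPublished", "1859"),
}

def ask_closed(s, p, o):
    """Closed-world query (database style): absent means False."""
    return (s, p, o) in graph

def ask_open(s, p, o):
    """Open-world query (RDF style): absent means unknown, not False."""
    if (s, p, o) in graph:
        return True
    return None  # we may not have all of the information

print(ask_closed("Darwin", "wrote", "Voyage of the Beagle"))  # False
print(ask_open("Darwin", "wrote", "Voyage of the Beagle"))    # None (unknown)
```

A database would flatly deny that Darwin wrote the Voyage of the Beagle; the open-world answer is only that our data doesn't say so.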

There is an obvious need in some situations to exercise constraints on the data that one manages in RDF. This is particularly true within local systems where data is created and updated, and when exchanging data with known partners. To fill this gap, the semantic web branch of the World Wide Web Consortium has been working on a new standard, called the Shapes Constraint Language (SHACL), that will perform for RDF the function that XML Schema performs for XML: it will allow software developers to define validity rules for a particular set of RDF.

SHACL has been in development for nearly a year, and is just now available in a First Public Working Draft. A FPWD is not by any means a finished product, but is far enough along to give readers an idea of the direction that the standard is taking. It is made available because comment from a larger community is extremely important. The introduction to the draft tells you where to send your comments. (Note: I serve on the working group representing the Dublin Core community, so I will do my best to make sure your comments get full consideration.)

Like many standards, SHACL is not easy to understand. However, I think it will be important for members of the library and other cultural heritage communities to make an effort to weigh in on this standard. Support for SHACL is strong from the "enterprise" sector, people who primarily work on highly controlled closed systems like banks and other information intense businesses. How SHACL benefits those whose data is designed for the open web may depend on us.

SHACL Basics

The key to understanding SHACL is that it is based in large part on SPARQL, because SPARQL already has formally defined mechanisms that operate on RDF graphs. There will be little if any SHACL functionality that could not be done with SPARQL. However, the SPARQL queries that perform some of these functions are devilishly difficult to write, so SHACL should provide a cleaner, more constraint-oriented language.
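For instance, a rule such as "every foaf:Person must have at least one foaf:name" (the rule itself is just an example, not from the draft) can be checked today with a hand-written SPARQL query along these lines, reporting the resources that violate it:

```sparql
# Sketch of a hand-written validation query: find persons with no foaf:name.
# This is the kind of check a SHACL "minCount 1" constraint would declare.
PREFIX foaf: <http://xmlns.com/foaf/0.1/>

SELECT ?person
WHERE {
  ?person a foaf:Person .
  FILTER NOT EXISTS { ?person foaf:name ?name }
}
```

Simple enough here, but queries for "at most one", for either/or choices, or for cross-property comparisons quickly become much harder to write and read, which is the gap SHACL aims to fill.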

SHACL consists of a core of constraints that belong to the SHACL language and have SHACL-defined properties. These should be sufficient for most validation needs. SHACL also has a template mechanism that makes it possible for anyone to create a templated constraint to meet additional needs.

What does SHACL look like? It's RDF, so it looks like RDF. Here's a SHACL statement that covers the case "either one foaf:name OR (one foaf:forename AND one foaf:lastname)":

    [] a sh:Shape ;
        sh:scopeClass foaf:Person ;
        sh:constraint [
            a sh:OrConstraint ;
            sh:property [
                sh:predicate foaf:name ;
                sh:minCount 1 ;
                sh:maxCount 1 ;
            ] ;
            sh:property [
                sh:predicate foaf:forename ;
                sh:minCount 1 ;
                sh:maxCount 1 ;
            ] ;
            sh:property [
                sh:predicate foaf:lastname ;
                sh:minCount 1 ;
                sh:maxCount 1 ;
            ] ;
        ] .

SHACL constraints can be either open or closed. Open, the default, constrains the named properties but ignores other properties in the same RDF graph. Closed essentially means "these properties and only these properties; everything else is a violation."
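The distinction can be sketched in a few lines of Python. This is purely conceptual, not SHACL syntax, and the function is invented for illustration:

```python
# Conceptual sketch of open vs. closed validation of a resource.

def validate(resource, constrained, closed=False):
    """Return a list of violation messages.

    resource:    dict of property -> value
    constrained: set of property names the shape knows about
    closed:      if True, any property outside `constrained` is a violation
    """
    violations = []
    if closed:
        for prop in resource:
            if prop not in constrained:
                violations.append(f"unexpected property: {prop}")
    # (an open shape simply ignores properties it does not constrain)
    return violations

record = {"name": "Charles Darwin", "nickname": "Charlie"}
print(validate(record, {"name"}))               # open: extra property ignored -> []
print(validate(record, {"name"}, closed=True))  # closed flags the extra property
```

An open shape lets the rest of the graph pass through untouched; a closed shape turns the shape into an exhaustive description of what the resource may contain.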

There are comparisons, such as "equal/not equal" that act on pairs of properties. There are also constraints on values such as defined value types (IRI, data type), lists of valid values, and pattern matching.

The question that needs to be answered around this draft is whether SHACL, as currently defined, meets our needs -- or at least, most of them. One way to address this would be to gather some typical and some atypical validation tests that are needed for library and archive data, and try to express those in SHACL. I have a few examples (mainly from Europeana data), but definitely need more. You can add them to the comments here, send them to me (or send a link to documentation that outlines your data rules), or post them directly to the working group list if you have specific questions.

Thanks in advance.