Monday, October 14, 2013

Who uses Dublin Core? - the original 15

The original 15 Dublin Core elements are included in the DCMI Metadata Terms under the namespace http://purl.org/dc/elements/1.1/. There is an "updated" version of each of the original terms in the namespace http://purl.org/dc/terms/ (dcterms). The difference is that the terms in /dc/terms/ define formal domains and ranges, in conformance with linked data standards; the original 15 elements in the /dc/elements/1.1/ namespace have no domain or range constraints defined. This means that the original 15, often given the namespace prefix "dc:" or "dce:", remain compatible with legacy uses of the Dublin Core elements.
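To make the relationship between the two namespaces concrete, here is a minimal Python sketch. The two namespace IRIs and the 15 element names come from the Dublin Core terms themselves; the function names to_dcterms and generation are hypothetical helpers invented for this illustration, not part of any library.

```python
# Namespace IRIs for the two "generations" of Dublin Core.
DCE_NS = "http://purl.org/dc/elements/1.1/"
DCTERMS_NS = "http://purl.org/dc/terms/"

# Local names of the original 15 elements; each also exists in dcterms.
ORIGINAL_15 = frozenset({
    "contributor", "coverage", "creator", "date", "description",
    "format", "identifier", "language", "publisher", "relation",
    "rights", "source", "subject", "title", "type",
})


def generation(iri: str) -> str:
    """Classify a property IRI as 'dce', 'dcterms', or 'other' (hypothetical helper)."""
    if iri.startswith(DCE_NS):
        return "dce"
    if iri.startswith(DCTERMS_NS):
        return "dcterms"
    return "other"


def to_dcterms(iri: str) -> str:
    """Return the dcterms IRI with the same local name as a dce IRI.

    IRIs outside the dce namespace, or with a local name that is not
    one of the original 15, are returned unchanged.
    """
    if iri.startswith(DCE_NS):
        local_name = iri[len(DCE_NS):]
        if local_name in ORIGINAL_15:
            return DCTERMS_NS + local_name
    return iri


if __name__ == "__main__":
    print(to_dcterms(DCE_NS + "title"))
    print(generation(DCTERMS_NS + "creator"))
```

Swapping the namespace only renames the property. It does not check that the value satisfies the range declared for the dcterms property, so data that is valid with dce may not conform once it is relabeled as dcterms.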

In the first post of this series, I showed that the most used terms are from the dcterms vocabulary, followed immediately by a cluster of terms from the dce namespace. In addition, the majority of the top dcterms are the linked data equivalents of the dce terms, thus confirming the "coreness" of the original Dublin Core 15.

From this explanation one might expect that the uses of dce in the wilds of linked data would be limited to legacy data. That does not, however, seem to be the case. Out of a total of 125 datasets from the Linked Open Vocabularies, nearly half (60) use both the linked data vocabulary (dcterms) and the dce terms. Of the five datasets with the greatest number of uses of dce, only one, "Wikipedia 3," does not also use dcterms.

Europeana Linked Open Data 
Wikipedia 3 
Linked Open Data Camera dei deputati 
B3Kat - Library Union Catalogues of Bavaria, Berlin and Brandenburg 
Yovisto - academic video search 
 
There are reasons why datasets may use both "generations" of the Dublin Core vocabulary. One is that their data contains a mix of legacy metadata and linked data, either because the dataset has grown over time, or because the set combines data from different sources. Another is that there may be situations in which the dcterms use of domains and ranges is too restrictive for the needs of the data creators.

The LOV dataset of dce usage has over 24 million uses (compared to 192 million uses of dcterms). Library and bibliographic data is again by far the majority of the use, although it is rivaled by government data, in part because of the over 4 million uses contributed by the Italian Camera dei deputati, which also uses dcterms but to a lesser extent. In fact, government data is overall a strong contender in the dce space.
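The proportions behind these figures can be worked out in a few lines of Python. This is only a back-of-the-envelope sketch: the inputs are the rounded numbers quoted in this post ("over 24 million" is treated as 24 million), so the results are approximate, and the variable names are my own.

```python
# Rounded figures quoted in this post.
datasets_total = 125          # datasets counted
datasets_using_both = 60      # datasets using both dcterms and dce
dce_uses_millions = 24        # "over 24 million" uses, rounded down
dcterms_uses_millions = 192   # uses of dcterms

# Share of datasets that mix both generations of Dublin Core.
share_using_both = datasets_using_both / datasets_total

# How many dcterms uses there are for each dce use.
dcterms_per_dce = dcterms_uses_millions / dce_uses_millions

# dce as a share of all Dublin Core uses (dce + dcterms).
dce_share_of_all = dce_uses_millions / (dce_uses_millions + dcterms_uses_millions)

if __name__ == "__main__":
    print(f"datasets using both: {share_using_both:.0%}")
    print(f"dcterms uses per dce use: {dcterms_per_dce:.0f}")
    print(f"dce share of all uses: {dce_share_of_all:.1%}")
```

In other words, roughly 48% of the datasets use both vocabularies, and dcterms is used about eight times as often as dce, which puts dce at around a ninth of all Dublin Core uses in this data.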

My overall conclusion from looking at this data is that Dublin Core is used widely for both bibliographic and non-bibliographic data; that there is a new "core," based on usage, that overlaps greatly with the old core; that some dcterms elements are hardly used at all in these datasets; and finally that both the linked data dcterms and the legacy dce elements show themselves to be useful, even in the linked data environment.

