Monday, January 14, 2008

New DCMI specifications

I'm proud to announce an updated set of Dublin Core specifications, some of which I've made substantial contributions to. First, "New DCMI Recommendation on Expressing Dublin Core Metadata using RDF", which I've contributed heavily to (though Andy Powell did start the process!). Second, "Major update of DCMI Metadata Terms documentation", to which I've mostly contributed documentation, schema comments and validation. Third, "The Singapore Framework for Dublin Core Application Profiles", which is a first stab at defining the notion of Application Profile a little more explicitly. Today is a good day.

Friday, December 07, 2007

Microsoft and ISO - how to destroy the ISO process

Report from the SC34 convener: Martin Bryan is stepping down (after his three-year term) as convener of ISO/IEC JTC1 SC34, the scene of the OOXML debacle. He says:

This year WG1 have had another major development that has made it almost impossible to continue with our work within ISO. The influx of P members whose only interest is the fast-tracking of ECMA 376 as ISO 29500 has led to the failure of a number of key ballots.
(ECMA 376 is OOXML), and
The days of open standards development are fast disappearing. Instead we are getting “standardization by corporation”, something I have been fighting against for the 20 years I have served on ISO committees.
So, Microsoft is single-handedly responsible for destroying the standardization efforts of numerous engaged individuals and organizations, as well as the work in an important international standardization organization. I can't find words for the disappointment I feel. Peter Kopelman, take some responsibility for the business ethics of your company!

Friday, November 30, 2007

Semanticizing metadata specifications

Jason Wrage asks:

Would you mind taking a moment to summarize the process of making a specification semantically compatible? I assume that this might entail development of a vocabulary and embedding RDF within the target specification?
That is an excellent question, and something I've spent a few years contemplating. To begin with, there is a huge difference between designing a specification semantically from scratch, and "semanticizing" it after the fact. In general, it depends a lot on the specification at hand, and in particular on things like:
  1. Is the specification based on some form of vocabulary-independent abstract model?
  2. Is the specification expressed in some kind of modeling language (UML etc.)?
  3. Are the entities in the specification explicit?
  4. How does the specification handle identity for the metadata terms?
and so on. I have experience with semanticizing IEEE LOM, and the answers to the above in the LOM case are:
  1. Not explicitly - but the LOM tree structure is almost an abstract model.
  2. No
  3. No - there are many entities in the model that are not explicit (The Educational category/entity is a major issue)
  4. Tree-based identification such as General.Title
Based on the above, one can start to see the issues:
  1. Tree-based and semantic models don't fit well. We will have to disassemble the tree to semanticize, and then reconstruct it afterwards
  2. No UML model means no alternative to the tree view, so we need to base our decisions on the tree directly.
  3. We will have major headaches trying to identify the entities.
  4. We will need to make sure that information about the position in the hierarchy is preserved when introducing new properties. Compare General.Description and Educational.Description: very different semantics.
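To illustrate the last point, here is a minimal sketch of deriving property names from LOM tree paths so that the category is kept in the name. The namespace URI and the naming scheme (category plus element, camel-cased) are assumptions made up for this example, not the names used in any actual LOM RDF binding.

```python
# Hypothetical namespace, used only for illustration.
LOM_NS = "http://example.org/lom#"


def property_uri(tree_path: str) -> str:
    """Map a LOM tree path such as 'General.Description' to a property URI.

    The category is kept as part of the local name, so elements that share
    a leaf name but sit in different categories get distinct properties.
    """
    parts = tree_path.split(".")
    if len(parts) < 2 or not all(parts):
        raise ValueError(f"expected 'Category.Element', got {tree_path!r}")
    # Lower-case the category, capitalize the first letter of each later part.
    local = parts[0].lower() + "".join(p[0].upper() + p[1:] for p in parts[1:])
    return LOM_NS + local


general = property_uri("General.Description")
educational = property_uri("Educational.Description")
print(general)
print(educational)
```

The two paths end in the same element name but come out as two different properties, which is what keeps their different semantics apart once the tree is gone.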
I wrote at length about the process here. The general method for LOM was:
  1. Isolate properties and objects. The first step involves extracting an object-oriented view of the LOM data model. What LOM elements are objects, and which are relations between objects? This sounds relatively easy, but it's in effect the core of the semantic translation.
  2. Find related Dublin Core elements and encodings. For the LOM case, it was very important to try to reuse existing vocabulary. After having found the relevant Dublin Core elements, the precise relation to the Dublin Core element needed to be defined. There are essentially four ways in which a LOM element might be related to Dublin Core:
    1. By being identical to some Dublin Core Element.
    2. By being a sub-property (=refinement) of a Dublin Core Element.
    3. By being a super-property of a Dublin Core Element
    4. By using literal values that could be specified using a Dublin Core Syntax Encoding Scheme.
  3. Define an RDF vocabulary matching your model.
  4. Make the RDF namespaces available on the web, following vocabulary publishing guidelines.
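As a rough sketch of steps 2 and 3, the property-level relations to Dublin Core can be written down as RDF triples. The rdfs:subPropertyOf and Dublin Core element namespace URIs are the real ones; the LOM namespace and the particular mapping shown are hypothetical and only serve as an illustration. The fourth kind of relation (syntax encoding schemes) concerns literal values rather than properties and is left out here.

```python
RDFS_SUBPROPERTY_OF = "http://www.w3.org/2000/01/rdf-schema#subPropertyOf"
DC = "http://purl.org/dc/elements/1.1/"
LOM = "http://example.org/lom#"  # hypothetical namespace, for illustration only


def relation_triples(lom_prop: str, relation: str, dc_prop: str):
    """Return the (subject, predicate, object) triples declaring a relation."""
    if relation == "identical":
        # Reuse the Dublin Core property directly; nothing new to declare.
        return []
    if relation == "sub":
        return [(lom_prop, RDFS_SUBPROPERTY_OF, dc_prop)]
    if relation == "super":
        return [(dc_prop, RDFS_SUBPROPERTY_OF, lom_prop)]
    raise ValueError(f"unknown relation: {relation!r}")


def to_ntriples(triples) -> str:
    """Serialize URI-only triples as N-Triples, one statement per line."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in triples)


# Assumed mapping: an educational description refines dc:description.
triples = relation_triples(LOM + "educationalDescription", "sub", DC + "description")
print(to_ntriples(triples))
```

A real vocabulary would also carry labels, comments and ranges for each property; the point here is only that each relation in step 2 becomes an explicit statement in the published vocabulary.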
Nowadays, there are a few additional steps that might be interesting.
  • The Dublin Core Description Set Profile model allows for the construction of application profiles of RDF data, promising syntactic validation of Dublin Core metadata. This is otherwise something that many people miss when going from XML to RDF. A general RDF equivalent is something Alistair Miles has written about.
  • GRDDL support in your XML formats will allow semantic web clients to extract RDF information from your XML data. With the above vocabularies, such data can be of high quality.
I'm sure there are more things as well. See also the articles linked from this page. Not sure this is a summary, but still...

Thursday, November 29, 2007

Copyriot » Some notes on General Rights Management

The most interesting thing I've read about copyright, DRM, piracy etc. in a long time: Copyriot » Some notes on General Rights Management

Tuesday, November 20, 2007

Putting REST into perspective

From T.V. Raman at the W3C Tech Plenary, "XML Applications: 2^W: The Second Coming Of The Web":

Where Web 1.0 was about bringing useful content to the Web, Web 2.0 is about building Web artifacts out of Web parts. URLs play a central role in enabling such re-use --- notice that a necessary and sufficient condition for something to exist on the Web is that it be addressable via a URL.
The post puts the REST design principles and the importance of Web Architecture in perspective.

Sunday, November 11, 2007

The XO laptop is out

Buy one for $399, and one is given to a child in the developing world. One Laptop Per Child -- XO Giving Campaign starts tomorrow, and lasts for two weeks.

Explaining knowledge representation

Hmm, finally an interesting introduction to knowledge representation: How To Tell Stuff To A Computer - The Enigmatic Art of Knowledge Representation