CMML is a great idea with a lot of potential for future use. And the best thing is that it is actually an Internet-Draft, so it is a standard – or at least on its way to becoming one.
There’s a missing part, though. For video content, one should be able to define regions within a frame (or over a time interval). This would allow one to mark objects or areas in the video stream and attach a URI to each, pointing to more detailed information.
The whole thing could be supported very easily by adding a new tag inside the <clip> tag: something like a <div> tag (stolen from HTML) that describes an area with an absolute position within the original frame size of the stream, just as with a div in HTML. This of course requires that CMML knows the frame size of the video stream!
An example. The original example is from Wikipedia, just extended with a <div> tag.
The only problem with this solution is that one can only define rectangles. This could be solved either by allowing the SVG namespace or by simply describing polygons by their vertices. The good thing about SVG is that it is, again, a standard, which assures us interoperability!
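To make the idea concrete, here is a rough sketch of how such regions might be written. Everything about the region markup is an assumption made for illustration: the <div> element with its left/top/width/height/href attributes, the framewidth and frameheight attributes on <import>, the file name and the example.com URLs are invented here and are not part of the CMML draft. The surrounding <cmml>, <stream>, <import> and <clip> elements only loosely follow typical CMML examples and have not been checked against the draft. The second region borrows SVG's polygon element, with its points attribute, to show the non-rectangular variant.

```xml
<!-- Hypothetical sketch only: the div element, its attributes and the
     framewidth/frameheight attributes are NOT defined by the CMML draft. -->
<cmml xmlns:svg="http://www.w3.org/2000/svg">
  <stream timebase="0">
    <!-- Assumption: the frame size is declared once, so that region
         coordinates have a known reference (attribute names invented). -->
    <import src="video.ogv" contenttype="video/ogg"
            framewidth="640" frameheight="480"/>
  </stream>
  <clip id="findingGalaxies" start="15">
    <desc>What's out there?</desc>
    <!-- Rectangular region, in pixels relative to the original frame. -->
    <div id="telescope" left="120" top="80" width="200" height="150"
         href="http://example.com/telescope"/>
    <!-- Non-rectangular region described by its vertices via SVG. -->
    <div id="galaxy" href="http://example.com/galaxy">
      <svg:polygon points="400,100 520,140 480,260 380,220"/>
    </div>
  </clip>
</cmml>
```

With markup along these lines, a player would have to scale the coordinates from the declared frame size to the actual display size before deciding which region a click falls into.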
I have contacted the authors of the CMML standard; let’s see if they like the idea. Well, anyway, tell me if you do, at least…