Semantic XHTML

Brian Cantoni has a good writeup of a talk about Semantic XHTML given by Kevin Marks and Tantek Celik. The slides are available online as well, and there’s some good stuff in there. Lately I’ve been working a lot with the idea of mixing additional information in with web content. In my opinion the ideas become much more interesting with the read-write web. An elegant layered model for moving user content around is evolving. The APIs that started out with blogging, like MetaWeblog and Atom, provide a transport mechanism. Semantic XHTML can provide the structure that turns a blog post into something more, like an address book update or a calendar event. And those two realms can remain distinct. A standard blog post without semantic information remains a valid post; using Semantic XHTML in some cases doesn’t preclude using standard HTML in others. For instance, the metadata for a micro-content system could be passed as Semantic XHTML in a blog post. Older systems tended to either require metadata or not allow it at all. Newer systems, like the del.icio.us bookmarking system, allow optional metadata and don’t enforce any restrictions on how the info is used.
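
To make the layering concrete, here is a minimal sketch of how a consumer might pull a calendar event out of a post body while still treating a plain post as perfectly valid. The class names (`vevent`, `summary`, `dtstart`, `location`) are an assumption modeled on the hCalendar microformat, and `extract_event` is a hypothetical helper, not part of MetaWeblog, Atom, or any existing tool. It also assumes the post body is well-formed XHTML.

```python
import xml.etree.ElementTree as ET

# Field names assumed for illustration, modeled on the hCalendar microformat.
EVENT_FIELDS = ("summary", "dtstart", "location")


def _has_class(element, class_name):
    """True if the element's class attribute contains class_name as a token."""
    return class_name in element.get("class", "").split()


def _field(event, class_name):
    """Return the value of the first descendant marked with class_name."""
    for child in event.iter():
        if _has_class(child, class_name):
            # Prefer a machine-readable title attribute (as on <abbr>),
            # otherwise fall back to the element's visible text.
            return child.get("title") or "".join(child.itertext()).strip()
    return None


def extract_event(post_xhtml):
    """Return a dict describing the first event in the post, or None."""
    # Wrap the body so a post with several top-level elements still parses.
    root = ET.fromstring(f"<div>{post_xhtml}</div>")
    for element in root.iter():
        if _has_class(element, "vevent"):
            return {name: _field(element, name) for name in EVENT_FIELDS}
    return None  # an ordinary post: still valid, just no extra structure


if __name__ == "__main__":
    event_post = (
        '<p>Lunch is on!</p>'
        '<div class="vevent"><span class="summary">Team lunch</span> on '
        '<abbr class="dtstart" title="2005-01-15">January 15</abbr> at '
        '<span class="location">Cafe</span></div>'
    )
    print(extract_event(event_post))
    print(extract_event("<p>Just a regular post.</p>"))
```

The transport layer never needs to know about any of this. The post travels as ordinary markup, and only consumers that care about events look for the extra structure.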

The current term for this free-for-all kind of metadata is folksonomy. And like many of the points of interest within the area called social software, it’s simply a reversal of what “common knowledge” in computer information management says is an unworkable system. Traditional knowledge said that unstructured metadata for categorization would lead to chaos, so you need a strict taxonomy. Del.icio.us has proven that chaos does indeed have its uses: even without a taxonomy the system is still very useful. Traditional knowledge says that any public internet resource needs access control and authentication. Wikis (and Wikipedia in particular) have proven that there are cases where a free-for-all actually increases the rate of production, even after you factor in the time spent reversing graffiti and malicious updates. So what other assumptions can we attempt to reverse, looking for areas where common knowledge tells us some system is “unworkable”?
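
As a rough illustration of how little machinery optional, unrestricted metadata needs, here is a small sketch of a bookmark store with free-form tags. The `BookmarkStore` class and its methods are hypothetical names invented for this example. This is not how del.icio.us is implemented, and it is not its API.

```python
from collections import defaultdict


class BookmarkStore:
    """Bookmarks with optional, free-form tags and no predefined taxonomy."""

    def __init__(self):
        self._urls = set()
        self._by_tag = defaultdict(set)

    def add(self, url, tags=()):
        """Store a URL. Tags are optional and can be any strings at all."""
        self._urls.add(url)
        for tag in tags:
            # The only normalization is case folding; no tag is ever rejected.
            self._by_tag[tag.lower()].add(url)

    def tagged(self, tag):
        """Return the URLs carrying the tag, sorted for stable output."""
        return sorted(self._by_tag.get(tag.lower(), set()))

    def all_urls(self):
        """Return every stored URL, tagged or not."""
        return sorted(self._urls)


if __name__ == "__main__":
    store = BookmarkStore()
    store.add("http://example.com/a", tags=["xhtml", "Semantic"])
    store.add("http://example.com/b", tags=["semantic"])
    store.add("http://example.com/c")  # no metadata, still accepted
    print(store.tagged("semantic"))
    print(store.all_urls())
```

Nothing here validates a tag against a vocabulary. Whatever structure emerges comes from people choosing to reuse each other’s tags.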

Well, how about the application interfaces themselves? Traditional system architecture says that in order to build an application you need a set of cooperating APIs, and that accessing remote resources requires something like SOAP/WSDL or CORBA to provide a description of the interface along with introspection and versioning. But there’s been a lot of resistance to this view of the network, such as the RESTian response to web services. Most recently the JotSpot tool seems to have embraced the idea also. Applications aren’t bits of binary code linking together local libraries into a strict framework of classes and objects; they’re transformations and pointers linking together and unifying resources out on the web. It’s a completely different take on the control vs. flexibility tradeoff than is normally taken. Don’t erect a strict barrier around your application to isolate it from the world. Provide every affordance and convenience to allow data in any form to contribute. I personally expect a lot more work in this direction over the next couple of months. It falls logically in line with a series of movements that have been taking shape and gaining momentum for at least a year.