Tuesday, November 25, 2008

JSON Schema is a Game-Changer

I have just become aware of a proposal that could change my opinion of JSON and of XML, along with a number of other positions I have held.

In the paper I co-authored on SOFEA, we were emphatic that JSON could not cut it as a format for Data Interchange because it lacked sufficient rigour to enforce service contracts. One of the main points behind a *Service-Oriented* Front-End Architecture was the ability to connect seamlessly to services, and services (by definition) need to have formal contracts. A front-end that doesn't respect data becomes a weak link in the end-to-end chain of data integrity and defeats a major goal of SOA.

With regard to data, we need to be able to specify three things - data types, data structures and data constraints (rules). JSON has very loose data types. It does support hierarchical data structures, but doesn't enforce data constraints. XML, in contrast, supplies all three, making it a superior choice.

I will freely admit that our choice of XML over JSON was not made without regret. JSON is far simpler to work with than XML, and one of our goals with SOFEA has been simplicity. We had to give a reluctant thumbs-down to JSON only because of its lack of rigour.

But now at last, it appears that our requirement for rigorous contract definition and enforcement is being addressed for JSON as well, in the form of the JSON Schema proposal from Kris Zyp.
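To make the three requirements (types, structures, constraints) concrete, here is a minimal sketch in Python of what schema-based checking of JSON data could look like. It is not the proposal's own API or a complete implementation. The `validate` function and `ORDER_SCHEMA` are hypothetical names of my own, and the keywords (`type`, `properties`, `required`, `minimum`, `maximum`) are borrowed from later published JSON Schema drafts, so they may differ in detail from the proposal as it stood at the time.

```python
import json

# Maps schema type names to Python types (only a small subset is covered).
TYPES = {
    "object": dict,
    "array": list,
    "string": str,
    "number": (int, float),
    "integer": int,
    "boolean": bool,
}


def validate(value, schema, path="$"):
    """Return a list of error strings; an empty list means the value conforms."""
    expected = schema.get("type")
    if expected is not None:
        ok = isinstance(value, TYPES[expected])
        # bool is a subclass of int in Python, so reject it for numeric types.
        if expected in ("number", "integer") and isinstance(value, bool):
            ok = False
        if not ok:
            return [f"{path}: expected {expected}"]

    errors = []

    # Data constraints: numeric range checks.
    if isinstance(value, (int, float)) and not isinstance(value, bool):
        if "minimum" in schema and value < schema["minimum"]:
            errors.append(f"{path}: {value} is below minimum {schema['minimum']}")
        if "maximum" in schema and value > schema["maximum"]:
            errors.append(f"{path}: {value} is above maximum {schema['maximum']}")

    # Data structure: required properties and nested schemas.
    if isinstance(value, dict):
        for name in schema.get("required", []):
            if name not in value:
                errors.append(f"{path}.{name}: missing required property")
        for name, subschema in schema.get("properties", {}).items():
            if name in value:
                errors.extend(validate(value[name], subschema, f"{path}.{name}"))

    return errors


# Hypothetical service contract for an order message.
ORDER_SCHEMA = {
    "type": "object",
    "required": ["id", "quantity"],
    "properties": {
        "id": {"type": "string"},
        "quantity": {"type": "integer", "minimum": 1, "maximum": 100},
    },
}

good = json.loads('{"id": "A-17", "quantity": 3}')
bad = json.loads('{"id": 17, "quantity": 0}')

print(validate(good, ORDER_SCHEMA))  # no errors
print(validate(bad, ORDER_SCHEMA))   # one type error, one range error
```

The point of the sketch is that the contract itself is plain JSON-shaped data, so a front-end and a service could both check messages against the same definition.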

I used to make a distinction between SOFEA and other similar approaches such as SOUI and TSA (Thin Server Architecture) based on this one aspect of rigorous contracts around data. I said at the time that better XML tooling would blunt JSON's edge in ease of use, but the opposite has happened. Better schema definition in JSON has instead blunted XML's edge in rigour. If JSON Schema becomes a reality, the distinction between SOFEA and its various cousins dissolves, and SOFEA will no longer be an XML-only architecture. All these architectures will be essentially the same.

Looking beyond SOFEA, I see JSON Schema as having very big implications for SOA itself. In an extreme scenario, the need for XML itself goes away! If we can define data rigorously and move it around in a structure that verifiably conforms to that definition, then our requirement is satisfied. XML may end up being seen as the EJB of data structures - clunky, unwieldy, intrusive, and ultimately replaced by a Spring-like lightweight rival that sacrifices nothing by way of rigour.

This is a development that definitely bears watching. There is a JSON Schema Google Group that is fairly active, and anyone with an interest in contributing should probably join this group.

Wednesday, November 12, 2008

Google Flu Trends

Now here's a development that's both heartening and disturbing.

Google Flu Trends is a new tool from the philanthropic foundation Google.org.

The idea is simple but revolutionary. Most statistics about epidemics are trailing indicators, i.e., they collect and organise data after events have happened. Google Flu Trends is about collecting and organising data as searches take place. The idea is that people will do Google searches on terms that affect them at the moment. So searches on "flu" will tend to rise when influenza is doing the rounds, "hayfever" searches will rise when hayfever season hits, and so on. By tracking where the searches are coming from, Google can provide a real-time (as opposed to a lagging) indicator of where official responses need to be targeted.
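The general idea of counting searches by place and time can be sketched in a few lines of Python. This is only an illustration of the principle, not Google's actual method, which I have no inside knowledge of. The log format, the region names, the `weekly_counts` and `rising_regions` functions and the doubling threshold are all assumptions made up for the example.

```python
from collections import Counter


def weekly_counts(search_log, term):
    """Count searches containing `term`, keyed by (region, week)."""
    counts = Counter()
    for region, week, query in search_log:
        if term in query.lower():
            counts[(region, week)] += 1
    return counts


def rising_regions(counts, week, factor=2.0):
    """Regions whose count in `week` is at least `factor` times the previous week's."""
    regions = {r for (r, w) in counts if w == week}
    return sorted(
        r for r in regions
        # Treat a missing previous week as a baseline of 1 to avoid a zero threshold.
        if counts[(r, week)] >= factor * max(counts[(r, week - 1)], 1)
    )


# Hypothetical, already-anonymised log entries: (region, week number, query).
LOG = [
    ("north", 1, "flu symptoms"),
    ("north", 2, "flu remedies"),
    ("north", 2, "flu shot"),
    ("north", 2, "Flu symptoms"),
    ("south", 1, "flu"),
    ("south", 2, "flu vaccine"),
    ("south", 2, "cricket scores"),
]

counts = weekly_counts(LOG, "flu")
print(rising_regions(counts, week=2))  # regions where "flu" searches at least doubled
```

Only the aggregate counts per region and week are needed for the indicator itself, which is the distinction the privacy discussion below turns on.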

This is a heartening development because it promises a more rapid response to future pandemics like the Asian Bird Flu virus outbreak. The earlier warning and more precise pinpointing of affected areas can speed up intervention, save lives and waste fewer resources.

This is also a profoundly disquieting development in spite of Google's reminders about its privacy policy. What is being used in Google Flu Trends is aggregate data, but it shows that detailed per-user data with location-specificity is available to Google and can conceivably be used for less philanthropic purposes as well.