Sunday, December 09, 2007

"XML" Should be a First Class DataType in Java, or My E4X Epiphany

I've been playing with E4X for a few days now. I never realised that manipulating XML could be so easy. No heavyweight API like JAXB or its marginally simpler cousin JDOM. E4X is just so simple and natural, with XPath-like expressions being part of the basic syntax. A pity Java has nothing like this.

When I mentioned this today to my friend and frequent co-author Rajat Taneja, he had, as usual, just one comment to make, - a statement of astonishingly simple insight. "Java needs to have XML as a First Class DataType," he said, "You should be able to declare a variable as being of type "XML", then set it to an XML document or InputStream and it should even be able to validate itself against the schema that the document references."

I know that Java 6 has vastly improved support for XML processing, but that's by bringing JAXB into the language. It's not the same thing. What Rajat says is required is a "java.lang.XML" datatype. Without it, Java just cannot cut it in the brave new world of SOA. Strong words, I know, but my recent experience with E4X has convinced me that XML manipulation has to be dead easy, because the centre of gravity of the software industry has shifted away from implementation languages like Java and C# and towards representational languages like XML (italics represent my own terminology).

If you play around a bit with WSO2 Mashup Server, you'll see what I mean.

If BPEL is the language to orchestrate verbs (SOAP operations), I would put my money on E4X as the language to aggregate nouns (XML documents that are REST resource representations or part of SOAP responses). I believe that this is the destiny of portal servers too. They must turn into mashup servers to survive, and I think a language like E4X is what they need to aggregate content before converting it into a presentation format. The latter function is the mainstay of XSLT, but so far, we haven't had a clear candidate for the former, no universally accepted Content Aggregation Language.

Until now, that is.

I'm pretty sure that E4X isn't going to be the ultimate tool for XML processing, but it has broken new ground and represents the next generation in XML manipulation capability. It's exciting to imagine what new technologies and tools will follow in its wake.

What we need now is an architecture to guide this new technology through its infancy - a Service-Oriented Content Aggregation Architecture. Give Rajat some time :-).

1 comment:

Keith Chapman said...

The E4X quick start guide which can be found at http://wso2.org/project/mashup/0.2/docs/index.html
has a nice little comparison on some XPath expressions and there E4X equivalent. It also shows how XML can be navigated with utmost ease.