The Wisdom of Ganesh: December 2007

Friday, December 28, 2007

The Anti-Mac Interface and SOA

SOAP and REST - I must sound like a cracked record.

An area I've started exploring is expressive power. Which paradigm allows us to talk about a domain in a richer way?

Many years ago, usability gurus Don Gentner and Jakob Nielsen wrote a provocatively-titled article called The Anti-Mac Interface. They hastened to explain that they were fans of the Mac, and that their paper was an attempt to explore what the human user interface would be like if each of the Macintosh's design principles was systematically and deliberately violated.

I will quote a couple of paragraphs from that paper here, especially where they talk about the limitations of the friendly point-and-click paradigm.

The see-and-point principle states that users interact with the computer by pointing at the objects they can see on the screen. It's as if we have thrown away a million years of evolution, lost our facility with expressive language, and been reduced to pointing at objects in the immediate environment. Mouse buttons and modifier keys give us a vocabulary equivalent to a few different grunts. We have lost all the power of language, and can no longer talk about objects that are not immediately visible (all files more than one week old), objects that don't exist yet (future messages from my boss), or unknown objects (any guides to restaurants in Boston).

Note three important classes of objects that the "see-and-point" paradigm is unable to cater for:

1. resources that are not immediately visible
2. resources that don't exist yet
3. resources that are unknown

I've deliberately used the term "resources" there. Open-ended question: does REST's insistence on identifiable resources expose it to a shortcoming analogous to the one Gentner and Nielsen identified in GUIs?

Gentner and Nielsen go on to say

If we want to order food in a country where we don't know the language at all, we're forced to go into the kitchen and use a see-and-point interface. With a little understanding of the language, we can point at menus to select our dinner from the dining room. But language allows us to discuss exactly what we would like to eat with the waiter or chef. Similarly, computer interfaces must evolve to let us utilize more of the power of language. Adding language to the interface allows us to use a rich vocabulary and gives us basic linguistic structures such as conditionals. Language lets us refer to objects that are not immediately visible. For example, we could say something like "Notify me if there is a new message from Emily." Note we are not advocating an interface that is based solely on language. Neither does the interface have to understand full natural language. Real expressive power comes from the combination of language, examples, and pointing.

Again, I can't help wondering, does SOAP's free-format style give it greater expressiveness than REST with its deliberately constrained interface? Certainly the composition of services into processes backed up by rules engines provides the ability to formulate complex conditional expressions.

To continue with Gentner and Nielsen's delightful analogy, is homo restus then a caveman who has to point to something visible in his immediate environment and make one of four different grunts to indicate what he means? And is his opposite number a glib-tongued SOAP salesman unconstrained by language?

Interesting.

Thursday, December 27, 2007

No, Really. What is SOAP? What is BPEL?

Sometimes the answers to simple questions are profound.

I have been asking myself a couple of very simple questions for some time now, and although I initially didn't want to accept some of the answers I got, I guess I can't avoid their implications.

What is SOAP? I'm deliberately not looking at any of the SOAP spec documents for the answer. I want to know the value proposition of SOAP.

Similarly, what is BPEL (or WS-BPEL, if you prefer)?

I can think of two quick answers:

1. The value proposition of SOAP is interoperability.
2. The value proposition of BPEL is portability.

To my mind, interoperability is at least an order of magnitude more important than portability.

SOAP is the "wire protocol" in one view of SOA. Within the "cloud" or "SOA Fabric", the only things one sees are SOAP messages flying past. Nobody sees BPEL within the SOA cloud. BPEL is an implementation language for the execution of processes, just like Java is an implementation language for general-purpose business logic. It runs on a node. All interactions between that node and others are through SOAP messages. A BPEL process consumes SOAP-based services and in turn exposes itself as a SOAP service.

So is BPEL really all that important in the scheme of things? Wouldn't any other portable language do? I think I've stumbled upon a candidate language, and it's not Java.

Lately, I've been playing around with the WSO2 Mashup Server, and I'm increasingly getting the excited feeling of a kid who's somehow come into possession of a machine-gun. This is not a toy, folks. Are the adults aware of what this thing can do?

Most people seem to think mashups are a cute visual gimmick. The WSO2 guys themselves don't have the air of people handing out machine-guns. The examples they bundle with the server are classic mashup fare - TomatoTube, in which you take the top-rated movies from Rotten Tomatoes and mash the list up with trailers from YouTube, enabling you to see the trailers of top-rated movies. Very cute and harmless.

But now for the machine-gun bit. The development language offered by Mashup Server is JavaScript (think general-purpose programming language). JavaScript augmented by E4X (think really easy XML manipulation). Mashup Server hides SOAP very effectively, although its interfaces to the outside world are SOAP (SOAP 1.1 and 1.2, also REST/HTTP, but more about that later). SOAP is out there in the cloud. But here, within this processing node, it's just XML representations of data and JavaScript code to process it with, thanks to the why-didn't-I-think-of-it simplicity of E4X. Surely we can do more than mashups with that kind of power...

I've found myself thinking a disturbing thought: If no one sees BPEL in the forest, then does it really exist? What if BPMN-based process modelling tools spat out E4X-enhanced JavaScript code instead of BPEL? Would anyone know or care? Take the output of the modelling tool and drop the file onto a server. The process is ready to run. All external interfaces are SOAP-based, just like with a BPEL implementation. Got any problems with that?

There's more revolutionary potential here than in the Communist Manifesto. Another of the cutesy examples bundled with Mashup Server is a REST service. You can do HTTP GETs, PUTs, POSTs and DELETEs on a simple city resource to manipulate its weather data. Very harmless, but again, the developer has very simple JavaScript-based access to REST services.

So is the WSO2 Mashup Server the one that will bring balance to the Force? A powerful programming language. Laughably easy XML manipulation. Simple access to SOAP services and REST resources. Transparent publication of itself as a service or resource in turn. Isn't this the holy grail of service composition?

Content management, process orchestration, what's in a name? I'm beginning to think BPEL is dead. Its value proposition doesn't stack up anymore. SOAP still makes sense, but not BPEL.

WSO2 Mashup Server seems to be the industry's best-kept secret for now. Keep the safety catch on, and watch where you point that thing.

Why RPC is Evil, Redux

Why am I blogging about this again? I talked about RPC being evil before.

Well, Mark Little referred to my post on Paying the RESTafarians Back in Their Own Coin and went into some detail on RPC and distributed systems in general. I'm sure that Mark (having a PhD in Distributed Systems) understands the fundamental issue with RPC, but since he doesn't make the point obviously enough in his post, it's possible that some of his readers may not get it. So let me illustrate my point with an actual example.

To recap: RPC is evil because it tries to make remote objects look local, and the illusion fails subtly in certain use cases that are quite common, damaging the integrity of the applications involved. When I say "object" here, I am not referring to objects in the OO sense, but anything that exists in memory, even an integer.

When I first learned C programming (in 1986 - man, has it been 21 years already?), I learned that swapping two numbers with a function call wasn't so straightforward. The following naive example doesn't work:

#include <stdio.h>

void swap( int x, int y ); // Declare swap() before main() calls it

int main( void )
{
    int a = 5;
    int b = 6;

    printf( "Before swapping, a = %d and b = %d\n", a, b );

    swap( a, b );

    printf( "After swapping, a = %d and b = %d\n", a, b );

    return 0;
}

void swap( int x, int y )
{
    int temp = x;
    x = y;
    y = temp;
}

The printout of the program will be:

Before swapping, a = 5 and b = 6
After swapping, a = 5 and b = 6

Clearly, the naive approach doesn't work, because C functions are called with pass-by-value semantics. This means that the swap() function is given copies of the two variables a and b. What it swaps are the copies. The original variables a and b in the calling function are unchanged.

The correct way to do the swap in C, it turns out, is to pass the memory addresses of the two variables to the swap() function, which then reaches back to swap the two variables by dereferencing the addresses:

#include <stdio.h>

void swap( int *x_p, int *y_p ); // Declare swap() before main() calls it

int main( void )
{
    int a = 5;
    int b = 6;

    printf( "Before swapping, a = %d and b = %d\n", a, b );

    swap( &a, &b ); // Note: we're passing the addresses of the variables now

    printf( "After swapping, a = %d and b = %d\n", a, b );

    return 0;
}

void swap( int *x_p, int *y_p ) // The function receives the addresses of the variables, not the variables themselves
{
    int temp = *x_p; // The asterisk helps to dereference the address and get at the value held at that location
    *x_p = *y_p;
    *y_p = temp;
}

Now the printout says:

Before swapping, a = 5 and b = 6
After swapping, a = 6 and b = 5

Subtle point: Note that pass-by-value semantics are alive and well, but now we're passing copies of the addresses. That doesn't matter, because when we dereference the addresses, we find we're looking at the original variables (multiple copies of an address will all point to the same memory location).

Now imagine that someone comes in offering to take my swap() function and run it on a remote host. They claim that they can make the two functions (the calling main() function and the swap() function) appear local to each other, so that I don't have to change either of them in any way. Is this claim credible?

This is exactly the promise of RPC, and if you believe the promise, you're in for a great deal of trouble.

What RPC will do is place a "stub" function on the machine running main(), and a "skeleton" on the machine running swap(). To main(), the stub function will appear like the original swap() function, and to swap(), the skeleton will behave like the original main() function. Between the two of them, the stub and skeleton will attempt to hide the presence of the network between main() and swap(). How do they do this?

The stub will quietly dereference the memory addresses being passed by main() and extract the values 5 and 6. Then it will send the values 5 and 6 over the network. The skeleton will create two integer variables in the remote system's memory and populate them with the values 5 and 6, then pass the addresses of these two variables to the swap() function. The swap() function will happily swap the two variables on the remote system, and the function call will return. Our printout will say:

Before swapping, a = 5 and b = 6
After swapping, a = 5 and b = 6

What happened? Didn't we pass the addresses of the variables instead of their values? Why didn't it work?

It didn't work because you cannot make pass-by-value look like pass-by-reference, and that's because memory references are meaningless outside the system they refer to. Think about this a little bit, and how we can solve the problem.

Let's say another person approaches me now, offering to host my swap() function on a remote system, but this time, they make it clear that only pass-by-value will be supported and I will have to deal with it. In this case, I need to make some changes to my main() and swap() functions. I first declare a structure to hold two integers in a separate file as shown below:

/* File commondecl.h */
// Declare structure tag for use elsewhere
struct number_pair_st_tag
{
    int first;
    int second;
};

I then use this common declaration to define a structure in both the main() and swap() functions:

/* File containing main() function */
#include <stdio.h>
#include "commondecl.h"

// Prototype of the swap() function, which may be local or remote
struct number_pair_st_tag swap( int x, int y );

int main( void )
{
    int a = 5;
    int b = 6;

    // Define a structure to hold two variables
    struct number_pair_st_tag returned_pair_st;

    printf( "Before swapping, a = %d and b = %d\n", a, b );

    returned_pair_st = swap( a, b ); // Call by value and receive by value

    // Explicitly assign what is returned to my original variables.
    a = returned_pair_st.first;
    b = returned_pair_st.second;

    printf( "After swapping, a = %d and b = %d\n", a, b );

    return 0;
}

/* File containing swap() function */
#include "commondecl.h"

struct number_pair_st_tag swap( int x, int y )
{
    // Define a structure to hold the swapped pair
    struct number_pair_st_tag swapped_pair_st;

    // The swapping is more direct this time, no need for a temp variable.
    swapped_pair_st.first = y;
    swapped_pair_st.second = x;

    // Explicitly return the swapped pair by value
    return swapped_pair_st;
}

The printout will now read:

Before swapping, a = 5 and b = 6
After swapping, a = 6 and b = 5

The swap worked, and it worked because there was all-round honesty this time about the fact that variables would be passed by value. Yes, I had to re-engineer my code, but now my code will work regardless of whether the two functions are local to each other or remote.

A couple of points:
1. Isn't this wasteful for local calls?
2. Isn't this brittle? There's a shared file "commondecl.h" between the two systems.

Well, yes, of course it's wasteful for local calls. But seriously, how wasteful? We're talking about in-memory copying here, which is generally insignificant compared to the network latency we always incur in the case of remote communications. True, if objects are very large and the copying is done in iterative loops, the performance hit could become significant, but the fundamental issue is that correctness trumps performance. We would rather have systems that work correctly but less efficiently than systems that work incorrectly.

As for brittleness, that's what contracts are for in the SOA world. The "commondecl.h" file is part of the contract between the two systems. In fact, in the general case, we would be declaring both the passed and the returned parameters within such a contract, and we would expect both systems to honour that contract. Coupling systems to a contract rather than to each other is not brittle, it's loose coupling.

I hope these examples make it very clear why RPC in all its forms is evil incarnate. RPC is a dishonest mechanism, because it promises to achieve something that it simply cannot deliver - a convincing illusion that remote objects are local. This is one of the reasons for Martin Fowler's First Law of Distributed Objects ("Don't Distribute your Objects"). The opposite illusion, that all objects are remote (and that therefore, everything passed between them is a copy ("pass by value")) is achievable and honest, but could be marginally inefficient when two objects are local to each other.

System integrity is paramount, and therefore we can only use honest mechanisms, even if their efficiency is suboptimal. Messaging is an honest system, because it explicitly deals in copies ("pass by value"). Distributed systems should therefore be based on a messaging paradigm.

As I said at the beginning of this post, I have delved deeper into why XML-RPC is doubly evil and SOAP-RPC is triply evil here. Hopefully it should be clear now why I'm so much against SOAP-RPC and such a fan of SOAP messaging.

Wednesday, December 26, 2007

Domain-Driven Design Really Quickly

Domain-Driven Design (DDD) by Eric Evans is one of the hottest books going around at the moment in the software development community. I believe it’s a must-read for any serious application designer or architect.

This book is an important synthesis of good design practice evolved over many person-years of experience. At 576 pages, however, this is not exactly light reading, which prompted Abel Avram and Floyd Marinescu of InfoQ to produce "Domain-Driven Design Quickly" in about a hundred pages. While a godsend to the time-poor, this is still a significantly heavy document that would take a person 2 to 4 hours to read and comprehend. What developers need is a more gentle ramp-up into the book. Hence my effort to state the gist of the latter book in one page. I call it (what else?) Domain-Driven Design Really Quickly.

I don’t pretend that this one-page summary is a substitute for reading either book. I suggest reading this page first, then reading Abel and Floyd’s 100-pager, and then engaging Eric’s book itself. Each iteration will hopefully prepare you for the next without overwhelming your mind at any stage.

I apologise if I've omitted something really important in the process or put words into any author’s mouth that he didn’t intend. Let me know of any errors by leaving a comment here, and I'll put out a revised version.

There's also the DDD book by Jimmy Nilsson (Applying Domain-Driven Design and Patterns) which I recently bought but haven't read yet. Stay tuned and I'll post my thoughts on this book as well.

My (unsubstantiated) Snapshot View of SOAP/WS-* and REST

At the moment, I haven't got the links to back me up, but I thought it's important to capture the thought before it escapes me.

I think SOAP/WS-* is an ambitious 100% vision for SOA that is 60% implemented.
And I think REST is a pragmatic 80% vision for SOA that is more than 90% implemented.

That model then explains to me the mutual contempt of both camps. The SOAP/WS-* group criticises REST for an overly simplistic model that does not cover a number of use cases - notably Quality of Service specification (security, reliability, transactions), policy-driven service interactions, process definition, exploitation of "industrial strength" message queue infrastructure, etc. That's because the SOAP/WS-* brief is all-encompassing. They've bitten off a lot.

The REST group criticises SOAP/WS-* for exactly that. They believe that SOAP/WS-* has bitten off more than it can chew. REST has taken a modest bite of the SOA problem, and has produced tangible results. What has SOAP/WS-* delivered except complexity?

A year ago, the winner of this argument would have been REST. But the SOAP/WS-* camp has been steadily chewing away at what they've bitten off. And delivering, bit by bit. Two words - Tango and WSO2. SOAP/WS-* complex? Vapourware? Not anymore.

Now it's the REST camp's turn to be defensive about their incompleteness of vision. We're beginning to hear mumblings about WADL, HTTPR and message-level security. I detect an air of defensiveness in talk about transactions and process definition. My long years of experience dealing with commercial vendors tell me that when someone questions the need for a capability, it usually means their product doesn't have it (yet). (Once they have the feature, it of course becomes indispensable and any rival product without it isn't serious competition.) Many of the RESTafarian arguments against the "unnecessary" features of SOAP/WS-* give me a feeling of déjà vu.

And oh, while REST is ahead of SOAP/WS-* on implementation, it's still not at 100%. Do let me know, for example, when PUT and DELETE are finally implemented by browsers. Ouch.

I'm not gloating that a bit of the REST smugness has been punctured (OK, just a little bit). I'm happy that the two worlds are improving their levels of maturity. In 2008, I'll be looking for REST to expand their vision, and for SOAP/WS-* to deliver on their vision.

Tuesday, December 18, 2007

What ESB Is and Isn't (A Quick Mnemonic)

Even if you remember nothing else about ESB (Enterprise Service Bus) from my earlier rant, this should be sufficient:

ESB != Expensive Shrinkwrapped Broker
ESB = Endpoints (SOAP & BPEL)

As my old friend Jim Webber likes to say, the SOAP message _is_ the Bus. It's a piece of Zen-like wisdom, and it takes a while to get one's head around it, but once you "get" it, you'll never fall for vendorspeak again. I am so over centralised brokers calling themselves ESBs.

Thursday, December 13, 2007

Paying the RESTafarians Back in Their Own Coin

Though I like REST and consider it a very elegant model for SOA, it's a little tiresome to hear day in and day out that it's so much more elegant than the SOAP-based Web Services model. In fact, I'm getting so tired of this shrill posturing that I'm going to stick it to the RESTafarians right now, in their own style. Watch.

REST exploits the features of that most scalable application there is - the web, right? REST doesn't fight the features of the web, it works with them, right? HTTP is not just a transport protocol, it's an application protocol, right? URIs are the most natural way to identify resources, HTTP's standard verbs are the most natural way to expose what can be done with resources, and hyperlinks are the most natural way to tie all these together to expose the whole system as a State Machine, right??

Great. I'd like to introduce you to something far older, more basic and more extensible than your web with its URIs, standard verbs and hyperlinks. You may have heard of the Internet. That's right, The Internet. Do you consider the Internet to be scalable, resilient, supportive of innovation? Yes? Good.

This may sound familiar, but the trick, dear friends, is not to fight the Internet model but to work with it. And what is this Internet model? In a sentence, it's about message passing between nodes using a network that is smart enough to provide resilience, but nothing more. This means that every higher-order piece of logic must be implemented by the nodes themselves. In other words, the Internet philosophy is "Dumb network, smart endpoints" (remembering of course, that the network isn't really "dumb", just deliberately constrained in its smartness to do nothing more than resilient packet routing). And the protocol used by the Internet for packet routing is, of course, Internet Protocol (IP).

Voila! We've decentralised innovation! Need a new capability? Just create your own end-to-end protocol and communicate between nodes (endpoints) using IP packets to carry your message payload. You don't need to consult or negotiate with anyone "in the middle". There is no middle. The Internet is not middleware, it's endpointware. (Thanks to Jim Webber for that term :-)

What's TCP? A connection-oriented protocol that ensures reliable delivery of packets in the right order. Did the Internet's designers put extra smarts into the network to provide this higher quality of service? Heck no, they designed an end-to-end protocol and called it TCP. What does TCP look like on the wire? Dunno - they're just IP packets whizzing around like always. TCP packets are wrapped inside IP packets. TCP is interpreted at the endpoints. That's why the networking software on computers is called the TCP stack. Each computer connected to an IP network is an endpoint, identified by nothing more than - that's right - an IP address. Processing of what's inside an IP message takes place at these endpoints. The Internet (that is, the IP network) is completely unaware of concepts such as sockets and port numbers. Those are concepts created and used by a protocol above it. The extensibility and layered architecture of the Internet enable the operation of a more sophisticated networking abstraction without in any way tampering with the fundamental resilient packet-routing capability of the core Internet.

Time for a more dramatic example. What's IPSec? An encrypted end-to-end channel between two nodes that may be routed through any number of intermediary nodes. Wow, that must have required an overhaul to the network, right? Nope, they just created another end-to-end protocol called ESP (Encapsulating Security Payload) and endpoints that understood it. Then they "layered" it between TCP and IP. Can they do that? Of course! The Internet's architecture doesn't in any way force TCP to sit directly atop IP. The "next protocol" attribute of every packet in the "TCP/IP suite" allows virtually any protocol to be layered on top of any other. So TCP packets get wrapped inside ESP packets that get wrapped inside IP packets. At the receiving endpoint, when the IP envelope is examined, there's a little note there to say that the "next protocol" is ESP, not TCP. So an ESP processing layer is given the payload of the IP packet. When ESP is through with its decryption, it hands its payload to the "next protocol", TCP. And life goes on. Any higher-level protocol that runs above TCP will be none the wiser. Is this cool or what? And with all this smart processing going on at the endpoints, what do we see on the wire? Just IP packets whizzing around as usual.

Since you RESTafarians are so forthcoming about how the Web is superior to other forms of distributed computing, let me tell you a little bit about how the Internet is superior to other distributed computing platforms. Look at the old Telco network, called POTS (Plain Old Telephone Service), the philosophical antithesis of the Internet - dumb endpoints, smart network. The telephone handsets provided by the Telcos are traditionally dumb, just dialtone-capable. All the smarts are in the network (the "cloud"). All the messagebanks, the call waiting, call forwarding, teleconferencing, everything is taken care of by the smart network. The Telcos claim that the model is least disruptive because they can upgrade capability without having to upgrade millions of handsets.

Is that so? Then how come the Internet has beaten the Telco network even in telephony? Made an overseas call using Skype lately? How much did it cost you? Did you use a webcam? Do the Telcos offer anything equivalent? I rest my case.

The Internet, with its "smart endpoints, dumb network" approach, has comprehensively beaten the Telco network with its "smart network, dumb endpoints" philosophy. It's a fact in plain view that cannot be denied. When innovation is decentralised, it flourishes. The Internet is a platform for decentralised innovation. (Heck, the very web that you RESTafarians wave in people's faces is an example of the innovation that the Internet enables. HTTP is an end-to-end protocol, remember?) You don't fight such a model. You work with it.

Now what does all this have to do with SOAP and Web Services? Everything.

Remember that when we talk about SOAP today, we're talking about SOAP messaging, not SOAP-RPC. Forget that SOAP-RPC ever existed. It was something invented by evil axe murderers and decent SOA practitioners had nothing to do with it ;-).

And another subtle point to remember is that when we say "SOAP message", we always mean "SOAP message with WS-Addressing headers". Why is this important? Because a SOAP message with WS-Addressing headers is an independently-routable unit, just like that other independently-routable unit, the IP packet. Begin to see the parallels?

Now imagine a messaging infrastructure that knows how to route SOAP messages and nothing more. We will deliberately stunt the IQ of this infrastructure and prevent it from growing beyond this capability. How do we innovate and build up higher levels of capability? Why, follow the Internet model, of course. Create protocols for each of them and embed all their messages into the SOAP message itself (specifically into the SOAP headers). There's no need to layer them as in the TCP stack, because reliable delivery, security, transactions, etc., are all orthogonal concerns. They can all co-exist at the same level within the SOAP header block - headers like WS-ReliableMessaging, WS-Trust, WS-SecureConversation, WS-Security, WS-AtomicTransaction, WS-BusinessActivity, etc.

Now we have the plumbing required for a loosely-coupled service ecosystem. This isn't by any means the totality of what we require for SOA. At the very least, there's still the domain layer that sits on top of all this elegant message-oriented plumbing. So now define document contracts using an XML schema definition language and a vocabulary of your choice (for Banking, Insurance or Airlines), stick those conforming XML documents into the body of the SOAP messages we've been talking about, and ensure that your message producers and consumers have dependencies only on those document contracts. Ta-da! Loosely coupled components! Service-Oriented Architecture! (Note that just one language - XML - is being used to define both a core application wire protocol (SOAP/WS-*) and one or more application domain protocols (the document payloads). What does 'X' stand for in XML again?)

Now all we need is a bit of metadata to describe and guide access to these services (WSDL and WS-*Policy - some might say SSDL) and a recursive technique to create composite services (WS-BPEL - some might say SSDL again) and we're more or less done.

Sounds good, right? In fact, it sounds more than just good. This is revolutionary! But why haven't we seen this wonderful vision unfold?

A reality check. (And this is no apology or reason for RESTafarians to crow. SOAP-based Web Services technology is not a tired model about to be pushed aside by a simpler and more elegant rival. It's a sleeping tiger that is now stirring.) The reasons we don't enjoy the brave new world of SOAP-based Web Services today are:

(1) Those axe murderers we talked about earlier misled the world with SOAP-RPC for a few years, and SOA practitioners are still being detoxified. (There's a school of thought that says the use of WSDL implies RPC even today, but that has more to do with the entrenched (and entirely unjustified) exposure of domain object methods as "services". See my later post on the Viewpoint Flip which is my recommended technique to service-enable a domain model. At a syntactic level, the wrapped document/literal style also distances SOAP usage from blatant RPC encoding.)

(2) Another group of petty criminals talked up SOAP's transport-independence but only wrote HTTP bindings for it, needlessly making the SOAP visionaries look silly. (I think when AMQP arrives - in its fittingly asynchronous way, 20-odd years after TCP and IP - we'll begin to see TCP and AMQP as the synchronous and asynchronous transports that synchronous and asynchronous SOAP message exchange patterns should most naturally bind to. There's also the view that the sync/async dichotomy is artificial, and it is correlation which is key. Be that as it may, I don't see a need for SOAP to bind to HTTP at all, except for the unrelated constraint described next.)

(3) There's the pesky firewall issue that causes weak people to surrender to HTTP expediency, but I think that's another thing we will sort out in time. I think of SOAP as standing for "SOA Protocol". It's an application protocol in its own right, and it sits above infrastructure protocols like TCP, UDP and the coming AMQP. We need to define a port for SOAP itself, and get the firewalls to open up that port. No more flying HTTP Airlines just because we like the Port 80 decor. We are SOAPerman and we can ourselves fly ;-).

(4) Vendors of centralised message broker software have been pretending to embrace the SOAP/WS-* vision ("We're on the standards committees!") but in reality they're still peddling middleware, not endpointware. There's a whole flurry of vendor-driven "marketectures" intended purely to protect their centralised broker cash cows from being slaughtered by a slew (pun intended) of cheap, decentralised SOAP-capable nodes. Another round of detoxification needs to happen before people begin to understand that ESB is not an expensive broker you plonk in the middle of your network but a set of capabilities that magically becomes available when you have fully WS-* capable SOAP endpoints. This understanding will dawn, and soon.

(5) As long as commercial licensing of SOA products remains in vogue, deployments will tend to be centralised and will fail to exploit the fundamental Internet-style architecture of SOAP/WS-*. Who wants to pay for multiple instances of software when a single one will do? SOAP and WS-* have only made endpointware architecturally feasible. Open Source implementations will make it economically viable. Three projects to watch: ServiceMix, Mule and WSO2.

(6) The WS-* standards that define security and reliable messaging only recently arrived, so implementations are still being rolled out. But hey, they're here now. Take a look at any Tango-.NET interop demo and you'll be wowed by the policy-driven magic that happens.

I've put together a couple of diagrams that explain these basic concepts. This one illustrates the parallels between the SOAP/WS-* stack and the TCP/IP stack. This one shows my classification of the various SOA approaches in vogue today, with my biases clearly visible :-).

In short, I think there are only two valid approaches to building Service-Oriented Architectures today - (1) REST and (2) SOAP-messaging-using-Smart-Endpoints (and I hate both the SOAP-RPC axe murderers and the Centralised ESB marketing sharks for forcing me to use such a long qualifier to explain exactly what I mean).

And if all this didn't finally turn out to be the poke in the snoot of the RESTafarians that I intended, that's because I like REST. I just don't accept that REST is in any way superior to SOAP-messaging-using-Smart-Endpoints. As I hope I've explained in this lo-o-ong post, the latter's based on an equally elegant model - the Internet itself. As we address the essentially implementation-related deficiencies of the technology, SOAP-messaging-using-Smart-Endpoints will be seen as an equally elegant and lightweight approach to SOA.

Most importantly perhaps, the REST crowd will finally shut up.

Wednesday, December 12, 2007

Why Is REST (Seen To Be) Simpler Than SOAP-Based Web Services?

I had these points under another post earlier, but they really deserve their own post.

I really don't believe there is a huge difference between SOAP and REST or that there is a "war" between them. Yes, REST has been much simpler and more elegant than SOAP-based Web Services so far, but why do you think that's so? This is a multiple-choice question.

(a) There's still some basic confusion in the REST camp and elsewhere between SOAP messaging (the modern model of SOAP) and SOAP-RPC (the outdated view). What RESTafarians attack is the now-universally condemned SOAP-RPC model.
(b) The scope of SOAP/BPEL being much more than that of REST (Message Exchange Patterns, Qualities of Service, Processes, to name a few), the WS-* standards are understandably richer and more complex. As REST begins to move beyond the low-hanging fruit it currently targets, it'll get complex too.
(c) The vision of SOAP messaging hasn't yet been completely translated into reality - e.g., SOAP is nominally transport-independent but still has no transport bindings defined other than HTTP, and WSDL imposes the unnecessary abstraction of "operation" rather than "message" as the basic primitive. Rumblings around AMQP and SSDL provide hope that the vision may yet find more optimal implementations.
(d) WS-* Standards bodies are political menageries. The spec writers have made things more complex than they need to be. Future iterations will shake out the bad specs.
(e) Things are actually settling down in the WS-* world. The core specs are done and dusted (WS-Addressing, WS-Security, WS-SecureConversation, WS-Trust, WS-ReliableMessaging). Implementations are much more interoperable. Tooling has improved. Policy-based configuration of services is becoming a reality (WS-PolicyFramework, WS-SecurityPolicy, WS-ReliableMessagingPolicy). A Tango demo will blow you away.
(f) All of the above.

I believe the answer is (f), which means that the REST folk can stop crowing. There's a place in this world for SOAP (SOAP messaging, that is), and it's right up there alongside REST. Neither is "superior". (In fact, both have areas for improvement, but I'll go into that in another post.)

Forget all the issues around complexity. I believe that in time, complexity will cease to be a differentiating factor between WS-* and REST. What will be left are two equally valid ways of looking at the world, and SOA practitioners will choose the view that is appropriate for the task at hand.

"Seven Fallacies of BPM" a Must-Read

An excellent post over at InfoQ. Jean-Jacques Dubray makes some very valid points about BPM. I agree with what he says, but my concern is less about the gaps between BPMN and BPEL and more about the duality of the REST (Resource) view and the SOAP/BPEL (Process) view. My thoughts began to go in that direction once more.

I've created a diagram to show how SOAP/BPEL and REST are two views of the same system.

If you don't agree with my last statement, what do you think of these two sentences: "The blacksmith hammers a piece of metal" and "The piece of metal is flattened into a plate"? Most people would agree they are both describing the same situation, one from the viewpoint of the "actor" (the blacksmith) using active verbs, and the other from the viewpoint of the iron itself using passive verbs. The first is operation-oriented and describes a process (SOAP/BPEL), the second is resource-oriented and describes a lifecycle (REST). Thanks to Jean-Jacques Dubray for nudging me towards that insight.
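The two sentences can be put side by side in a few lines of code. What follows is only a rough sketch, in Python for brevity; the names MetalPiece, Blacksmith, hammer and put_state are invented for illustration and come from no SOAP, BPEL or REST toolkit. The point it tries to show is that invoking an operation on an actor and setting the state of a resource can be two descriptions of the same change.

```python
from dataclasses import dataclass, field


@dataclass
class MetalPiece:
    """The resource: a piece of metal with a lifecycle (hypothetical model)."""
    state: str = "raw"
    history: list = field(default_factory=list)

    def put_state(self, new_state: str) -> None:
        # Resource view: the caller asks for the resource to be in a new state.
        self.history.append((self.state, new_state))
        self.state = new_state


class Blacksmith:
    """The actor: exposes a verb (an operation) rather than a resource."""

    def hammer(self, piece: MetalPiece) -> None:
        # Process view: the operation causes the very same state transition.
        piece.put_state("flattened")


# Process-oriented: "The blacksmith hammers a piece of metal."
piece_a = MetalPiece()
Blacksmith().hammer(piece_a)

# Resource-oriented: "The piece of metal is flattened into a plate."
piece_b = MetalPiece()
piece_b.put_state("flattened")

# Both descriptions leave the system in the same state.
print(piece_a.state == piece_b.state)
```

Which of the two interfaces to expose is then a question of viewpoint, not of what the system actually does.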

"XML" is a First Class DataType in PHP Too

Looks like Java is the odd man out, after all. It's not just JavaScript that has a simple and powerful way to manipulate XML (E4X). I just realised that PHP has such an extension too. It's called SimpleXML. Granted, it's not as powerful as E4X (no arbitrary depth double-dot operator, no attribute filtering, no wildcard asterisk operator), but it still provides enough simplicity and power to make XML processing relatively painless.

Rajat is right. Java needs a datatype called "java.lang.XML" very soon, otherwise it may soon be abandoned by SOA practitioners in favour of PHP (or even server-side JavaScript, which the WSO2 Mashup Server folks are using with no apparent ill-effects).

Speaking of the WSO2 Mashup Server, those folk over at the community site virtually fall over each other to help newbies. I knew that Open Source communities were friendly, but these guys are something else. Thanks for all your help, Keith and Jonathan. I'm in shock and awe both at your enthusiastic support and the blistering pace of WSO2 development in general.

Sunday, December 09, 2007

"XML" Should be a First Class DataType in Java, or My E4X Epiphany

I've been playing with E4X for a few days now. I never realised that manipulating XML could be so easy. No heavyweight API like JAXB or its marginally simpler cousin JDOM. E4X is just so simple and natural, with XPath-like expressions being part of the basic syntax. A pity Java has nothing like this.

When I mentioned this today to my friend and frequent co-author Rajat Taneja, he had, as usual, just one comment to make - a statement of astonishingly simple insight. "Java needs to have XML as a First Class DataType," he said. "You should be able to declare a variable as being of type 'XML', then set it to an XML document or InputStream, and it should even be able to validate itself against the schema that the document references."

I know that Java 6 has vastly improved support for XML processing, but that's by bringing JAXB into the language. It's not the same thing. What Rajat says is required is a "java.lang.XML" datatype. Without it, Java just cannot cut it in the brave new world of SOA. Strong words, I know, but my recent experience with E4X has convinced me that XML manipulation has to be dead easy, because the centre of gravity of the software industry has shifted away from implementation languages like Java and C# and towards representational languages like XML (italics represent my own terminology).

If you play around a bit with WSO2 Mashup Server, you'll see what I mean.

If BPEL is the language to orchestrate verbs (SOAP operations), I would put my money on E4X as the language to aggregate nouns (XML documents that are REST resource representations or part of SOAP responses). I believe that this is the destiny of portal servers too. They must turn into mashup servers to survive, and I think a language like E4X is what they need to aggregate content before converting it into a presentation format. The latter function is the mainstay of XSLT, but so far, we haven't had a clear candidate for the former, no universally accepted Content Aggregation Language.

Until now, that is.

I'm pretty sure that E4X isn't going to be the ultimate tool for XML processing, but it has broken new ground and represents the next generation in XML manipulation capability. It's exciting to imagine what new technologies and tools will follow in its wake.

What we need now is an architecture to guide this new technology through its infancy - a Service-Oriented Content Aggregation Architecture. Give Rajat some time :-).

Cooking with Leftovers - The Current Portal Model

Sometimes you just need the right analogy to describe a technology, and I believe I have found the one for portal technology.

Portals are sold as a lightweight integration capability. The vendor term for it is "integration at the glass", which sounds nice but doesn't explain much. I have a more descriptive phrase now, if less flattering. I think the portal model of application integration (or aggregation, if you don't want to dignify what portals do with a term that implies a degree of robustness) is more like cooking with leftovers.

Think about it. Here are two or more independent web applications that produce some HTML (or at any rate some presentation markup), and it falls to the portal to pull them together into a single web page to make them appear like part of a larger, composite application. To my mind now, it appears that this enterprise is doomed from the start.

The first reason is that we cannot adopt this model of aggregation for any existing (read: standalone) web applications. The applications need to be specially built to work within a portal environment. They need to understand that they are not standalone applications but are part of a larger environment that they may share with others like themselves, although what that larger environment or those other applications may be, they must be entirely ignorant of.

This is actually quite a big ask. The most pressing need for lightweight aggregation of the portal kind is to tie together existing web applications, and this is something that portals cannot do, because they require their constituent applications to be written to the portal model, i.e., to be portlets. Portlets do not produce full-fledged web pages, only fragments of HTML such as <div> or <span> elements. They must also conform to the portal event model and be able to respond to "render" and "update" ("processAction") commands.
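The constraint is easier to see in miniature. The sketch below is a loose analogy in Python, not the real Java Portlet API: the Portlet protocol, its process_action and render methods, and the two sample portlets are all hypothetical stand-ins. It only illustrates the shape of the contract: each component handles an action, emits a fragment, and leaves the page itself to the portal.

```python
from typing import Iterable, Protocol


class Portlet(Protocol):
    """Hypothetical contract every constituent application must be written to."""

    def process_action(self, params: dict) -> None: ...

    def render(self) -> str: ...


class StockPortlet:
    def __init__(self) -> None:
        self.symbol = "ACME"

    def process_action(self, params: dict) -> None:
        # "Update" phase: change state in response to a user action.
        self.symbol = params.get("symbol", self.symbol)

    def render(self) -> str:
        # "Render" phase: emit a fragment only, never a full page.
        return f"<div class='portlet'>Quote for {self.symbol}</div>"


class NewsPortlet:
    def process_action(self, params: dict) -> None:
        pass  # This portlet has no state to update.

    def render(self) -> str:
        return "<div class='portlet'>3 new headlines</div>"


def render_page(portlets: Iterable[Portlet]) -> str:
    # The portal, not the portlets, owns the surrounding page.
    body = "".join(p.render() for p in portlets)
    return f"<html><body>{body}</body></html>"


stock = StockPortlet()
stock.process_action({"symbol": "WSO2"})
page = render_page([stock, NewsPortlet()])
print(page)
```

An existing standalone web application satisfies none of this: it returns whole pages and knows nothing of the two-phase contract, which is why it cannot simply be dropped into a portal.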

The second reason why I now believe the portal model has been doomed from the start is that all we need is a better model to come along, and the portal model will be exposed for what it is - a clumsy attempt to hitch an application wagon to two (or more) independent HTML horses that have already bolted from the content stable (Oh man, am I pleased with myself about that analogy :-).

If we're going to have to develop our constituent applications afresh using a new paradigm, and a more elegant paradigm comes along that allows us to aggregate diverse bits of content before they are cooked into a presentation format, then wouldn't we much rather adopt that paradigm? By "content", I'm referring of course to the output of a SOA-style Service Tier, a non-visual representation of application state.

In other words, who would want to cook with leftovers from previously cooked dishes when they can have access to fresh ingredients and complete freedom to mash them up in any way they choose?

Wait a minute! Did I say "mash them up"? Because that's exactly the paradigm we have before us today - content aggregation through mashup technology.
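As a rough illustration of aggregating content before it is cooked, here is a minimal sketch in Python using the standard library's ElementTree. The two XML documents, the element names and the aggregate and render_html functions are all made up for the example; a real mashup server would fetch the content from actual services and might well use E4X instead.

```python
import xml.etree.ElementTree as ET

# Hypothetical Service Tier output: non-visual XML, not HTML.
CUSTOMER_XML = "<customer id='42'><name>Jane Doe</name></customer>"
ORDERS_XML = (
    "<orders>"
    "<order id='1'><total>19.50</total></order>"
    "<order id='2'><total>5.25</total></order>"
    "</orders>"
)


def aggregate(customer_xml: str, orders_xml: str) -> ET.Element:
    """Mash two content documents into one, still free of presentation."""
    customer = ET.fromstring(customer_xml)
    orders = ET.fromstring(orders_xml).findall("order")
    summary = ET.Element("customerSummary", {"id": customer.get("id", "")})
    ET.SubElement(summary, "name").text = customer.findtext("name")
    ET.SubElement(summary, "orderCount").text = str(len(orders))
    total = sum(float(o.findtext("total")) for o in orders)
    ET.SubElement(summary, "orderTotal").text = f"{total:.2f}"
    return summary


def render_html(summary: ET.Element) -> str:
    """Only at this last step is the content turned into presentation markup."""
    return (
        f"<div><h2>{summary.findtext('name')}</h2>"
        f"<p>{summary.findtext('orderCount')} orders, "
        f"total {summary.findtext('orderTotal')}</p></div>"
    )


summary = aggregate(CUSTOMER_XML, ORDERS_XML)
print(ET.tostring(summary, encoding="unicode"))
print(render_html(summary))
```

Because the aggregation works on fresh ingredients, the same summary could just as easily be rendered as a different page, a feed or a chart, which is precisely what a portal stitching together pre-rendered HTML fragments cannot offer.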

I'll be recording my thoughts on mashup technology here in the weeks to come, but for now, let me just predict that if portal servers do not offer mashup capability very soon (in addition to supporting legacy portlets), they will lose out to pure-play mashup servers.