Wednesday, October 10, 2007

Services, Persistence and a Rich Domain Model

I see so many wrong-headed ideas about SOA and Web Services that I want to scream.

Here are just some:

1. Using the RPC style of SOAP Web Services (Wake up, Rip Van Winkle, it's 2007!)
2. Thinking that buying an Integration Broker or ESB will by itself solve your integration problems (Sure, your shiny telephone lets you dial any international phone number, but does the person at the other end speak your language?)
3. Generating WSDL files and Web Service endpoints automatically from Java classes (So what happens to your service contract when you next change some aspect of your implementation?)

It's the last of these points that I want to talk about in this post.

On the face of it, the problem can be stated in a deceptively simple manner, but there are significant subtleties to it.

The application you are building uses (say) Java (it could be using C# and .NET for all we care). The way we have decided to interact with other systems is through Web Services. That means SOAP messages will be exchanged between our systems. Since the document/literal style of SOAP-based messaging requires an XML document to be embedded within the SOAP body, we can see that the service contract is basically a set of XML documents.

So our problem can be simplistically stated as follows:
How do we convert between Java objects used within our system and XML documents exchanged with other systems?

There are two "obvious" answers to this question - either generate the XML document(s) from your Java classes, or generate Java classes from the XML documents. Java IDEs tend to take the former approach. In fact, many of them go a step further by not only generating XML documents but SOAP endpoints as well, and WSDL definitions of the SOAP endpoints that you can conveniently distribute to your client systems. As I suggested earlier, this approach is wrong because it breaks the service contract (by generating a new WSDL file) whenever the Java classes or method signatures change.
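
To make that fragility concrete, here is a deliberately naive sketch. It is not how any particular IDE or toolkit generates contracts; `NaiveContractGenerator`, `CustomerV1` and `CustomerV2` are hypothetical names invented for illustration. The "contract" is simply derived from field names by reflection, which is the essence of generating a service interface from an implementation.

```java
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical implementation class, before a refactoring.
class CustomerV1 {
    String name;
}

// The same class after a developer renames a field.
class CustomerV2 {
    String fullName;
}

// Hypothetical stand-in for a "generate the contract from the classes" tool.
final class NaiveContractGenerator {
    // Derives the contract's XML element names from the class's field names.
    static List<String> elementNames(Class<?> type) {
        List<String> names = new ArrayList<>();
        for (Field f : type.getDeclaredFields()) {
            if (!f.isSynthetic()) {   // skip compiler-generated fields
                names.add(f.getName());
            }
        }
        Collections.sort(names);      // reflection order is not guaranteed
        return names;
    }
}
```

Generating from `CustomerV1` gives a contract with a `name` element; generating from `CustomerV2` gives `fullName` instead. Nothing was renegotiated with the consumers, yet the contract they coded against has changed.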

There are also a bunch of tools that take the latter approach. Apache's ADB, XMLBeans and the ubiquitous JAXB library that comes bundled with your JDK all try to generate their own versions of a set of Java classes that they believe represent your XML document. Looking at the resultant code can make you lose your appetite very quickly. I'm surprised that so many people gamely continue even after looking at the results of what is very patently a wrong approach.

(This is a blog. Bloggers are meant to be opinionated.)

I believe I know how the problem should really be tackled. Both answers above are wrong. Neither the XML document nor the Java classes should be generated from the other. The Service Contract (represented by the XML document) and the Domain Model (represented by the Java classes) are both First Class Entities. By this, I mean that both are concepts that stand independently. The Service Contract is something that is negotiated between service provider and consumer. It must have no implied dependencies upon how the service is implemented. The Domain Model, on the other hand, represents the designer's deep understanding of the business application and how it really works. This should be independent of how other systems may want to interact with it.

And so my answer to the problem is that we need a third entity, a mapping entity, that sits outside both the Service Contract and the Domain Model and maps them to each other in as flexible and bi-directional a manner as possible. We have such tools today. Their names are Castor and JiBX.

The tragedy of the Java-XML translation space today is not that most practitioners seem to think it is a solved problem. It is. The tragedy is that they think the solution is JAXB. The correct answer is JiBX, and they get no points for getting the letters a bit jumbled. (Think mapping, not code generation.)

Mapped data binding that respects the independence of the two entities it maps is the correct solution approach to Java-XML translation.
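
As a rough illustration of the idea (not of JiBX's or Castor's actual API, which use external binding definitions rather than hand-written code), here is a minimal sketch using only the JDK's DOM parser. The class names `Customer` and `CustomerXmlMapping` and the element names `customer`, `name` and `points` are assumptions made up for the example. The point is that the domain class knows nothing about XML, the XML vocabulary knows nothing about Java, and a third piece of code is the only place where the two meet.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Domain Model: shaped for the business, with no knowledge of XML.
final class Customer {
    private final String fullName;
    private final int loyaltyPoints;

    Customer(String fullName, int loyaltyPoints) {
        this.fullName = fullName;
        this.loyaltyPoints = loyaltyPoints;
    }

    String fullName() { return fullName; }
    int loyaltyPoints() { return loyaltyPoints; }

    // Behaviour lives alongside state in a rich domain model.
    boolean isGoldMember() { return loyaltyPoints >= 1000; }
}

// Mapping entity: the only code that knows both the contract's element
// names and the domain model's accessors.
final class CustomerXmlMapping {
    static String toXml(Customer c) {
        return "<customer><name>" + escape(c.fullName()) + "</name>"
             + "<points>" + c.loyaltyPoints() + "</points></customer>";
    }

    static Customer fromXml(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            Element root = doc.getDocumentElement();
            String name = root.getElementsByTagName("name").item(0).getTextContent();
            String points = root.getElementsByTagName("points").item(0).getTextContent();
            return new Customer(name, Integer.parseInt(points.trim()));
        } catch (Exception e) {
            throw new IllegalArgumentException("not a valid customer document", e);
        }
    }

    // Escapes the characters that are significant in XML element content.
    private static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }
}
```

If `fullName` is later renamed in the domain model, or the contract's `name` element is renegotiated, only `CustomerXmlMapping` has to change. A mapping tool moves this same knowledge out of code and into a binding file, but the separation is the same.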

There is a second reason why mapped data binding works so well, and it has to do with what is called an impedance mismatch. Impedance mismatch is a term borrowed from Electrical Engineering, and refers to the difficulty of translating concepts from one paradigm to another. Classes in Object-Oriented systems and tables in Relational Databases have an impedance mismatch. Classes and XML schemas have an impedance mismatch.

The Object-Relational impedance mismatch has been satisfactorily overcome for a few years now by ORM (Object-Relational Mapping) tools. Hibernate, TopLink JPA and OpenJPA (formerly Kodo) come to mind. You will notice that these tools take an approach similar to the one I suggested for Java-XML translation. They respect both paradigms and can map existing instances of each using mapping files that sit outside both.

So overcoming the Java-XML impedance mismatch is a second good reason to use a mapped data binding approach like JiBX or Castor.

[I'm told that JiBX makes the use of XML as a data interchange mechanism as fast as native RMI. I just love it when the right way to do something is also made attractive.]

My third and final way of looking at the problem is from the traditional Java view of Domain Objects and Data Transfer Objects. Data Transfer Objects have never been considered true objects because they only encapsulate state, not behaviour. However, they serve a useful purpose as a data interchange mechanism. It's considered bad practice (except by the Hibernate crowd and their queer notion of detached entities) to directly expose your Domain Objects to the outside world. You should decouple your implementation from your service interface. This decoupling is done by using Data Transfer Objects (DTOs) as very specialised data structures visible to other systems through the service interface. DTOs are marshalled from Domain Objects and unmarshalled back to Domain Objects during service interactions. This is a paradigm well known to most J2EE developers. Looked at in this way, XML documents used in Web Services are just DTOs.
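
For readers who haven't met the pattern, here is a minimal hand-rolled sketch of it. The names `Account`, `AccountDto` and `AccountDtoMapper` are hypothetical, and the mapper is written by hand purely to show what a mapping tool would otherwise do for you. Note that the Domain Object has behaviour and guards its invariants, while the DTO is a bag of state whose field names and formats are chosen for the service interface.

```java
import java.math.BigDecimal;
import java.util.Locale;

// Domain Object: state plus behaviour, with invariants enforced.
final class Account {
    private final String id;
    private long balanceCents;

    Account(String id, long balanceCents) {
        if (balanceCents < 0) {
            throw new IllegalArgumentException("negative balance");
        }
        this.id = id;
        this.balanceCents = balanceCents;
    }

    void withdraw(long cents) {
        if (cents <= 0 || cents > balanceCents) {
            throw new IllegalArgumentException("invalid withdrawal: " + cents);
        }
        balanceCents -= cents;
    }

    String id() { return id; }
    long balanceCents() { return balanceCents; }
}

// Data Transfer Object: state only, shaped for the service interface.
final class AccountDto implements java.io.Serializable {
    public String accountId;
    public String balance;   // decimal string such as "10.00"
}

// The mapping between the two, kept outside both classes.
final class AccountDtoMapper {
    static AccountDto toDto(Account account) {
        AccountDto dto = new AccountDto();
        dto.accountId = account.id();
        long cents = account.balanceCents();
        dto.balance = String.format(Locale.ROOT, "%d.%02d", cents / 100, cents % 100);
        return dto;
    }

    static Account fromDto(AccountDto dto) {
        long cents = new BigDecimal(dto.balance).movePointRight(2).longValueExact();
        return new Account(dto.accountId, cents);
    }
}
```

Swap `AccountDto` for an XML document and `AccountDtoMapper` for a binding definition, and you have the Web Services case.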

The best technology to marshal and unmarshal Java DTOs is Dozer, just as the best technology to marshal and unmarshal XML DTOs is JiBX.

I see a beautiful symmetry in the way all these tools and technologies inter-relate. At the centre is the rich Domain Model that implements your application. To persist the application's state, you use Hibernate or another ORM tool to map your Domain Objects to relational tables. To talk to remote Java clients, you map your Domain Objects to DTOs using Dozer. To expose services to clients that require a Web Services interface, you use JiBX or Castor to map your Domain Objects to XML documents.

This picture should be worth the thousand words above.


johan andries said...

Very interesting post! What about using Dozer to map between JAXB-generated objects and the rich domain model?

Ganesh Prasad said...


That seems to be a commonly-used pattern. I have heard from other people who do the same. But my question is, why two steps when one will suffice?

If we treat XML documents as DTOs (structures used for Data Interchange), then there is symmetry between Dozer (Java DTO to Java Domain Object) and JiBX (XML DTO to Java Domain Object).

The explanation I received from another person who uses JAXB and Dozer in the way you described is that they were not aware of the fundamental difference between JAXB and JiBX. Perhaps they will use JiBX in future.


Lindsay said...

Whew - I thought I was the only one. This is so simple, why don't people get it?

Simon said...

Great, crystal clear thinking. Domain objects are the foundation and everything else maps from that. Has helped me understand how to implement my GXT DTOs from my domain objects.

minnie said...

well JibX is a good option but its codeGen is restricted with few xml types like xs:Any and xs:AnyAttribute same problem as Castor latest version

Praveen said...

First of all a very good blog. I gather you want the Domain Model to be mapped by JiBX to expose web services... I'm not too sure does that not contradict your original point of don't expose your domain model as services? Pardon my ignorance on JiBX but if you think there is an elegant solution using JiBX I'd be more than happy to learn it. Cheers

Ganesh Prasad said...

Thanks, Praveen.

> does that not contradict your original point of don't expose your domain model as services?

The key is "loosely coupled". You shouldn't directly expose the domain model in a rigid one-to-one fashion as services, in such a way that if you add an attribute to one of your domain objects, that attribute immediately pops up in your service interface as well. You will find this kind of tight coupling when you *generate* the service interface from the domain model or vice-versa.

JiBX and EclipseLink, on the other hand, allow you to *map* the two together. This is loose coupling. You can change either the domain model or the service interface, and ensure that the other is not affected, by just making modifications to the mapping.

Hope that makes things clearer.


Ganesh Prasad said...

Minnie said:

> well JibX is a good option but its codeGen is restricted with few xml types like xs:Any and xs:AnyAttribute same problem as Castor latest version

I wouldn't use JiBX's CodeGen option. It defeats the purpose :-). Use mapping for loose coupling. Code generation is tight coupling.

Ganesh Prasad

Praveen said...

I did look into JiBX, and I see your point now. I was playing with JAXB to expose SOAP services. To avoid generating code directly from the entity model, I started building DTOs to achieve loose coupling. I hated the fact that I had to call all these setters to map entities to DTOs. Now, using Dozer, I don't have to map them manually either. The issue is I might need same data in different formats REST-XML, REST-JSON and SOAP. With apache CXF I can expose the same DTO as different service endpoints. Not sure if JiBX will fit this scenario... I can change the data binding provider for Apache CXF, but wouldn't it still need DTOs to marshal to JSON? Anyway, I would love to bypass this DTO business completely if I can. Any suggestions to save me some of my bed time? Thanks G-man. :)

Ganesh Prasad said...

> The issue is I might need same data in different formats REST-XML, REST-JSON and SOAP. With apache CXF I can expose the same DTO as different service endpoints.

Well, if you need multiple data representations in your service interface, and your Dozer-CXF combination works, then that's probably the best solution. You're probably saving yourself some sleep time already by not addressing each mapping entirely independently. The DTO is your abstract service interface, and all the others you listed are the concrete ones. Not a bad solution.


Praveen said...

Thanks Ganesh.