Saturday, June 12, 2010

Does REST Need Versioning?


In my ongoing conversations with JJ Dubray, he has often made the point that "REST couples identity and access together in a terrible way". When pressed to explain, he provided the following example.

Assume that there is a Resource identified by "/customers/1234". Updating the state of this customer requires a PUT. JJ asks how REST can handle a change to the business logic implied by the PUT.

Since we cannot say

PUTv2 /customers/1234

implying a change to the logic of PUT, he believes we have no option but to say

PUT /customers/v2/1234

but this is different from the identity of the customer, which remains at

/customers/1234

Hence REST "couples identity with access".

Well, I disagree. First of all, it's a mistake to think there are only two places where the version of business logic can be exposed - the Verb and the Resource. The data submitted is an implicit third, which I'll come to in a moment. But this example only makes me question the whole basis for versioning.

Does REST need versioning? For that matter, does any service need versioning? What is versioning in the context of SOA?

I would say service versioning is a mechanism that allows us to simultaneously maintain two or more sets of business logic in a consumer-visible way.

Why does it have to be consumer-visible as opposed to just consumer-specific? After all, if the service implementation can distinguish between two classes of consumer, it can apply two different business rules to them in a completely opaque manner. The consumer doesn't even have to know that two (or more) different sets of business rules are being weighed and applied under the covers.

Let's ask the more interesting question: Why do we need to maintain two or more sets of business logic simultaneously? The interesting (and circular) answer is often that business logic happens to be consumer-visible, hence a new version of business logic also needs to be distinguished from the old in a consumer-visible way. This is often stated as the need to support legacy consumers, i.e., consumers dependent in some way upon the previous version of business logic. But why do we have to support legacy consumers? Because existing contracts break when services are silently upgraded.

This argument leads to an interesting train of thought. Perhaps the answer lies in the opposite direction to what JJ believes, i.e., not versioning of services but abstraction of detail. Are our service contracts too specific and therefore too brittle? Service versioning is perhaps a "smell" that says we are going about SOA all wrong. Let us see.

I want to take up a more real-world example than the customer access example that JJ talked about. After all, that's more of a "data service" than a business service. Let's look at a real "business service".

Let's take the case of the insurance industry where a customer asks for a quote for an insurance product. The client-side application has to submit a set of data to the service and get a quote (a dollar value for the premium) in return.

In REST, here's how it could work.

Request:

POST /quotes
<insurable-details>
...
</insurable-details>

Response:

201 Created
Location: /quotes/06fb633b-fec4-4fb6-ae32-f298b8f499c1

The client is referred to the location of the newly-created quote Resource, which is at /quotes/06fb633b-fec4-4fb6-ae32-f298b8f499c1. When the client does a GET on this URI, the quote details are transferred.
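To make the exchange concrete, here is a minimal in-memory sketch of the two interactions, written in Python without any HTTP framework. The function names, the `sum_insured` field and the pricing rule are assumptions invented for illustration; only the `POST /quotes`, `201 Created` and `Location` behaviour comes from the example above.

```python
import uuid

# Hypothetical in-memory store standing in for the quote service's database.
_quotes = {}

def calculate_premium(details):
    # Placeholder pricing rule, invented purely for illustration.
    return round(100.0 + 0.01 * details.get("sum_insured", 0), 2)

def post_quotes(details):
    """Handle POST /quotes: create a quote Resource and point the client at it."""
    quote_id = str(uuid.uuid4())
    _quotes[quote_id] = {"premium": calculate_premium(details)}
    return 201, {"Location": "/quotes/" + quote_id}

def get_quote(uri):
    """Handle GET /quotes/{id}: return the stored quote, or 404 if unknown."""
    quote_id = uri.rsplit("/", 1)[-1]
    if quote_id not in _quotes:
        return 404, None
    return 200, _quotes[quote_id]

# The client POSTs its details, then follows the Location it was given.
status, headers = post_quotes({"sum_insured": 50000})
print(status, headers["Location"])
print(get_quote(headers["Location"]))
```

Note that nothing in the client's half of this exchange depends on how `calculate_premium` works; that function can be replaced without the URI or the Verb changing.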

So far, so good. Now let's say the business logic changes. Premiums are now calculated using very different logic. The first question is, can this new business logic be applied to all customers, or do we need to keep track of "old" customers and keep applying the old business logic to them? If we can "upgrade" all customers to the new business logic, there is, of course, no problem at all. The interface remains the same. The client application POSTs data to the same URI, and they are redirected in the same way to the location of the newly-created quote Resource. The business logic applied is all new, but customers don't see the change in their interface (only in the dollar values they are quoted!)

However, if we do need to maintain two sets of business logic, it could be for three reasons. One, the data that the client app needs to submit has changed, so the change is unavoidably visible to the customer and has to be communicated as a new and distinct contract. Two, there is another business reason to tell two types of customers apart, perhaps to reward longstanding customers with better rates, and this difference between customers is not obvious from the data they submit. Three, the client app somehow "knows" the behaviour of the old version and is dependent on it. In this case, we need a new version just to keep legacy clients from breaking.

We can readily see that the third reason is an artificial case for versioning. It's in fact a case to break implicit dependencies that have crept in.

In contrast, the first and second reasons provide their own resolution. If the type of data submitted by the client changes, that is itself a way to distinguish new clients from old ones and apply different business logic to them. In other words, we only need to tell newer customers about the change in the data they need to POST. Older customers don't need to do a thing. Also, if we can somehow derive that the customer is an existing one, even if this is not explicit in the data submitted, we can still apply different business logic transparently.
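A rough sketch of that idea in Python, under stated assumptions: the `risk_score` field stands in for whatever new data the revised contract requires, `EXISTING_CUSTOMERS` stands in for whatever lookup identifies a longstanding customer, and the dollar amounts are made up. None of these names come from a real system.

```python
# Hypothetical set of customers the provider already knew before the rule change.
EXISTING_CUSTOMERS = {"C-1001", "C-1002"}

def old_premium(details):
    # Original pricing rule (illustrative flat rate, in whole dollars).
    return 500

def loyalty_premium(details):
    # Better rate for longstanding customers (illustrative).
    return 450

def new_premium(details):
    # Revised rule that relies on the newly required field.
    return 400 + 10 * details["risk_score"]

def quote_premium(details):
    """Choose the pricing rule from the request itself; no version number is exposed."""
    if "risk_score" in details:
        # Reason one: the submitted data has changed, which identifies a new client.
        return new_premium(details)
    if details.get("customer_id") in EXISTING_CUSTOMERS:
        # Reason two: the provider can derive that this is an existing customer.
        return loyalty_premium(details)
    return old_premium(details)

print(quote_premium({"risk_score": 3}))
print(quote_premium({"customer_id": "C-1001"}))
print(quote_premium({"customer_id": "C-9999"}))
```

The dispatch lives entirely inside the service. Old clients keep POSTing what they always did, and only clients that adopt the new data shape see the new rule.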

JJ may consider this a messy and unstructured approach to versioning. Business stakeholders may have the opposite view. It's less disruptive. The less clients are exposed to the way services are implemented, the better.

Service versions are not really an interface detail. They're an implementation detail that often leaks into the interface.

That means version numbers are a problem, not a solution.

None of these arguments may satisfy someone like JJ. In that case, if service versioning is absolutely essential, there is a simple way to include it, after all. Include the version number in the message body accompanying a POST or PUT request. In fact, HTTP does not forbid message bodies even on GET and DELETE requests (TRACE is the exception), although a body on those methods has no defined semantics and some intermediaries may drop it, so versioning of any type of service is possible in principle. REST does not enforce versioning (that would be a bad thing, considering that versions are often a smell), but it doesn't impede it either.

With this approach, neither Verbs (e.g., POST) nor URIs (e.g., /quotes) are affected by versions and the "terrible" coupling of identity and access is avoided.
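As one possible illustration, the sketch below reads a version from the body of a `POST /quotes` request using Python's standard `xml.etree.ElementTree` module. Carrying the version as a `version` attribute on the root element, and treating a missing attribute as version 1, are assumptions made for this example; the post only says the number goes somewhere in the message body.

```python
import xml.etree.ElementTree as ET

# Assumption: a body with no version attribute is treated as the original contract.
DEFAULT_VERSION = 1

def body_version(xml_body):
    """Return the version declared on the root element of the request body."""
    root = ET.fromstring(xml_body)
    return int(root.get("version", DEFAULT_VERSION))

def handle_post_quotes(xml_body):
    """Handle POST /quotes; the Verb and URI are identical for every version."""
    version = body_version(xml_body)
    if version == 1:
        return "priced with original rules"
    if version == 2:
        return "priced with revised rules"
    raise ValueError("unsupported version: %d" % version)

print(handle_post_quotes("<insurable-details/>"))
print(handle_post_quotes('<insurable-details version="2"/>'))
```

Both requests go to the same URI with the same Verb, so the identity of the Resource is untouched by the version.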

It seems to me that the problem is not with REST, it's with looking at REST through WS-* eyes.

17 comments:

Integral ):( Reporting said...

Ganesh,

it looks like our discussion has been uncovered.

First, let me assert that Composite Applications are a relatively new paradigm, not in principle, but in fact. Yes, CORBA and its ancestors may have made Composite Applications possible, but in fact we see them today at a scale that has never been seen before, both in the number of services consumed and the number of consumers consuming these services.

All versioning strategies that were devised for a world of a lesser magnitude simply don't work.

Versioning is extremely important in a Composite Application because you either kill the service provider (which has to manage consumer-specific relationships) or you kill the consumers who don't have the resources to follow the rate of new service versions published by the provider. Over time your consumer community either gets mad at you and walks away or reaches a status quo where no one has enough money to do something new. Agility, my ...

I am glad that you agree with me that REST couples Access with Identity. I wish that Mark had published that in his InfoQ summary; that would qualify as news. Nevertheless, I consider it a huge victory that a RESTafarian understands that coupling, because it is lethal.

So as you pointed out so eloquently, we need a different approach: PUTv2 does not work and PUT /v2/customers/1234 does not work either.

Note that since SOAP does not imply this type of coupling, endpoints can be managed solely for versioning purposes. In SOAP, wherever my data moves, or whichever version is used, /customers/1234 will always mean the identity of customer 1234.

Note also that Roy Fielding never had to deal with this question. So how could he have serendipitously dealt with it?

-to be continued-

Integral ):( Reporting said...

Now, let's look at your proposals:

First, I would say that your definition of versioning is not correct:
>> I would say service versioning is a mechanism that
>> allows us to simultaneously maintain two or more
>> sets of business logic in a consumer-visible way.

Versioning is about having the ability to evolve business logic (having two or more is the least desirable solution) without breaking existing consumers, who want/need these updates. This is actually the essence of distributed business logic: many can use it, many can benefit when it changes. Otherwise, we would still be using libraries and direct connections to data sources (which is the world that REST is pushing us into with the concept of data services). Just ask developers how much they like transitioning from iOS 3.2 to iOS 4. In the enterprise you don't even have a choice, because services and consumers are under different lines of control, so services MUST be able to change along these lines.

>> First of all, it's a mistake to think there are only
>> two places where the version of business logic
>> can be exposed - the Verb and the Resource.
I am not saying that at all. I am saying that this is the worst possible strategy, or the strategy of last resort, because it breaks the consumer relationship. Of course there are more.

>> Does any service need versioning?
You are kidding right?

>> Why does it have to be consumer-visible as
>> opposed to just consumer-specific?
Yes, people have tried that; they generally call it an immutable versioning strategy. It works great when you have a small number of versions. When you have 30 versions of each service in production (I have seen it), you reach the point where nothing can change, as a single "common change" (say a bug fix or a regulatory change) across 30 versions will make every project cost explode.

-to be continued-

Integral ):( Reporting said...

I have collected everything that can change in a "service" here http://www.ebpml.org/blog/217.htm, so IMHO, trying to devise a Service Versioning strategy just on PUT is a tragic mistake. Your versioning strategy needs to solve all cases.

>> Third, the client app somehow "knows" the
>> behaviour of the old version and is dependent on
>> it. We can readily see that the third reason is an
>> artificial case for versioning.
You are kidding right? Just talk to the millions of developers who use AWS, GAE ... about how artificial that is. In IT, it is of course well known that every service consumer welcomes service changes, especially the ones that don't help them in any way, and for which they have to change or even test because someone else needs a new service version. This doesn't sound very artificial to me. This is actually the essence of versioning: how can you support existing behaviors and new consumers at the same time? Let the consumer decide when they want to change their behavior.

If we don't agree on that, all discussion is pointless. Of course this is the case that REST can't support because of the coupling between Access and Identity, so I am not surprised you don't see this as a potential scenario. In REST this scenario is a nightmare precisely because you have millions of endpoints to "version" without a clear definition of what "existing behavior" is. Good luck !

-to be continued-

Integral ):( Reporting said...

>> Service versions are not really an interface detail.
>> They're an implementation detail that often leaks
>> into the interface.

You are kidding right? Say I have a service to make hotel reservations. People love it, lots of consumers build apps that use it.

But one day they ask me, that there is this other flight reservation service they use in combination and what they would need is a "pre-reservation" semantics such that if the flight reservation fails, they do not commit the hotel reservation?

The intent has changed, the semantics of the interface has changed (and the implementation will change too of course). It is not because 90% of the time a change in the implementation mandates a change in the interface that what you are claiming is correct. It could not be further from the truth.

This happens in IT all the time. New intent appears, side-by-side with the old intent, but sharing the same implementation. That is the problem that Versioning MUST solve.

So sorry, I remain unconvinced by your arguments. In REST you need a much more solid versioning strategy, and the lack of "contract" (i.e. explicit intent) is dire.

JJ-

Frank Carver said...

This is an interesting discussion. I'd like to explore a little more on the assertion that "REST couples Access with Identity". I find this assertion puzzling particularly as there is no mention in this article about content types.

In my understanding of REST, identity is represented by a URI, and access is determined by content type.

A client should issue an "Accept" header indicating which version(s) it understands in any GET request, and the server should honour the request by returning appropriate data in the correct format.

Likewise, when using POST or PUT to send data to a server, a client is responsible for setting a correct Content-Type to indicate what is being sent.

In this view of REST, content-type seems the natural place to indicate version-specific support, an approach which seems in tune with Fielding's paper.

If I have misunderstood, please feel free to expound in more detail on what you mean by "REST couples Access with Identity".

Frank.

Ganesh Prasad said...

JJ said:

> I am glad that you agree with me that REST couples Access with Identity.

Now how did you get that impression?? I spent the entire post trying to refute that, and this is the takeaway?

Regards,
Ganesh

Ganesh Prasad said...

JJ,

I also remain puzzled by your continuing references to REST's "millions of endpoints", such as here:

> In REST this scenario is a nightmare precisely because you have millions of endpoints to "version" without a clear definition of what "existing behavior" is.

You perhaps see these as millions of endpoints:

/customers/1
/customers/2
/customers/3
...
/customers/97776676667

I see them as one endpoint:
/customers/{customerid}

It doesn't seem that unwieldy to me.

Regards,
Ganesh

Ganesh Prasad said...

JJ said:

> Say I have a service to make hotel reservations. People love it, lots of consumers build apps that use it.

> But one day they ask me, that there is this other flight reservation service they use in combination and what they would need is a "pre-reservation" semantics such that if the flight reservation fails, they do not commit the hotel reservation?

Why wouldn't I add another tag to my message body that says something like "tentative"? I can then distinguish between requests that imply the old functionality from requests that want "pre-reservation" semantics?

> The intent has changed, the semantics of the interface has changed (and the implementation will change too of course).

Your order is wrong. I would first look to accommodate the new intent through a more sophisticated implementation without impacting the interface. If I can't help letting it bubble through, then I would direct the change towards a part of the interface that lends itself well to change, i.e., the message body. I wouldn't touch the URI unless the resource itself was going to be something else as a result of this change (say a new "Pre-Reservation" Resource was a more natural way to model the change).

> It is not because 90% of the time a change in the implementation mandates a change in the interface that what you are claiming is correct. It could not be further from the truth.

If so, this example doesn't convince me.

I think you're so caught up in the WS-* way of doing things that you don't see a more elegant solution to the same problems. You want to eat soup with a fork and complain about the soup when you can't do it...

Regards,
Ganesh

Integral ):( Reporting said...

well as usual, these discussions are a complete waste of time.

REST is perfect, there is absolutely nothing wrong with it. Good luck with your REST implementations, call me back in a couple of years.

I will simply note that again like so many you reject the challenge of providing a business process such that I can show you how resource lifecycles relate to events, services and processes.

Cheers,

JJ-

Integral ):( Reporting said...

@Frank:

But of course, let's make Content Negotiation the next garbage bag for stuffing all the semantics that are missing in REST. Again, without a contract to validate the syntax.

This is what Cesare suggests in his presentation (without even considering versioning):
Accept: application/xhtml+xml; q=0.9, text/html; q=0.5, text/plain; q=0.1

(http://www.infoq.com/presentations/Some-REST-Design-Patterns)

Good luck in matching the appropriate content negotiation string, with the correct end point. Maybe a genius will then invent the RSB (Resource Service Bus) to precisely do just that.

As a side note, content types are typically registered with the IANA, but this is just a small detail. Who cares, right? As long as I can construct a string in a proprietary format to express anything and everything.

Again, good luck with your REST project.

JJ-

Ganesh Prasad said...

JJ said:

> REST is perfect, there is absolutely nothing wrong with it. Good luck with your REST implementations, call me back in a couple of years.

It's hard to discuss logically with someone when there's so much sarcasm coming from the other side. I appreciate your frustration that no one seems to understand you, and I'm trying to understand your point of view by approaching each fundamental piece separately. You'll just need to be more patient, not more sarcastic, if you want the conversation to go anywhere.

> I will simply note that again like so many you reject the challenge of providing a business process such that I can show you how resource lifecycles relate to events, services and processes.

I'm not sure we even agree on the basics. I'm trying to understand many of the categorical statements you have made over the years ("REST couples identity with access", "REST has millions of endpoints", etc.) and I'm not convinced we're even looking at things the same way. If you approach a situation from one angle, you see a terrible problem. From another angle, it's not a problem at all, and there is an elegant way forward.

There's no point discussing more advanced situations when we can't agree on the basics. This is not about "rejecting a challenge". I love the challenge of exploring new concepts. I'm just not convinced by your reasoning on the simple stuff, so I doubt we'll make any headway on the rest of it, that's all.

Regards,
Ganesh

Integral ):( Reporting said...

Sorry, Ganesh, in the end I see this exercise as just a big waste of time, since REST is always a syntax away from the solution. Re-encoding traditional semantics behind a URI or a Content-Type header is not what I call progress.

I would call progress understanding how service, process, events and -of course- resources fit together. I would call progress being able to extend intent in a non breakable way.

Let me paraphrase the answers of the RESTafarians:

REST can't do X ... but X is not needed
REST can't do Y ... but Y can be encoded in this little corner
...

So, I thought we could talk at a higher level when we started talking about decoupling of interface from implementation.

Anyway, too much time wasted.

Ganesh Prasad said...

JJ,

I'm genuinely sorry that you view this discussion as a waste of time.

Addressing your points has helped to clarify my thinking a lot, even if I haven't always agreed with you, so it has been very useful to me.

I don't consider myself a RESTafarian. I started off on the SOAP side of the fence and was initially a REST skeptic. Even now, I look for limits to where REST can be applied. But so far, it seems to fit almost every situation without problems. Many of your criticisms of REST intrigued me, but unfortunately, after drilling down a bit, I wasn't convinced by any of them. There may still be something there that I've missed, so I won't close the door on that possibility.

I will continue to look into your arguments (i.e., your 150-page book that looks very promising), and I'm sure I will learn something from it.

Stay tuned.

Regards,
Ganesh

Integral ):( Reporting said...

Ganesh,

I would like to suggest that a good way to make progress in our discussion is to take a "real-world" example. Discussing URI syntax or adding an element to a schema is not going to get us very far. This is what I consider a waste of time. When people like Frank say we are just a "content type" away from versioning (with authority), it helps no one. What would help people is to actually detail the syntax of the content negotiation header to support versioning and ... many other things.

If you want to take the discussion to that level, with your process, I am ready to invest the time, otherwise, sorry, I view all these URI syntax discussions as a pure waste of time.

JJ-

Frank Carver said...

I think your suggestion of taking a "real world" example is a good one. I have proposed and described a possible example over on my blog.

I'd love to read and discuss anyone's thoughts and suggestions on how to handle the change scenarios I describe.

Tiago Silveira said...

Encoding versioning in a header is perfectly reasonable. A restful interaction deals with data and metadata. If the data is self-describing, even better. But if you absolutely can't, and absolutely need to expose a version number in the API, then giving two names for the same resource is a bad idea.

Integral ):( Reporting said...

I found Bill deHora bitching about content headers:
http://jacobian.org/writing/rest-worst-practices/

"That ugly crap saves people having to deal with parsing/generating accept headers and the associated matching algorithms. Another alternative is to put .xml at the end of the URL; it should be easier on caches than a query param."

So, what do we do? There are so many ways to skin the REST cat, and so many traps in the process. Who's right? Who's wrong?