Tuesday, March 12, 2013

How to Implement An Atomic "Get And Set" Operation In REST


This question came up yesterday at work, and it's probably a common requirement.

You need to retrieve the value of a record (if it exists), or else create it with a default value. An example would be when you're mapping identifiers between an external domain and your own. If the external domain is passing in a reference to an existing entity in your domain, you need to look up the local identifier for that entity. If the entity doesn't yet exist in your domain, you need to create (i.e., auto-provision) it and insert a record in the mapping table associating the two identifiers. The two operations have to be atomic because you can't allow two processes to both check for the existence of the mapping record, find out it doesn't exist, then create two new entity instances. Only one of the processes should win the race.

(Let's ignore for a moment the possibility that you can rely on a uniqueness constraint in a relational database to prevent this situation from occurring. We're talking about a general pattern here.)

Normally, you would be tempted to create an atomic operation called "Get or Create". But if this is to be a RESTian service operation, there is no verb that combines the effects of GET and POST, nor would it be advisable to invent one, because it would in effect be a GET with side-effects - never a good idea.

One solution is as follows (and there could be others):

Step 1:

GET /records/{external-id}

If a record exists, you receive a "200 OK" status and the mapping record containing the internal ID.

Body:
{
  "external-id" :  ...
  "internal-id" :  ...
}

If the record does not exist, you get a "404 Not Found" and a one-time URI in the "Location" header.

Location: /newrecords/84c5d65a-2198-42eb-8537-b16f58733791

(The server will also send the header "Cache-Control: no-cache" to ensure that intermediate proxies do not reuse a cached copy of this time-sensitive response but check with the origin server on every request.)
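
As a rough illustration of Step 1 on the server side, here is a minimal Python sketch. It assumes in-memory dictionaries in place of a real mapping table, and the names get_record, mappings and pending_tokens are hypothetical, not part of any framework.

```python
import threading
import uuid

# Hypothetical in-memory stores; a real service would use a database.
mappings = {}        # external_id -> internal_id
pending_tokens = {}  # external_id -> one-time token already handed out
_lock = threading.Lock()


def get_record(external_id):
    """Handle GET /records/{external-id}; returns (status, headers, body)."""
    with _lock:
        if external_id in mappings:
            body = {"external-id": external_id,
                    "internal-id": mappings[external_id]}
            return 200, {}, body
        # Reuse a token that was already issued for this external ID, so
        # repeated unsuccessful GETs all see the same one-time URI.
        token = pending_tokens.setdefault(external_id, str(uuid.uuid4()))
    headers = {"Location": "/newrecords/" + token,
               "Cache-Control": "no-cache"}
    return 404, headers, None
```

Keeping the token in pending_tokens is one way to make sure that a second process asking about the same external ID is pointed at the same one-time URI.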

Step 2 (Required only if you receive a "404 Not Found"):

2a) Generate an internal ID.

2b) Create a new entity with this internal ID and also create a mapping record that associates this internal ID with the external ID passed in. This can be done with a single POST to the one-time URI.

POST /newrecords/84c5d65a-2198-42eb-8537-b16f58733791

Body:
{
  "external-id" :  ...
  "internal-id" :  ... (what you just generated)
  "other-entity-attributes" : ...
}

The implementation of the POST will create a new local entity instance as well as insert a new record in the mapping table - in one atomic operation (which is easy enough to ensure on the server side).

If you win the race, you receive a "201 Created" and the mapping record as a confirmation.

Body:
{
  "external-id" :  ...
  "internal-id" :  ... (what you generated)
}

If you lose the race, you receive a "409 Conflict" and the mapping record that was created by the previous (successful) process.

Body:
{
  "external-id" :  ...
  "internal-id" :  ... (what the winning process generated)
}
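
The server side of this POST could look roughly like the following Python sketch. It assumes in-memory dictionaries and a single lock standing in for a database transaction. The function name post_new_record and the 404 returned for an unknown token are my own assumptions; the scheme above does not specify that case.

```python
import threading

# Hypothetical in-memory stores standing in for the entity and mapping tables.
entities = {}   # internal_id -> other entity attributes
mappings = {}   # external_id -> internal_id
tokens = {}     # one-time token -> external_id (filled in by the earlier GET)
_lock = threading.Lock()


def post_new_record(token, body):
    """Handle POST /newrecords/{token}; returns (status, response_body)."""
    with _lock:  # the existence check and the creation form one atomic step
        external_id = tokens.get(token)
        if external_id is None or body.get("external-id") != external_id:
            # Assumption: an unknown token or mismatched body is rejected.
            return 404, None
        if external_id in mappings:
            # Lost the race: report the mapping the winner created.
            return 409, {"external-id": external_id,
                         "internal-id": mappings[external_id]}
        internal_id = body["internal-id"]
        entities[internal_id] = body.get("other-entity-attributes")
        mappings[external_id] = internal_id
        return 201, {"external-id": external_id, "internal-id": internal_id}
```

The token stays valid after the first successful POST, which is what lets a losing process receive the "409 Conflict" together with the winner's mapping record.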

Either way, the local system now has an entity instance with a local (internal) identifier, and a mapping from the external domain's identifier to this one. Subsequent GETs will return this mapping along with a "200 OK". The operation is guaranteed to be consistent, without having to rely on an atomic "Get or Create" verb.
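
From the client's point of view, the two steps can be wrapped in one helper. The sketch below is only an illustration: send is a hypothetical transport function that takes (method, path, body) and returns (status, headers, body), and get_or_create is a name I made up.

```python
import uuid


def get_or_create(send, external_id, attributes=None):
    """Return the internal ID mapped to external_id, creating it if needed.

    send(method, path, body) -> (status, headers, body) is a hypothetical
    transport; plug in whatever HTTP client you actually use.
    """
    # Step 1: look up the existing mapping.
    status, headers, body = send("GET", "/records/" + external_id, None)
    if status == 200:
        return body["internal-id"]
    if status != 404 or "Location" not in headers:
        raise RuntimeError("unexpected GET response: %s" % status)

    # Step 2a: generate a candidate internal ID.
    candidate = str(uuid.uuid4())
    payload = {"external-id": external_id,
               "internal-id": candidate,
               "other-entity-attributes": attributes}

    # Step 2b: POST to the one-time URI. On 201 we won the race; on 409
    # the body carries the mapping created by the winning process.
    status, _, body = send("POST", headers["Location"], payload)
    if status in (201, 409):
        return body["internal-id"]
    raise RuntimeError("unexpected POST response: %s" % status)
```

Note that the caller treats "201 Created" and "409 Conflict" the same way: in both cases the response body holds the authoritative internal ID.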

One could quibble that a GET that fails to retrieve a representation of a resource does have a side-effect: the one-time URI ending in "84c5d65a-2198-42eb-8537-b16f58733791" has to be created and stored somewhere. This is strictly true, but the operation is idempotent, which mitigates its impact. The next process to do an unsuccessful GET on the same external ID must receive the same one-time URI.
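
One possible way to avoid storing anything on a failed GET is to derive the one-time URI deterministically from the external ID. The sketch below swaps in a name-based (version 5) UUID from Python's standard uuid module in place of the random-looking token shown earlier. That substitution is my own assumption, and it makes the token predictable to anyone who knows the external ID, which may or may not be acceptable.

```python
import uuid


def one_time_uri(external_id):
    """Derive the one-time URI for an external ID without storing state.

    uuid5 hashes the namespace and the name, so the same external ID
    always maps to the same token.
    """
    token = uuid.uuid5(uuid.NAMESPACE_URL, "/records/" + external_id)
    return "/newrecords/" + str(token)
```

With this approach every unsuccessful GET for the same external ID returns the same "Location" value, and the GET itself writes nothing on the server.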

It's a bit of work on the server side, but it results in an elegant RESTian solution.

4 comments:

Jim Webber said...

Hi Ganesh,

Atomicity is easy - as you point out HTTP already does that. The next question you'll get at work is how to turn atomicity at a single operation into a transaction that can run across several.

Turns out that's easy enough too, but no doubt you'll get some folks claiming REST can't do it.

Jim

prasadgc said...

Thanks Jim,

I updated the post to include a two-step transaction on the server-side, just to forestall any such objections!

Anonymous said...

Why does the scheme expose an 'internal-id' across multiple clients? If a client process is generating the 'internal-id', what is the purpose of sharing the identifier across clients? The other client may 1) already have the same identifier value assigned to another record 2) not care about the internal identifier because the external identifier is used during REST service calls.



prasadgc said...

Hi Chaddad,

Good point. I should have mentioned that this API was for clients within the same "domain", so they all share the internal ID, which is a candidate key inside all of them. You're certainly right that the internal ID is not exposed to clients outside the domain, so the API is probably protected with a mechanism like OAuth 2.0, which restricts it to internal clients.