Wednesday, February 06, 2008

REST Eye for the Relational Guy - The "Uniform Interface" Explained

My thanks go to Rhys Frederick for the analogy I'm about to share with you. Just the day after my last post, in which I'd said, "On the 'uniform interface' issue, I'm hearing the words, but I'm not seeing the REST vision", Rhys asked me to remind him of the name for a certain pattern used in relational database design, and casually mentioned that it was analogous to the uniform interface of REST. I did a double-take. A few Google searches later, I finally grokked the REST Uniform Interface concept, thanks to my experience with RDBMS in a past life. Thanks again, Rhys. The coincidences in my life never cease to amaze.

It's funny how many times I've had to change my way of thinking in the last twenty years of my career. I learned programming with BASIC and FORTRAN, complete with that wonderful "feature" called GOTO. I feared GOSUB. I couldn't understand it. GOTO was easy to understand. Then I learned structured programming, and GOSUB suddenly became the most natural construct in the world. GOTO became (cough) a four-letter word. As relational databases became popular, data for me became a set of tables, and with SQL came the shift from procedural to declarative thinking. Then came the difficult shift to object-oriented thinking (although I'm grateful it didn't have to be through the torture of C++!) A combination of PowerBuilder (yes!) and Java got me over that line. The latest change in my thinking has been thanks to SOA. I've now learnt to think "between domains" in addition to peering "inwards" at a domain with pure OO eyes. And while I've been able to fit REST fairly naturally into my generic SOA worldview, I have struggled to see the world through RESTian eyes, the way the REST folk seem to see it.

Until now.

Now at last, one major facet of REST lies exposed before me, and the irony is that I had to fall back to relational thinking in order to understand it.

Let me launch into the analogy. It's variously called (as I learned) the Value-Pairs Pattern, the Vertical Schema and the Object-Attribute-Value Pattern.

Consider some arbitrary domain modelled using an Entity-Relationship Diagram. The model looks something like this:

Note that each entity has an ID (its primary key), some attributes, and also some foreign keys that refer to the IDs of other entities. This is a specialised model, tailored to this particular application. But there may be a common pattern here. If you find that the attributes (names and types) are similar for all the entities, it may suggest a generalisation of the form shown below:

What we've done here is create a separate table (ATTRIBUTE) listing the names of the different attributes that an entity may have. Then we create mapping tables (ENTITYn_ATTRIBUTE) that create an association between each entity and the generic attribute table. Each row here holds the value that an attribute has for a particular entity. The entity tables themselves are now much smaller. They only hold foreign keys. This model is no longer tied so strongly to a particular domain, because we've taken part of what makes it unique to a domain and genericised it. If we next assume that each entity ID is globally unique, we can apply some generic thinking to this aspect of the model as well.
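As a rough illustration of this intermediate step, here is a minimal sketch using Python's built-in sqlite3 module. The CUSTOMER and PURCHASE_ORDER tables, their columns and the sample values are hypothetical stand-ins for the ENTITYn tables; only ATTRIBUTE and the ENTITYn_ATTRIBUTE naming come from the description above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Generic list of attribute names shared by all entities.
    CREATE TABLE ATTRIBUTE (
        attribute_id INTEGER PRIMARY KEY,
        name         TEXT NOT NULL UNIQUE
    );
    -- Hypothetical entity tables: only an ID and foreign keys remain.
    CREATE TABLE CUSTOMER (
        customer_id INTEGER PRIMARY KEY
    );
    CREATE TABLE PURCHASE_ORDER (
        order_id    INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES CUSTOMER(customer_id)
    );
    -- Mapping table (an ENTITYn_ATTRIBUTE): one row per attribute value.
    CREATE TABLE CUSTOMER_ATTRIBUTE (
        customer_id  INTEGER NOT NULL REFERENCES CUSTOMER(customer_id),
        attribute_id INTEGER NOT NULL REFERENCES ATTRIBUTE(attribute_id),
        value        TEXT,
        PRIMARY KEY (customer_id, attribute_id)
    );
""")

conn.executemany(
    "INSERT INTO ATTRIBUTE (attribute_id, name) VALUES (?, ?)",
    [(1, "name"), (2, "city")],
)
conn.execute("INSERT INTO CUSTOMER (customer_id) VALUES (1234)")
conn.executemany(
    "INSERT INTO CUSTOMER_ATTRIBUTE VALUES (?, ?, ?)",
    [(1234, 1, "Alice"), (1234, 2, "Sydney")],
)


def attributes_of(customer_id):
    """Reassemble one customer's attributes from the mapping table."""
    rows = conn.execute(
        "SELECT a.name, ca.value "
        "FROM CUSTOMER_ATTRIBUTE ca "
        "JOIN ATTRIBUTE a ON a.attribute_id = ca.attribute_id "
        "WHERE ca.customer_id = ?",
        (customer_id,),
    )
    return dict(rows.fetchall())


print(attributes_of(1234))
```

In this sketch, giving customers a new attribute means inserting rows into ATTRIBUTE and CUSTOMER_ATTRIBUTE rather than altering a table definition.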

Here's a completely generic schema.

Look at the names of the tables now! They're all completely generic (ENTITY, ENTITY_RELATIONSHIP, ATTRIBUTE, ENTITY_ATTRIBUTE). The new ENTITY_RELATIONSHIP table even takes the foreign key constraints out of the various ENTITYn tables and holds them as an external mapping. Now all the different ENTITYn tables can be collapsed into a single generic one called simply ENTITY. This model is completely generic. We can add and remove entities, attributes and foreign key relationships at will. We don't have to change the schema at all. Contrast this with the specialised model we started with. Every such change there would have required a change to the schema.
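The four generic tables might be sketched as follows, again with Python's sqlite3 module. The table names are the ones listed above; the column names, the add_entity helper and the sample entities are assumptions made purely for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ENTITY (
        entity_id TEXT PRIMARY KEY          -- assumed globally unique
    );
    CREATE TABLE ATTRIBUTE (
        attribute_id INTEGER PRIMARY KEY,
        name         TEXT NOT NULL UNIQUE
    );
    CREATE TABLE ENTITY_ATTRIBUTE (
        entity_id    TEXT NOT NULL REFERENCES ENTITY(entity_id),
        attribute_id INTEGER NOT NULL REFERENCES ATTRIBUTE(attribute_id),
        value        TEXT,
        PRIMARY KEY (entity_id, attribute_id)
    );
    CREATE TABLE ENTITY_RELATIONSHIP (
        from_entity_id TEXT NOT NULL REFERENCES ENTITY(entity_id),
        to_entity_id   TEXT NOT NULL REFERENCES ENTITY(entity_id),
        PRIMARY KEY (from_entity_id, to_entity_id)
    );
""")


def add_entity(entity_id, attributes, related_to=()):
    """Hypothetical helper: store an entity using rows only, never DDL."""
    conn.execute("INSERT INTO ENTITY (entity_id) VALUES (?)", (entity_id,))
    for name, value in attributes.items():
        # Register the attribute name if it has not been seen before.
        conn.execute("INSERT OR IGNORE INTO ATTRIBUTE (name) VALUES (?)", (name,))
        (attribute_id,) = conn.execute(
            "SELECT attribute_id FROM ATTRIBUTE WHERE name = ?", (name,)
        ).fetchone()
        conn.execute(
            "INSERT INTO ENTITY_ATTRIBUTE VALUES (?, ?, ?)",
            (entity_id, attribute_id, value),
        )
    for target in related_to:
        conn.execute(
            "INSERT INTO ENTITY_RELATIONSHIP VALUES (?, ?)", (entity_id, target)
        )


add_entity("customer-1234", {"name": "Alice"})
add_entity("order-1234", {"total": "99.50"}, related_to=["customer-1234"])
# A kind of entity the schema has never seen, with a brand-new attribute.
add_entity("invoice-1", {"due_date": "2008-03-01"}, related_to=["order-1234"])


def table_names():
    """List the tables in the database, to show the schema did not grow."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
    return [row[0] for row in rows]


print(table_names())
```

However many kinds of entity are added this way, the set of tables stays the same; only rows change.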

What's more, the generic schema is simple and can be universally communicated, even turned into assumed knowledge. Then all you're concerned with when you talk about a particular domain is the data. The schema ceases to be a differentiator. It's universal. With specialised schemas, on the other hand, you need to communicate the schema as well as the data. Now I understand why the REST guys say REST doesn't need a WSDL equivalent. When the schema is generic and well-known (the "uniform interface") why would you need to describe it?

More analogies creep out of the woodwork as I gaze at the model. See the IDs? They're URIs. It was the assumption of global uniqueness that allowed us to mix the IDs of completely different entities in the one table. URIs are globally unique too. Nobody is going to confuse http://something.com/customers/1234 with http://something.com/orders/1234.

See the foreign keys? They're hyperlinks between entities.
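To make the mapping concrete, here is a small in-memory sketch in Python with no real HTTP involved. The two URIs are the ones used above; the resource contents, the "links" field and the get and follow_links helpers are hypothetical.

```python
# Hypothetical resources keyed by URI. The URI plays the role of the
# globally unique entity ID; "links" plays the role of the foreign keys.
resources = {
    "http://something.com/customers/1234": {
        "attributes": {"name": "Alice"},
        "links": [],
    },
    "http://something.com/orders/1234": {
        "attributes": {"total": "99.50"},
        "links": ["http://something.com/customers/1234"],
    },
}


def get(uri):
    """Stand-in for an HTTP GET: look the resource up by its URI."""
    return resources[uri]


def follow_links(uri):
    """Dereference every hyperlink held by the resource at `uri`."""
    return [get(target) for target in get(uri)["links"]]


order_uri = "http://something.com/orders/1234"
for linked in follow_links(order_uri):
    print(linked["attributes"])
```

Both resources end in 1234, yet they never collide, because the whole URI is the key.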

And because we're assuming a limited number of attributes (but not restricting it to a particular number), it vaguely corresponds to REST's limited set of verbs.

URIs, hyperlinks, a limited set of verbs - there's the relational analogy for REST!

Message to the REST folk: I finally get your model :-).

</euphoria>

There is something, though, that we lose by going down the generic path, and that is understandability. I can understand a domain pretty well by studying a specialised Entity-Relationship Diagram, but a generic schema does not speak to me at all. If all domains look the same, what makes them unique? Infinite plasticity makes me very uncomfortable. I can't "see" the domain from the data. I need a schema. When we talk about automated reasoning, perhaps schemas are more important than ever...?

To turn an old REST argument around, constraints empower. If I make my schemas more rigid, I gain something - obviousness of purpose. So the vertical schema pattern isn't a constraint, it's the absence of one. Anything will fit into it, and so it describes nothing.

I must say I'm enjoying this journey. Each day brings fresh insights, and fresh questions...

4 comments:

Jim said...

Hey Ganesh,

Web resources and their inter-relations aren't necessarily as semantically deprived as RDBMS relations. For example, I can trivially use Microformats to enrich links, and the various media types give me useful metadata as to the type of the resource (whereas tables don't really have types).

In this case I think the Web model is richer, but you pay a price for that richness, in that the model won't (yet) let you do all kinds of clever relational calculus-style queries.

Jim

Kirstan Vandersluis said...

Nice analogy, Ganesh! I believe you could abstract one more time, and have a single table called metaTable, with columns tableName, key, and value, which would then be able to store the entire data model in one table! That's kinda the idea behind MOF and the four-tier metamodel architecture, which truly makes my head hurt!

But this is madness. As architects, we want to pick the *correct* level of abstraction. Your analogy shows that the REST camp and WS-* camp are off by just one level of abstraction, which really is not too bad. This indicates to me that REST is more loosely coupled (by one degree) than WS-*, at the expense of being less understandable (by one degree?) in terms of domain intent.

Stian Soiland-Reyes said...

Just a side note.. the key-property-value pattern you described has also been formalised for the semantic web as something called RDF - with the tiny difference of calling the triple subject-predicate-value.

For RDF you can describe the relationships (the "schema"), predicates and classes ("tables") in ontologies; RDFS and OWL are the typical standards used. Notably, both of these schema languages can be expressed as RDF as well.

John "Z-Bo" Zabroski said...

This is an old idea. It is called Entity-Attribute-Value (EAV) and is often used in hospital systems. As others have pointed out, it has been renamed a thousand times. ...as RDF, as OAV, as subject-predicate-value.

And, no, it is not a panacea.