
Abbott and Costello Meet the Taxonomy of the Programmable Web: So far I think I've made a coherent case for classifying RESTful and resource-oriented services on one side of the Venn diagram, and message-oriented services on the other. The RESTful and resource-oriented services keep the scoping information (why should the server send you some dataset instead of some other dataset?) in the URI, and action information (why should the server send you the dataset instead of modifying the dataset?) in the HTTP method.

The MOA field is dominated by RPC-style services, most of which use SOAP messages constrained by WSDL files. The client communicates its desire by sending a message (a box with stickers on it) to a message processor, which unpacks the box to find both the scoping information and the action information.

We've got two edge cases to explain: overloaded POST and so-called "HTTP+POX" services. It's been said (see link at end) that I shouldn't try to fit HTTP+POX services anywhere at all because they're just a mishmash, but I press on.

We've also got an unexplained coincidence. I've been describing SOAP as a way of putting a document in a box and slapping stickers on it. But you can look at HTTP the same way. The HTTP request (or response) is the box; the entity-body is the document; and the URI, headers, method name, and response code are all stickers.
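
To make the box-and-stickers reading of HTTP concrete, here's a minimal sketch in Python that unpacks a raw HTTP request into its stickers and its document. The names HttpBox and unpack, and the sample request itself, are invented for illustration; a real server would use a proper HTTP parser rather than string splitting.

```python
from dataclasses import dataclass


@dataclass
class HttpBox:
    """An HTTP request seen as a box: stickers outside, a document inside."""
    method: str    # sticker: the action information
    uri: str       # sticker: the scoping information
    headers: dict  # more stickers
    document: str  # the entity-body


def unpack(raw: str) -> HttpBox:
    """Split a raw HTTP/1.1 request into stickers and document (no validation)."""
    head, _, document = raw.partition("\r\n\r\n")
    request_line, *header_lines = head.split("\r\n")
    method, uri, _version = request_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return HttpBox(method, uri, headers, document)


# A hypothetical request, made up for this example.
raw_request = (
    "PUT /weblogs/nycb/2006/11/03/1 HTTP/1.1\r\n"
    "Host: www.example.com\r\n"
    "Content-Type: text/plain\r\n"
    "\r\n"
    "Some new text for the entry."
)
box = unpack(raw_request)
print(box.method, box.uri)
print(box.headers)
print(box.document)
```

Everything a SOAP envelope would carry as stickers is already on the outside of this box, which is the coincidence in question.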

HTTP+POX, as they say, is a mishmash, but I think I can identify the things it's a mishmash of. An HTTP+POX service is a message-oriented RPC-style service that uses HTTP as the box format, rather than using SOAP.

This is most obvious if you look at how you modify the dataset in HTTP+POX services (I'm using the del.icio.us API and Flickr's allegedly "REST" API as examples). You either send a GET or an overloaded POST to a URI that's not a resource: it's a description of the procedure you want to call. There's a message processor for every operation, rather than just one (as generally happens with SOAP+WSDL services), or one for every object on which you might want to operate (as with resource-oriented services). Examples: del.icio.us /v1/posts/add and /v1/posts/delete, Flickr's multitude of methodName arguments.

The confusing part is this: when the RPC method you're calling is "get some data", the HTTP method is usually GET, and the arguments to the RPC method usually go into the URI. This URI designates a resource, and you can GET a representation of that resource! This is why I thought that HTTP+POX services are hybrids of the resource-oriented and message-oriented architectures. In many cases, the message is one you'd send if you were using the uniform interface to get a representation of a resource. The service may not have been designed with resources in mind, but this part of it is functionally resource-oriented.

Examples. Consider an endpoint of the del.icio.us API: https://api.del.icio.us/v1/posts/get?tag=restbook. That doesn't have the same ring as http://del.icio.us/leonardr/restbook, but it's the same kind of URI. Those are two URIs to two representations of the same resource: "my recent posts tagged with 'restbook'". There are infinitely many URIs of this form, each identifying a different resource.

Similarly for http://api.flickr.com/services/rest?method=flickr.photos.search&api_key=xxx&name=penguin. We're supposed to interpret that as a remote procedure call, but it's also the URI to a resource: "Pictures tagged 'penguin'". Another URI to a different representation of the same resource is http://flickr.com/photos/tags/penguin. Even a URI like /rest?method=flickr.people.findByEmail&find_email=leonardr@segfault.org is the URI to a resource, though a better URI for the same resource might be /people?email=leonardr@segfault.org.
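
Here's a small sketch of the two readings, using only Python's standard urllib.parse. The helper name split_call is my own invention, not part of any Flickr client library. It pulls the "method" argument out of the query string and treats everything else as arguments to the procedure.

```python
from urllib.parse import urlsplit, parse_qs


def split_call(uri: str):
    """Return (procedure name, arguments) from a Flickr-style URI."""
    query = parse_qs(urlsplit(uri).query)
    procedure = query.pop("method")[0]
    # Keep the first value of each remaining argument.
    arguments = {name: values[0] for name, values in query.items()}
    return procedure, arguments


search = split_call(
    "http://api.flickr.com/services/rest"
    "?method=flickr.photos.search&api_key=xxx&name=penguin"
)
find = split_call(
    "/rest?method=flickr.people.findByEmail&find_email=leonardr@segfault.org"
)
print(search)
print(find)
```

Read as RPC, each result is a procedure name plus its arguments. Read as a resource, the same pair is just scoping information: as long as the client only GETs the URI, the procedure name adds nothing the HTTP method hasn't already said.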

We have our own recommendations about how to structure URIs in a resource-oriented service, but it's pretty small-minded to say that services aren't resource-oriented just because their URIs look funny and contain things that look like method names (like "get" and "method=flickr.photos.search"). From a functional standpoint, these services stop being resource-oriented when they stop using resources. And, in general, they stop using resources when it comes time to modify the data set.

The method names in those URIs are useless to the extent they agree with HTTP's uniform interface. The "get" and the "flickr.photos.search" basically mean HTTP GET, and the client's already using GET. When those strings become useful ("add" or "flickr.photos.comments.deleteComment") it's because the RPC interface supersedes the uniform interface. The URI can no longer be conceived as pointing to a resource.

/rest?method=flickr.photos.comments.deleteComment&comment_id=100 could be one of many URIs to the resource "comment #100". Maybe the "method" argument is just random junk. Maybe if you GET this URI you get the comment, and if you DELETE this URI you delete the comment. But of course if you GET this URI you delete the comment. This URI doesn't identify the comment: it identifies an operation on the comment. That's not a resource. It's just a procedure call.
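
The difference is easy to see in a toy dispatcher. This is a sketch under loud assumptions: the in-memory comment store, both function names, and the /comments/100 URI are hypothetical, and none of it is meant to describe how Flickr's servers are actually built.

```python
from urllib.parse import urlsplit, parse_qs


def rpc_dispatch(http_method: str, uri: str, comments: dict):
    """RPC style: the 'method' argument picks the operation.

    The http_method parameter is accepted but never consulted.
    """
    query = parse_qs(urlsplit(uri).query)
    procedure = query["method"][0]
    comment_id = int(query["comment_id"][0])
    if procedure == "flickr.photos.comments.deleteComment":
        comments.pop(comment_id, None)
        return "deleted"
    raise ValueError(f"unknown procedure: {procedure}")


def resource_dispatch(http_method: str, uri: str, comments: dict):
    """Resource style: the URI names the comment, the HTTP method picks the action."""
    comment_id = int(urlsplit(uri).path.rsplit("/", 1)[-1])
    if http_method == "GET":
        return comments[comment_id]
    if http_method == "DELETE":
        del comments[comment_id]
        return "deleted"
    raise ValueError(f"unsupported method: {http_method}")


rpc_store = {100: "Nice penguin."}
rpc_dispatch(
    "GET",
    "/rest?method=flickr.photos.comments.deleteComment&comment_id=100",
    rpc_store,
)

resource_store = {100: "Nice penguin."}
fetched = resource_dispatch("GET", "/comments/100", resource_store)
```

After the GET through rpc_dispatch the comment is gone from rpc_store; after the GET through resource_dispatch it's still in resource_store, and fetched holds its text. Same HTTP method, opposite effects, and the only thing that changed is whether the URI names a thing or an operation.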

You can see this by looking at web applications. A web application like a search engine is pretty resource-oriented. It exposes an infinite number of URIs that slice up the search engine data in various ways: by query, by page, by language searched. All of these URIs point to resources. You can't see any cracks between the message-oriented and resource-oriented models, because 1) the "messages" you're sending are standard HTTP requests, and 2) you can't modify the search engine data through the web interface, so you only use GET and you use it as intended.

Then again, consider NewsBruiser, the weblog program that publishes these words. I wrote NewsBruiser before the term "REST" was coined, but it's pretty typical of examples even today. I've got URIs like /nb.cgi/add, /nb.cgi/comment-add, /nb.cgi/configure, and so on. In general GETting one of these URIs gives you an HTML form, which you fill out and POST to the same URI to modify the data set.

Even putting aside the fact that you can't do PUT or DELETE with HTML forms, this is a very RPC-oriented application. It's got method names in the URIs. Yet when you're not changing the application you're using URIs like /nb.cgi/view/nycb/2006/11/03/1. That's got a method name ("view"), but it's redundant with the name of the HTTP method (GET), just like with the del.icio.us and Flickr examples. The rest of the URI describes a resource: the second News You Can Bruise entry of the third of November, 2006. Indeed, I've got a rewrite rule that cuts out all the redundant and default data so that you can use /2006/11/03/1 instead.
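
For the curious, that kind of rewrite can be sketched in a few lines of Python. This is not NewsBruiser's actual rule (which isn't shown here); the pattern, the function name expand, and the default notebook name are assumptions made for the example.

```python
import re

# Matches the short form: /YYYY/MM/DD/N
SHORT_ENTRY = re.compile(r"^/(\d{4})/(\d{2})/(\d{2})/(\d+)$")


def expand(path: str, notebook: str = "nycb") -> str:
    """Put the redundant and default parts back onto a short entry URI.

    Paths that don't look like a short entry URI pass through unchanged.
    """
    match = SHORT_ENTRY.match(path)
    if match is None:
        return path
    year, month, day, ordinal = match.groups()
    return f"/nb.cgi/view/{notebook}/{year}/{month}/{day}/{ordinal}"


print(expand("/2006/11/03/1"))
print(expand("/nb.cgi/add"))
```

Everything the rule puts back ("nb.cgi", "view", the notebook name) is either a default or a restatement of GET, which is why the short form loses nothing.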

So that's HTTP+POX. It looks like there are two main paths to this architecture. You might come to it by designing web services on the exact same principles as web applications. This overlaps with the resource-oriented architecture where that architecture coincides with the capabilities of web browsers (GETting data from resources identified with URIs, sending data to hard-coded locations). It departs from the resource-oriented architecture when it comes to things you can't do with a web browser (using PUT and DELETE instead of overloading POST so much, sending data to URIs that the client generated dynamically).

Or you might come to it through an attempt to simplify the RPC style. You might be doing pushbutton SOAP+WSDL services and one day say "screw only having a single endpoint; I'm going to have an endpoint for every procedure and it's going to be identified in the URI and then I won't need WSDL". Or you might say "screw putting an XML document in an XML box; I'm already using HTTP, I'll just use it as my box and then I won't need SOAP".

Now a word about the name. I've decided "HTTP+POX" is inaccurate and I think I won't use it in the book except insofar as I need to translate real-world terms into my wondrous precise terminology. Here's the problem: like "AJAX" of old, "HTTP+POX" hard-codes a reference to XML where there might not be any XML.

To reiterate one of my earlier points in new language: what's "Plain Old" about Plain Old XML? What's "Plain Old" is that it's a document, not a box with a document inside it. You've already got a perfectly good box: the HTTP request. But that's a statement about the box. "POX" is a statement about the contents, and those don't have to be XML. An HTTP+POX web service might serve plain text, JSON, HTML microformats, or graphic files in an HTTP box. This holds even if you don't buy my argument that HTTP+POX is an RPC-style architecture.

I can think of two different names. One involves doing what they did to AJAX and lowercasing the "POX", making it a word instead of an acronym: HTTP+Pox (HTTP can stay an acronym because you are using HTTP and only HTTP). This has the advantage that it's an almost invisible change. But depending on how cocky I feel about my classification of these services as RPC services, I may try to push the term HTTP+RPC. Because "Pox" doesn't mean anything, and I don't like the trend towards stripping acronyms of their meaning when it becomes inconvenient (as also happened to SOAP).

The next installments will cover what to do with terms like "Service-Oriented Architecture" in the face of definitions that would a priori prevent REST from getting any of the pie, and also finally tackle the question of overloaded POST in resource-oriented architectures. I think it's going to be a while until the last one, at least, shows up. Before I try to figure out overloaded POST I want to incorporate into the book text my thinking in the series so far.


Posted by Daniel Morrison at Mon Nov 13 2006 13:45

What about using the term 'Hybrid'? Since you're putting it right in the middle of the diagram, and describing it as a hybrid, it seems accurate.


Unless otherwise noted, all content licensed by Leonard Richardson
under a Creative Commons License.