
Return of the Son of Taxonomy of the Programmable Web: Previously, on Taxonomy of the Programmable Web, I put out a description of the Resource-Oriented Architecture based on its answers to these two questions: how is the request scoped, and how does the client convey to the server what it wants the server to do?

Now I'm going to put out a description on the same principles of the Message-Oriented Architecture. Then I think I can make a case for how to classify weird cases like HTTP+POX.

To state the obvious (though it stops being obvious if you or we call the MOA something else), the Message-Oriented Architecture is about passing messages back and forth. Client sends server a message, server takes some action and sends client a message in response. A SOAP box with stickers on it is an example of a message. An HTTP request with headers is also an example of a message.

In an ideal MOA there is no information outside of the message. So if the message is in a SOAP box being transmitted over HTTP, then all the features of HTTP are just distractions and should be used as little as possible. There should be no leakage from the message into the transport protocol.

In the ROA, the fundamental server-side object is the resource. An ROA service usually has an infinite number of resources. In the MOA, the fundamental server-side object is the message processor, sometimes called an endpoint. An MOA service exposes a small number of message processors: usually one. A message processor is the thing that has a URI (or, possibly, the thing that can be the target of the WS-Addressing SOAP sticker).

Side note: A resource is just a message processor with a very limited vocabulary. So there's a pretty easy conceptual translation between the ROA and the MOA. This makes sense because HTTP is itself a message-oriented protocol. Of course, almost all ROA services expose too many "endpoints" to list. You either generate their URIs according to a rule, or you follow links in a document the server sends you.

An MOA might have ten or 200 or a million endpoints, but it's not ROA unless it has an endpoint or "resource" for every object the client might manipulate. Any service that exposes an infinite number of endpoints has at least some of the ROA in it.

This ties back into the famous triangle of "nouns" (resources), "verbs" (HTTP methods), and content types. The ROA constrains the "verbs" and lets the "nouns" run free. The MOA constrains the "nouns" and lets the "verbs" run free.

Side side note: an ROA service can contain a finite number of endpoints, but only if there's a finite number of objects the client might manipulate. A static website (like your cat's homepage, or the site that serves Google Maps image tiles) is an ROA service with a finite number of endpoints. A web service version of Parmenides's model of the universe would only have one endpoint, because there is only one proper object of discussion.

Now let's consider how the MOA answers the two questions:

How is the request scoped? Why should the server do x to data D1 instead of data D2? Well, in the URI it's scoped to the message processor. If your MOA service is actually ROA then that's all you need, because the "message processor" is the "resource" is the part of the application you're operating on. Otherwise, you need some additional scoping, and this is done by putting information in the message itself. This might be in the box or it might be in one of the stickers; the details don't matter.

How does the client convey to the server what it wants the server to do? Why should the server do x to data D1 instead of doing y to data D1? This too is scoped by information within the message itself. Again, this might be in the box or in one of the stickers; it doesn't matter to the MOA. Unlike with the ROA, the vocabulary of this sort of information is not constrained at all. If you constrained it you'd need to create more endpoints to handle the same actions with the more limited vocabulary, and you'd move towards the ROA.

Since in the MOA both of these pieces of information go into the message, there's a tendency to conflate them, and indeed that's what I used to do.

To recap from last time: the ROA puts scoping information in the URI, and puts the action information in the HTTP method. The big exception seems to be overloaded POST, where the action information can go in the entity-body. I'm now fairly certain that overloaded POST in the ROA is best described in terms of the MOA; I'll try to cover that next time.
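Here's a rough sketch of the difference, nothing more. The Request class, the two functions, and the URIs are all made up for illustration, and I'm using JSON as a stand-in message format; a real MOA service would more likely use a SOAP box.

```python
import json
from dataclasses import dataclass


@dataclass
class Request:
    method: str     # the HTTP method
    uri: str        # the request URI
    body: str = ""  # the entity-body


def roa_interpret(request):
    """ROA: scoping comes from the URI, the action from the HTTP method."""
    return {"scope": request.uri, "action": request.method}


def moa_interpret(request):
    """MOA: the URI only names the message processor.

    Both the scoping and the action information are inside the message.
    """
    message = json.loads(request.body)
    return {"scope": message["scope"], "action": message["action"]}


# Hypothetical requests that ask for roughly the same thing.
roa_request = Request("DELETE", "/accounts/42")
moa_request = Request(
    "POST", "/service", json.dumps({"scope": "account 42", "action": "close"})
)

print(roa_interpret(roa_request))
print(moa_interpret(moa_request))
```

Note that moa_interpret never looks at the HTTP method or at anything in the URI past finding the processor, which is the "no leakage" ideal from earlier.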

Now I'd like to talk about a common design style for MOA services: the RPC style. In an RPC service, the scoping information corresponds to the information that would identify an object if this were an object-oriented program. The action information corresponds to a method that should be called on that object. As is proper with the MOA, both of these go into the message.

The message also contains some additional information, which corresponds to arguments passed into the procedure. The scoping information can be found among these arguments: things like the bank account number or the user ID. This is because an RPC service isn't really object-oriented. If it were, you'd be able to work directly with the objects (as, say, resources) instead of going through message processors. But with a little work you can get the same kind of "fake" OO you see in the GNOME project's C code, where the first argument to every function is a pointer to the "object". I put "fake" in scare quotes, possibly sending the most mixed message ever, because I think you can do real OO design in a non-OO system. But it looks strange, and you have to do work to get it.
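To make the "fake" OO idea concrete, here's a small sketch in Python rather than C. The Account class, the ACCOUNTS table, and the deposit procedure are all hypothetical. The point is only that the RPC-style procedure takes the account number as its first argument, where an object-oriented program would call a method on the object.

```python
class Account:
    """The object an object-oriented program would work with directly."""

    def __init__(self, balance):
        self.balance = balance

    def deposit(self, amount):
        self.balance += amount
        return self.balance


# Hypothetical server-side storage, keyed by account number.
ACCOUNTS = {"12345": Account(100)}


def deposit(account_number, amount):
    """RPC-style procedure: the first argument is the scoping information.

    It plays the role that the "object" pointer plays in GNOME-style C.
    """
    return ACCOUNTS[account_number].deposit(amount)


new_balance = deposit("12345", 25)
print(new_balance)
```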

In an RPC-style service, the HTTP request-response cycle is used to simulate a method or function call in a programming language. The request is the method invocation, and the response contains the return value of the method, or the exception it throws.
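A minimal sketch of that simulation, assuming a made-up message shape (a dict with "method" and "params" keys) and a made-up withdraw procedure. No real RPC library works exactly like this; it only shows the response carrying either the return value or the exception.

```python
def handle(message, procedures):
    """Simulate a method call: look up the procedure, call it, wrap the outcome."""
    try:
        procedure = procedures[message["method"]]
        return {"result": procedure(*message["params"])}
    except Exception as exc:
        # The "exception it throws" travels back inside the response message.
        return {"fault": f"{type(exc).__name__}: {exc}"}


def withdraw(balance, amount):
    """Hypothetical procedure exposed by the service."""
    if amount > balance:
        raise ValueError("insufficient funds")
    return balance - amount


procedures = {"withdraw": withdraw}

ok = handle({"method": "withdraw", "params": [100, 30]}, procedures)
failed = handle({"method": "withdraw", "params": [100, 300]}, procedures)
print(ok)
print(failed)
```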

We all know that method calls are not necessarily simple stuff where I ask you for information and you return it in a data structure. This "method call" might do anything: fetch information, store information, put a job in a processing queue, set up callbacks for later, etc. The response message might contain useful information, it might describe an exceptional condition, or it might just be a formality. If you register a callback the real data might be coming in later through the callback. So saying "RPC style" is not a negative value judgement about the capabilities of the style.

Let's take an easy case: XML-RPC. It's obviously an MOA architecture, since the only URI in an XML-RPC service is the URI to the XML-RPC service. It also rejects HTTP headers, methods, and status codes almost entirely, preferring to convey all information in its XML message document.

It's also obviously an RPC architecture, because 1) it says so right in the name, and 2) its messages are full of incriminating tag names like methodCall and params.
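You can see those tag names for yourself with the xmlrpc.client module in Python's standard library, which can serialize a call without talking to any server. The method name account.withdraw and its arguments are invented for the example.

```python
import xmlrpc.client

# Serialize a call to a hypothetical method with two arguments.
request_xml = xmlrpc.client.dumps((42, 100), methodname="account.withdraw")
print(request_xml)

# Parsing the document recovers the arguments and the method name.
params, method = xmlrpc.client.loads(request_xml)
print(method, params)
```

Everything the server needs, the method name and its arguments, is in the XML document; nothing is left for the URI or the HTTP method to say.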

Now, what about SOAP? Is SOAP an RPC style? No, that's a category error. SOAP came from XML-RPC (more accurately: they have a common ancestor) but they aren't the same kind of thing. XML-RPC is a way of describing method calls; SOAP is a way of sticking a message in a box and putting stickers on the box. This makes it useful for any MOA application, but it doesn't make it RPC.
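Here's a sketch of the box-and-stickers idea using ElementTree from the Python standard library. The envelope namespace is the SOAP 1.1 one; the box_up function, the PurchaseOrder payload, and the Priority sticker are assumptions I made up for illustration. The payload is deliberately a plain document, not a method call.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope


def box_up(payload, stickers=()):
    """Put an XML payload in a SOAP box and slap the stickers on it."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    header = ET.SubElement(envelope, f"{{{SOAP_NS}}}Header")
    for sticker in stickers:
        header.append(sticker)  # each sticker is a header block
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    body.append(payload)  # the message itself goes in the box
    return envelope


# Hypothetical payload and sticker, each in an example namespace.
payload = ET.Element("{urn:example:orders}PurchaseOrder")
ET.SubElement(payload, "{urn:example:orders}Item").text = "stapler"
sticker = ET.Element("{urn:example:stickers}Priority")
sticker.text = "high"

envelope = box_up(payload, [sticker])
xml_text = ET.tostring(envelope, encoding="unicode")
print(xml_text)
```

Nothing in box_up knows or cares whether the payload describes a method call, which is why SOAP by itself isn't an RPC style.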

So, riddle me this, Batman: how come basically the entire installed base of services that use SOAP also use the RPC style? Why does the average programmer think SOAP and then think RPC style?

SOAP is the message format of choice for automated tools that take Java or C# code you've already written, and with one click turn it into a web service. These tools are to web services as FrontPage is to web pages. The resulting "web service" exposes only one URI and accepts methods only in a very specific form that a person can't comprehend without a detailed map (a WSDL file) and a software tool on the other side (a WSDL client) that can read the map so you don't have to. When the software tool is done reading the map it's reconstructed something like the method signatures of the original Java or C# code.

This is why REST people dislike these services so much, and why that dislike often spills over into SOAP. This is why people tend to think that anything that doesn't fit this model is "REST". This is why I say these services are not really on the web: they're looking out at the web through a little peephole. To get any information out of them, you have to construct a very specific document, and then you have to put it in a box and put the right stickers on it.

These services have the same design as XML-RPC services. They use XML Schema instead of XML-RPC's idiosyncratic data serialization format, and they use WSDL instead of the listMethods extension to XML-RPC. All the work done to make SOAP nothing but a good way of putting an XML document in a box and slapping stickers on the box has passed these services by.

I don't like this because it's ugly and it pollutes the web. SOAP fans don't like this because it conflates SOAP with RPC when what they use SOAP for isn't RPC at all. I'm not really clear on what MOA styles they are using if not RPC, but I don't think it's going to come up a lot in a book about REST and the ROA, so I can just leave them to it and have Sam act as my lookout in case anything relevant to the book happens.

Who's the culprit here? I think it's those tools. They make it easy to get a web service that has the same interface as your code (that'd be a Procedure-Call interface that works Remotely), disregarding the fact that the architecture of the web doesn't look like that. Combine this disconnect with the complexity of the interface and the idiosyncrasies of the tools, and you've made it easy for someone else to work with your service if they've got the exact same setup, and very difficult otherwise. This is pretty disgraceful since the real-world power of the web comes from its ability to connect everyone together.

The secondary culprit is WSDL. WSDL doesn't enforce RPC but it makes life easy for tools that want to force SOAP into an RPC mold. Sam is ambivalent about WADL. I suspect he's afraid that WADL will make it easy to force HTTP into an RPC mold, that tools for doing this will proliferate, and that we'll never see resources again. Sure, WADL makes you specify your resources, but a smart tool could easily subvert that and create services with a single "resource" that handles a whole bunch of method calls.

I think the scenario I've ascribed to Sam is not inevitable. A lot of the mess we see today comes from historical contingencies like SOAP's heritage in XML-RPC. But to prevent a future without resources we need a basic book that's willing to go out and fight for them. It means we need to straighten out the terminology (even if only informally, and even if we only use the terminology locally) so we can create a frame for our argument.

Next time: HTTP+POX.



Unless otherwise noted, all content licensed by Leonard Richardson
under a Creative Commons License.