Return of the Son of Taxonomy of the Programmable Web: Previously, on Taxonomy of the Programmable Web, I put out a description of the Resource-Oriented Architecture based on its answers to two questions: how is the request scoped, and how does the client convey to the server what it wants the server to do?

Now I'm going to put out a description on the same principles of the Message-Oriented Architecture. Then I think I can make a case for how to classify weird cases like HTTP+POX.

To state the obvious (though it stops being obvious if you or we call the MOA something else), the Message-Oriented Architecture is about passing messages back and forth. Client sends server a message, server takes some action and sends client a message in response. A SOAP box with stickers on it is an example of a message. An HTTP request with headers is also an example of a message.

In an ideal MOA there is no information outside of the message. So if the message is in a SOAP box being transmitted over HTTP, then all the features of HTTP are just distractions and should be used as little as possible. There should be no leakage from the message into the transport protocol.

In the ROA, the fundamental server-side object is the resource. An ROA service usually has an infinite number of resources. In the MOA, the fundamental server-side object is the message processor, sometimes called an endpoint. A MOA service exposes a small number of message processors: usually one. A message processor is the thing that has a URI (or, possibly, the thing that can be the target of the WS-Addressing SOAP sticker).

Side note: A resource is just a message processor with a very limited vocabulary. So there's a pretty easy conceptual translation between the ROA and the MOA. This makes sense because HTTP is itself a message-oriented protocol. Of course, almost all ROA services expose too many "endpoints" to list. You either generate their URIs according to a rule, or you follow links in a document the server sends you.
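Here's a minimal sketch of that translation in Python. The class names, the three-method vocabulary, and the bank-account handlers are all made up for illustration (HTTP defines more methods than these); the only point is that a resource behaves like a message processor that refuses anything outside a fixed vocabulary.

```python
class MessageProcessor:
    """Handles any action it has been given a handler for (MOA style)."""

    def __init__(self, handlers):
        self.handlers = dict(handlers)

    def process(self, action, payload=None):
        if action not in self.handlers:
            raise ValueError(f"unknown action: {action}")
        return self.handlers[action](payload)


class Resource(MessageProcessor):
    """A message processor whose vocabulary is fixed in advance (ROA style)."""

    # Hypothetical subset of the HTTP methods, enough for the sketch.
    VOCABULARY = frozenset({"GET", "PUT", "DELETE"})

    def __init__(self, handlers):
        extra = set(handlers) - self.VOCABULARY
        if extra:
            raise ValueError(f"not in the uniform vocabulary: {sorted(extra)}")
        super().__init__(handlers)


# One endpoint, free-form "verbs".
bank = MessageProcessor({
    "getBalance": lambda payload: 100,
    "closeAccount": lambda payload: "closed",
})

# One of many resources, constrained "verbs".
account = Resource({"GET": lambda payload: 100})
```

Trying to build a Resource with a handler named "getBalance" raises a ValueError, which is the "very limited vocabulary" constraint in miniature.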

An MOA service might have ten or 200 or a million endpoints, but it's not ROA unless it has an endpoint or "resource" for every object the client might manipulate. Any service that exposes an infinite number of endpoints has at least some of the ROA in it.

This ties back into the famous triangle of "nouns" (resources), "verbs" (HTTP methods), and content types. The ROA constrains the "verbs" and lets the "nouns" run free. The MOA constrains the "nouns" and lets the "verbs" run free.

Side side note: an ROA service can contain a finite number of endpoints, but only if there's a finite number of objects the client might manipulate. A static website (like your cat's homepage, or the site that serves Google Maps image tiles) is an ROA service with a finite number of endpoints. A web service version of Parmenides's model of the universe would only have one endpoint, because there is only one proper object of discussion.

Now let's consider how the MOA answers the two questions:

How is the request scoped? Why should the server do x to data D1 instead of data D2? Well, in the URI it's scoped to the message processor. If your MOA service is actually ROA then that's all you need, because the "message processor" is the "resource" is the part of the application you're operating on. Otherwise, you need some additional scoping, and this is done by putting information in the message itself. This might be in the box or it might be in one of the stickers; the details don't matter.

How does the client convey to the server what it wants the server to do? Why should the server do x to data D1 instead of doing y to data D1? This too is scoped by information within the message itself. Again, this might be in the box or in one of the stickers; it doesn't matter to the MOA. Unlike with the ROA, the vocabulary of this sort of information is not constrained at all. If you constrained it you'd need to create more endpoints to handle the same actions with the more limited vocabulary, and you'd move towards the ROA.

Since in the MOA both of these pieces of information go into the message, there's a tendency to conflate them, and indeed that's what I used to do.

To recap from last time: the ROA puts scoping information in the URI, and puts the action information in the HTTP method. The big exception seems to be overloaded POST, where the action information can go in the entity-body. I'm now fairly certain that overloaded POST in the ROA is best described in terms of the MOA; I'll try to cover that next time.
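To make the recap concrete, here's a rough sketch of where a server would look for the two answers under each architecture. The Request shape, the /accounts/12345 and /service URIs, and the JSON message with "action" and "account" keys are my assumptions for illustration; a real MOA message could just as well be a SOAP box.

```python
import json
from dataclasses import dataclass


@dataclass
class Request:
    method: str     # HTTP method
    uri: str        # request URI
    body: str = ""  # entity-body


def roa_scope_and_action(request):
    """ROA: the scope is the URI and the action is the HTTP method."""
    return request.uri, request.method


def moa_scope_and_action(request):
    """MOA: the URI only names the message processor; both answers are in the message."""
    message = json.loads(request.body)
    return message["account"], message["action"]


roa_request = Request("DELETE", "/accounts/12345")
moa_request = Request(
    "POST",
    "/service",
    json.dumps({"action": "closeAccount", "account": "12345"}),
)
```

For the first request the answers are ("/accounts/12345", "DELETE"); for the second they are ("12345", "closeAccount"), and neither the URI nor the HTTP method told the server anything it needed.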

Now I'd like to talk about a common design style for MOA services: the RPC style. In an RPC service, the scoping information corresponds to the information that would identify an object if this were an object-oriented program. The action information corresponds to a method that should be called on that object. As is proper with the MOA, both of these go into the message.

The message also contains some additional information, which corresponds to arguments passed into the procedure. The scoping information can be found among these arguments: things like the bank account number or the user ID. This is because an RPC service isn't really object-oriented. If it were, you'd be able to work directly with the objects (as, say, resources) instead of going through message processors. But with a little work you can get the same kind of "fake" OO you see in the GNOME project's C code, where the first argument to every function is a pointer to the "object". I put "fake" in scare quotes, possibly sending the most mixed message ever, because I think you can do real OO design in a non-OO system. But it looks strange, and you have to do work to get it.

In an RPC-style service, the HTTP request-response cycle is used to simulate a method or function call in a programming language. The request is the method invocation, and the response contains the return value of the method, or the exception it throws.
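A minimal sketch of that simulation, with the HTTP layer left out. The message is a plain dict, the procedure names and account data are hypothetical, and the first argument to each procedure plays the part of the pointer to the "object".

```python
ACCOUNTS = {"12345": 100}  # hypothetical data behind the service


def get_balance(account_id):
    return ACCOUNTS[account_id]


def deposit(account_id, amount):
    if amount <= 0:
        raise ValueError("amount must be positive")
    ACCOUNTS[account_id] += amount
    return ACCOUNTS[account_id]


# The action information in the message selects one of these procedures.
PROCEDURES = {"getBalance": get_balance, "deposit": deposit}


def handle(message):
    """Turn a request message into a procedure call and a response message."""
    try:
        procedure = PROCEDURES[message["method"]]
        result = procedure(*message["params"])
    except Exception as error:
        # The response carries the exception instead of a return value.
        return {"fault": f"{type(error).__name__}: {error}"}
    return {"result": result}
```

Calling handle({"method": "deposit", "params": ["12345", 50]}) on fresh data gives back {"result": 150}; a negative amount or an unknown method name gives back a message with a "fault" key instead.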

We all know that method calls are not necessarily simple stuff where I ask you for information and you return it in a data structure. This "method call" might do anything: fetch information, store information, put a job in a processing queue, set up callbacks for later, etc. The response message might contain useful information, it might describe an exceptional condition, or it might just be a formality. If you register a callback the real data might be coming in later through the callback. So saying "RPC style" is not a negative value judgement about the capabilities of the style.

Let's take an easy case: XML-RPC. It's obviously an MOA architecture, since the only URI in an XML-RPC service is the URI to the XML-RPC service. It also rejects HTTP headers, methods, and status codes almost entirely, preferring to convey all information in its XML message document.

It's also obviously an RPC architecture, because 1) it says so right in the name, and 2) its messages are full of incriminating tag names like methodCall and params.
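You can see those tag names for yourself with the xmlrpc.client module in Python's standard library. This is only a sketch of the message documents; the procedure name bank.deposit, its arguments, and the fault text are made up, and no HTTP request is sent.

```python
import xmlrpc.client

# Hypothetical procedure name and arguments, serialized as a methodCall document.
request_xml = xmlrpc.client.dumps(("12345", 50), methodname="bank.deposit")

# A server decodes the same document back into the arguments and the name.
params, method_name = xmlrpc.client.loads(request_xml)

# A return value travels back as a methodResponse document...
response_xml = xmlrpc.client.dumps((150,), methodresponse=True)

# ...and an exception travels back as a fault inside a methodResponse.
fault_xml = xmlrpc.client.dumps(xmlrpc.client.Fault(1, "no such account"))
```

Printing request_xml shows the methodCall, methodName, and params elements; everything the server needs is in that one document.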

Now, what about SOAP? Is SOAP an RPC style? No, that's a category error. SOAP came from XML-RPC (more accurately: they have a common ancestor) but they aren't the same kind of thing. XML-RPC is a way of describing method calls; SOAP is a way of sticking a message in a box and putting stickers on the box. This makes it useful for any MOA application, but it doesn't make it RPC.
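To show how little the box itself has to do with RPC, here's a sketch that builds a SOAP 1.1 envelope with Python's xml.etree.ElementTree. The put_in_box helper, the PurchaseOrder document, and the Priority sticker (with its example.com namespace) are hypothetical; only the envelope namespace comes from SOAP 1.1.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace


def put_in_box(document, stickers=()):
    """Wrap an XML document in a SOAP envelope; each sticker becomes a header block."""
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    if stickers:
        # The Header, when present, comes before the Body.
        header = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Header")
        header.extend(list(stickers))
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    body.append(document)
    return envelope


# Hypothetical payload and sticker; nothing in here says "method call".
order = ET.fromstring("<PurchaseOrder><item>cat food</item></PurchaseOrder>")
priority = ET.fromstring('<Priority xmlns="http://example.com/stickers">high</Priority>')
box = put_in_box(order, [priority])
```

ET.tostring(box) gives you the document to send. The body here is an arbitrary XML document, not a procedure name plus parameters, which is the distinction between SOAP and XML-RPC.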

So, riddle me this, Batman: how come basically the entire installed base of services that use SOAP also use the RPC style? Why does the average programmer think SOAP and then think RPC style?

SOAP is the message format of choice for automated tools that take Java or C# code you've already written, and with one click turn it into a web service. These tools are to web services as FrontPage is to web pages. The resulting "web service" exposes only one URI and accepts methods only in a very specific form that a person can't comprehend without a detailed map (a WSDL file) and a software tool on the other side (a WSDL client) that can read the map so you don't have to. When the software tool is done reading the map it's reconstructed something like the method signatures of the original Java or C# code.

This is why REST people dislike these services so much, and why that dislike often spills over into SOAP. This is why people tend to think that anything that doesn't fit this model is "REST". This is why I say these services are not really on the web: they're looking out at the web through a little peephole. To get any information out of them, you have to construct a very specific document, and then you have to put it in a box and put the right stickers on it.

These services have the same design as XML-RPC services. They use XML Schema instead of XML-RPC's idiosyncratic data serialization format, and they use WSDL instead of the listMethods extension to XML-RPC. All the work done to make SOAP nothing but a good way of putting an XML document in a box and slapping stickers on the box has passed these services by.

I don't like this because it's ugly and it pollutes the web. SOAP fans don't like this because it conflates SOAP with RPC when what they use SOAP for isn't RPC at all. I'm not really clear on what MOA styles they are using if not RPC, but I don't think it's going to come up a lot in a book about REST and the ROA, so I can just leave them to it and have Sam act as my lookout in case anything relevant to the book happens.

Who's the culprit here? I think it's those tools. They make it easy to get a web service that has the same interface as your code (that'd be a Procedure-Call interface that works Remotely), disregarding the fact that the architecture of the web doesn't look like that. Combine this disconnect with the complexity of the interface and the idiosyncrasies of the tools, and you've made it easy for someone else to work with your service if they've got the exact same setup, and very difficult otherwise. This is pretty disgraceful since the real-world power of the web comes from its ability to connect everyone together.

The secondary culprit is WSDL. WSDL doesn't enforce RPC but it makes life easy for tools that want to force SOAP into an RPC mold. Sam is ambivalent about WADL. I suspect he's afraid that WADL will make it easy to force HTTP into an RPC mold, that tools for doing this will proliferate, and that we'll never see resources again. Sure, WADL makes you specify your resources, but a smart tool could easily subvert that and create services with a single "resource" that handles a whole bunch of method calls.

I think the scenario I've ascribed to Sam is not inevitable. A lot of the mess we see today comes from historical contingencies like SOAP's heritage in XML-RPC. But to prevent a future without resources we need a basic book that's willing to go out and fight for them. It means we need to straighten out the terminology (even if only informally, and even if we only use the terminology locally) so we can create a frame for our argument.

Next time: HTTP+POX.

Posted by Riana at Thu Nov 09 2006 13:47

Weird. I just started my class notes for today, and since we're on like day four of federal question jurisdiction, today's notes are titled "Return of the Son of Federal Question Jurisdiction." Pop culture-soaked minds think alike...

Posted by Pete Lacey at Thu Nov 09 2006 14:40

Brilliant, Leonard, absolutely brilliant. I have a nit and some answers for you.

Nit: You write "In an RPC service, the scoping information corresponds to the information that would identify an object if this were an object-oriented program. The action information corresponds to a method that should be called on that object. As is proper with the MOA, both of these go into the message."

Not quite: The way SOAP is typically implemented is such that the object is identified by the URL. This is obviously not true in the case of WS-Addressing, but that's not yet typical. The action (method) identifier is placed in the message somewhere (sometimes the HTTP header, sometimes not).

Answers: You ask "how come basically the entire installed base of services that use SOAP also use the RPC style? Why does the average programmer think SOAP and then think RPC style?"

The answer is partly historical: there was a time in SOAP's history when it was explicitly an RPC mechanism. (And, as I understand it, XML-RPC is the son of SOAP, not the other way 'round.) While it has since morphed into a generic message passing system, this was not always the case. Even the SOAP 1.1 spec (still the most prevalent) is aggressively RPC-oriented, if not explicitly. The very first meaningful section of the 1.1 spec, section 1.3, reads: "The request takes a string parameter, ticker symbol, and returns a float in the SOAP response." Parameter! Float! And look at the following example, where the envelope body looks like this:


Then, of course, there's section 7, "Using SOAP for RPC." And section 5, which is not just an encoding style, but a data/object serialization style.

Or, as Don Box put it, "Once we had these representational types in place, we modeled behavioral types by defining operations/methods in terms of pairs of structs and, at least on the DevelopMentor and Microsoft sides, aggregated these operations into interfaces. Hence the RPC flavor that people associate with SOAP."

You write: "Who's the culprit here? I think it's those tools."

This is very, very true. The tools encourage developers to autogenerate SOAP interfaces from code. Going as far as, in the Java world anyway, automatically exposing stateless session beans as services, if you so desire. But the tools didn't write themselves. Put the blame where it belongs.

Also, the tool vendors aren't solely to blame. All developers grok APIs. Many developers grok RPC style communication. They don't understand resources and message passing and what-not. With the help of their tools, they knowingly create stateful, tightly-coupled APIs and expose them on the network. Doing this over HTTP gets them a tunnel through the firewall. Using SOAP gets them a better chance at interoperability. But it's RPC all the way, baby.


Posted by Pete Lacey at Thu Nov 09 2006 14:44

That "DIS" sitting all by itself there is but one bit of a code snippet I included, but I didn't escape it. Sorry. You can see the example yourself here.


Posted by Leonard at Thu Nov 09 2006 15:03

I think blaming the (anthropomorphized) tools is a good way to split the blame between the programmers and the tool vendors, and I don't want to make this a book about apportioning blame; but yes, there is supply-side and demand-side blame to go around.

"The way SOAP is typically implemented is such that the object is identified by the URL." Do we mean different things by "object"? I'm talking about the underlying piece of the application you're modifying: the bank account or the queue or whatever. The thing that would be the "object" or "business object" in an OO program.

If every one of those things had a URI then you might not be RESTful but you'd be fairly resource-oriented (I think this is what a lot of HTTP+POX applications do). But most SOAP/RPC applications expose only one URI, corresponding to one mega "object" that handles a wide variety of requests and dispatches to real OO objects behind the scenes. And to identify which real OO object, you need to put some extra data in the message.

However that just means I've been letting current SOAP usage leak into my discussions of things like MOA.

Posted by Mark Baker at Mon Nov 13 2006 12:40

Some thoughts, stream-of-consciousness style...

"In an ideal MOA there is no information outside of the message."

Do you mean that MOA is stateless? If not, I don't understand what you mean.

I understand the view of MOA that you're espousing here, and its comparison with ROA. Another comparison would be to describe ROA as a more tightly constrained form of MOA. Then you could examine the architectural constraints that differ (REST's interface constraints) and the change in architectural properties they induce. I suppose though, that your audience for this book is those who don't understand software architecture to that extent. Just a thought ...

"How is the request scoped?"

By the description, it looks like you're asking which "part" of the server software processes the message. Interesting criterion, but I don't see why most folks would think that was important (we know it is, of course). Plus the name, "scope", doesn't really describe it well IMO; I initially thought you were referring to

I've also noticed that you use a multitude of names for operations (verbs, methods). Because it's happened to me, I expect you're probably losing some of your audience by doing this. Ideally, I'd try to stick to one name, but failing that, at least say somewhere (more than once) that they're synonymous. Or just use, e.g., "verb" in the context of the noun/verb discussion but say it's synonymous with operation/method.

Good stuff again though.

Posted by Leonard at Mon Nov 13 2006 13:09

I edited this entry hopefully to resolve your first two problems. As for the third, I do struggle with this problem. I'm trying to use "methods" everywhere because that's what it says in the HTTP standard. I only said "verbs" in the side note because I was quoting the triangle. I realize that "methods" in HTTP conflicts with "methods" in computer programming, but I'm trying to say "HTTP method" everywhere there might be ambiguity.

The questions you bring up about MOA and ROA are the ones I'm wrestling with as I try to convert this series of freeform and fairly highbrow weblog entries into a single coherent lecture, one that won't lose inexperienced programmers and that doesn't go into huge detail about things tangential to REST and the ROA.

Posted by Leonard at Mon Nov 13 2006 13:11

Incidentally, Pete, I think you had another comment here which I accidentally deleted in an anti-comment-spam binge. I'm sorry about that and I'd like you to repost your thoughts if you can recall them.


Unless otherwise noted, all content licensed by Leonard Richardson
under a Creative Commons License.