How to Follow Instructions

A talk about hypermedia, code-on-demand, and their evil twins.

I gave this talk at QCon in June 2012. The text is edited from an automatic transcript of the audio. The original slides (with my notes) are available here, and a video of this talk is online.


In 2006, I wrote a story about the future called RESTful Web Services. It was a tough pitch. As I recall, the publisher wasn't sure if people were going to buy into this vision of the future, but fortunately, a lot of people did, and now there are eight O'Reilly books about REST, including mine and Mike's.

But there are still parts of this book from 2006 that read like science fiction. In this talk, I'm going to focus on those parts.


More recently, I wrote another story about the future. This is my plug. This is Constellation Games, my first novel. It is a story about video game programming and space aliens. I recommend you check it out if that sounds interesting, but this is why I'm nervous. For the last two years, instead of programming for money, I have been trying to make people I made up have realistic-sounding conversations, which is a very different skill.


I'm going to talk a lot in this talk about links. A link is a story about the future. This link says that if you make a certain HTTP request, something will happen. You will get the next thing in a sequence. I'm talking about links because links are probably the biggest part of RESTful Web Services that still reads like science fiction.

But before we get to the stories about the future, I'm going to tell a story about the past.


I'm 12 years old, early '90s, and I get this worksheet in algebra class. It's busy work. It's not being graded. It's just to keep us busy, and it's in purple mimeograph ink because my school doesn't have a Xerox machine, and it's a test of your ability to follow instructions.

We're given this, and my algebra teacher says, go to it.

Number one, read every instruction before you do anything. All right, I can do that. I read, I read, I read, and I notice what you've probably noticed.

Instruction number 20 contradicts instructions two through 19.

At this point, I have two choices.

I can be intimidated by instruction number 20 with its STOP and its exclamation marks, or I can say, screw number 20.

I'm not going to let number 20 scare me into settling for one point out of twenty. I'm going to follow as many instructions as possible.

I fill out number two, number three, number four, until I get to 20. Then I stop. I don't write anything else on the paper, and I turn it in. And my teacher tells me I'm not very good at following instructions. Little does she know I can follow instructions just fine. I'm just a smartass.

But I'm not here to relitigate this busy-work worksheet from seventh grade. Instead, I want to talk about ways of following instructions, ways of writing instructions down, and ways of deciding which instructions to follow and which to ignore.


A few months ago, a company called Parse did a publicity stunt. I've adapted this slide from it. It's very similar to what they actually said.

This company is looking for people to work in the exciting field of REST APIs, and so they have this little puzzle you can solve. You construct a JSON document that has a certain format, and you post it to this URL, and then they look at what comes through on the wire, and they'll contact you and set up an interview. It's cute. They set up this little hoop you can jump through.


I'm not going to apply for this job, which is probably filled anyway, but I did write some Python code to carry out the instructions. This is a Python script: it contains my resume, constructs a JSON document, and POSTs it.
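
Something along these lines; this is a minimal sketch, not the script from the slide, and the URL and field names are made up for illustration:

    import json
    import urllib.request

    # Hypothetical resume data and endpoint; the real instructions
    # specified their own URL and document format.
    resume = {
        "name": "Jane Developer",
        "email": "jane@example.com",
        "resume": "Ten years of experience following instructions.",
    }
    request = urllib.request.Request(
        "http://example.com/jobs/apply",
        data=json.dumps(resume).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    urllib.request.urlopen(request)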


Now, I'm going to modify the instructions that I just showed you very slightly. It's almost the same. All I've done is change this media type. Instead of application/json, they want form-encoded data.


So you make a form-encoded document, and you post it to the special URL, and it's almost exactly the same. You just have to use Python's urllib library to build the document instead of Python's JSON library.
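
The script is almost identical; a sketch under the same assumptions:

    import urllib.parse
    import urllib.request

    # Same hypothetical data; the only real change is how the
    # document is built and labeled.
    resume = {
        "name": "Jane Developer",
        "email": "jane@example.com",
        "resume": "Ten years of experience following instructions.",
    }
    request = urllib.request.Request(
        "http://example.com/jobs/apply",
        data=urllib.parse.urlencode(resume).encode("utf-8"),
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )
    urllib.request.urlopen(request)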

And now, the trick.

I'm going to show you those instructions a third time.

They're going to be exactly the same as these instructions, but they're going to look radically different.


Here they are.
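
(The slide's exact markup isn't in this transcript; this reconstruction keeps the same hypothetical URL and field names as before.)

    <form action="http://example.com/jobs/apply" method="post"
          enctype="application/x-www-form-urlencoded">
     Name: <input name="name"/> <br/>
     Email: <input name="email"/> <br/>
     Resume: <textarea name="resume"></textarea> <br/>
     <input type="submit" value="Apply"/>
    </form>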

You can write code to follow these instructions. It will look a lot like the code I just showed you, but you are not supposed to write code to follow these. The instructions themselves are machine-readable.


You're supposed to use a web browser, and when you use a web browser, you realize that this puzzle is incredibly boring. It's a form that you fill out to apply for a job. It's not fun anymore.


The machine-readable instructions removed all of the puzzle elements and left you with the parts that a computer can't do. You have to gather information and actually apply for this job.

Here is my question: why is it that any college-educated person in the world can fill out an HTML form to apply for a job, but making a nearly identical HTTP request to a web service qualifies you for a job developing web services?

This seems like a stupid problem. I am kind of embarrassed to even be talking about it, but we have not collectively been able to agree on the solution, and our inability to agree on a solution is preventing us from solving much more difficult problems.

Later, I'm going to talk about why we haven't been able to achieve agreement. I'll give my guesses, but first, I want to talk about the instructions themselves.


Part One: How to Recognize Instructions

In 2008, at a QCon talk, I classified web services into a hierarchy, so you could look at one really quickly and make a snap judgment.

I'm going to do more classifications in this talk. I think there are about three of them, but they're not hierarchical. I'm not saying number two is better than number one. I'm going to start by talking about five kinds of instructions, and if you want value judgments for these, I'm going to say that #1-#3 are bad, and #4 and #5 are good.


Let's look at #1, human-readable text. This is a stark example of human-readable text from 1995. This is the Usenet FAQ from alt.pagan, the newsgroup, and it's telling you how to download the FAQ.

This file is available via anonymous FTP to the host ftp.cc.utexas.edu, in the directory pub/minerva.

In this talk, I'm going to take a hard line. I'm going to say that human-readable instructions like this do not count as instructions at all, because you are human beings. You already know how to follow instructions. That's not a hard problem. The hard problem is getting a computer to understand instructions, and this string has no meaning to a computer.

A computer can't understand English, even though it can understand FTP. You need separate machine-readable instructions for starting an FTP session and downloading a file.


Fortunately, in 1994, RFC 1738, the URL standard, was published. It defines a way to represent a complete FTP session in a short, machine-readable string.
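
For example, the alt.pagan instructions above collapse into a single machine-readable string:

    ftp://ftp.cc.utexas.edu/pub/minerva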


In 1993, we saw the first Internet-Draft for HTML, which lets you combine these (at the time unstandardized) URLs with human-readable text.

URLs existed in 1995 when this Usenet FAQ was written, but they were new and scary technology. People were still writing prose to do the work of a URL.


Now, although I'm saying that human-readable instructions are bad, we can't do without them altogether, because human-readable instructions are what we use to define the machine-readable instructions. We have an RFC that defines the URL standard, and we have an Internet-Draft for HTML.

You can't do away with human-readable instructions altogether, which is good because that gives us humans something to do. But the amount of human-readable documentation I see as a consumer for web services is out of control. People are still writing prose to do the job of a URL. So I am taking, as I said, a hard line against human-readable instructions in this talk. It's not that they're bad per se, but let's see what's possible without them.


The second type of instructions is native-language bindings, the other dominant form of instructions. These are client libraries, generally written by the people who designed the web service. The user downloads them, installs them locally, and then uses them in their own programs. They are machine-readable instructions for turning the code you write, you the consumer, into HTTP requests against the web service.

I think these are bad. They're not bad the way poisonous snakes are bad; they're bad the way fast-food restaurants are bad in neighborhoods that don't have grocery stores. They are compensating for a failure earlier in the process.


The third type of instructions is service descriptors. Service descriptors are instructions for generating a native language binding. They are instructions for generating instructions for turning code into HTTP requests.

These are also bad. You may already feel that way. I'm going to explain why I think they're bad later.

So those are the three kinds I'm saying are bad. Now the two good ones.


Number four is hypermedia, which we have heard about today. Hypermedia is a language-neutral description of an HTTP request you might make in the future.


Fifth, this is one I didn't mention much in RESTful Web Services, because I didn't realize how important it was going to be: code-on-demand.

This is executable code, which is served alongside the data, the same way that hypermedia is extra data served alongside the regular data. Code-on-demand is literally instructions for your computer to follow. It's as far from human-readable instructions as you can get.


This is just a quick example to show what I mean, because although I am probably the worst JavaScript programmer in this room, maybe some of you haven't thought of JavaScript in terms of code-on-demand. I'm going to use code-on-demand to cheat at the job application puzzle.

I presented it before, with pure HTML, but I had to change the rules to get it to work. I had to say it was allowed to submit a form-encoded document instead of a JSON document. Now we're going to do it properly and submit a JSON document.

So there's a <script> tag, and there's a JavaScript function. It gathers information from an HTML form, and it POSTs that information using XMLHttpRequest.


And here is a hypermedia form that triggers the request when you click the 'submit' link.


Clicking the link triggers the send_json() function. Now you can fill out the form like normal, and they will not know that you don't know anything about APIs until you come in for the interview.
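
The slide's actual code isn't in this transcript, but the idea looks something like this; the form fields and details are reconstructed around the send_json() function the talk describes:

    <form id="application">
     Name: <input name="name"/> <br/>
     Resume: <textarea name="resume"></textarea> <br/>
     <a href="#" onclick="send_json(); return false;">submit</a>
    </form>

    <script>
    // Gather information from the HTML form and POST it as JSON,
    // using XMLHttpRequest instead of HTML's own submission rules.
    function send_json() {
      var form = document.getElementById("application");
      var data = {
        name: form.elements["name"].value,
        resume: form.elements["resume"].value
      };
      var request = new XMLHttpRequest();
      request.open("POST", "http://example.com/jobs/apply");
      request.setRequestHeader("Content-Type", "application/json");
      request.send(JSON.stringify(data));
    }
    </script>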

So, HTML has limitations. You can't make a PUT or DELETE request with pure HTML. Another limitation is that you can't send a JSON document. I evaded the JSON limitation by serving client code that creates and sends the JSON representation, along with the HTML form.

Code-on-demand is not a complete substitute for hypermedia. You still need hypermedia to explain how to trigger one bit of code or another. One of the big advantages of code-on-demand is that it lets the server use hypermedia to describe computational actions other than making HTTP requests. Remember, hypermedia is a story about what might happen in the future.


So the first thing I said was that I was just going to stipulate that human-readable documentation was bad. What about the other four kinds of instructions? There's an interesting relationship between them.

Descriptors and hypermedia are both documents that you download that describe HTTP requests you might make in the future. Why are descriptors bad when hypermedia is good?

Native language bindings and code-on-demand are both code you download, and then decide which bits to run. Why are native language bindings bad when code-on-demand is good?


The difference is when the download happens.

When you download things ahead of time and install them on your computer, you're setting them in stone. Changes to the server side will break your installation.

Hypermedia and code-on-demand are downloaded at runtime, so if they change, the client will automatically adapt. The client might still break, but it won't break because it has outdated instructions.

WSDL in particular has a bad rap. People think of it as being brittle. But you could write a service descriptor in HTML and it would still be brittle. The problem's not WSDL. Brittleness enters the system when you download instructions from the server and install them permanently on the client.


This is a summary slide so we can catch our breath. Now we're going to move on to part two:


Part Two: How to Write Instructions

What are instructions for?

What do they say?


I think Mike [Amundsen] does a really good job of explaining this. All of the H factors Mike identified are ways of describing HTTP requests that the client might make in the future.

I'm not going to go into detail because Mike has it pretty well covered in his book, but I'm going to show some popular media types and show how they measure up in terms of support for Mike's H factors.


First, AtomPub. AtomPub defines three media types and between them they support five of these factors.


HTML has "hyper" right in the name so it does better; it supports seven of the factors.


This is the one Mike didn't show you because Mike is too nice. This is JSON and XML. They do not support any hypermedia factors. You can turn them into hypermedia data types through extensions, but it's not built in.


There's no single registered media type that contains all nine of these factors. Why is this? Three reasons.

First, the H factors were identified very recently. They're sort of like the elements way on the end of the periodic table. You don't know about the weird ones, you don't really think about them, you don't design using them.

Second, media types are generally designed for some domain-specific purpose. They're not designed to represent any possible HTTP request someone might make for any reason.

The third reason is that people apparently do not care about this stuff. I'm going to illustrate this apathy in a couple of different ways.


This is from a bug filed against HTML5 that resulted in PUT and DELETE being removed from HTML5. An update was planned that would have added one of the missing H factors, but because no one had ever done that before, there was no clear use case, and it got taken out of HTML.

Because of this, HTML just cannot tell you that you might want to send a PUT or DELETE request in certain circumstances. It's under a gag order. You have to combine HTML with code-on-demand to give those instructions.


But you can't just blame the standards people. We as programmers also don't care.

This is a graph I made from data I got from the good folks at ProgrammableWeb who have an enormous database of pretty much all public web services in the world.

The X-axis is the month that ProgrammableWeb discovered a web service. The Y-axis is the number of services discovered that month, excluding SOAP and XML-RPC services. The colors correspond to the media types that these newly discovered services serve.

Obviously the big format is magenta: JSON. Next is green, services that will serve you either XML or JSON. The yellow is services that will only serve you XML. Orange is HTML, the rich hypermedia type, and blue is Atom, the system designed for web services.

HTML and Atom are nowhere. They're flat and they're very small. XML has seen some mild growth, but around 2010 JSON explodes, and the combination of XML and JSON explodes along with it.

But these are the two media types that have no hypermedia affordances. What's more, someone who offers you just a choice of XML or JSON is probably not that concerned with adding the (very different) extensions you'd need to add to XML or JSON to get hypermedia support.

So without exhaustively looking at every single one of these services, I think I can say that these people, most of whom buy into the REST buzzword, are not using the single biggest innovation provided by REST.

People would rather have JSON.


This is a strange mix of early adopter enthusiasm and extreme conservatism. Why would people rather have JSON?

This is my theory: if you take an old SOAP WSDL service and you rewrite the SOAP documents as JSON documents and you rewrite the WSDL as human-readable documentation, you will get something that looks a lot like your typical modern "REST" API.

I think people secretly liked SOAP. They liked pushing a button and publishing their internal object model, but SOAP+WSDL was too heavy. There was all of this XML, it was impossible to see what was going on, so we found an easier way to do it: HTTP+JSON.


There's another reason for JSON's rise.

There is a lot of dark matter in this graph, which I've illustrated with a squiggly green line. There's a huge number of web services that ProgrammableWeb does not track, because the services are not public. Specifically, the people who write the servers are the same people who write all the client software.

In this environment, JSON makes a lot of sense. Due to the architecture of this environment it doesn't matter that JSON has no hypermedia affordances. And since this architecture is now so popular, it's being applied in areas where it doesn't make so much sense.


I am talking about websites.


The analogy between websites and web services is not new. We used it in RESTful Web Services, and Sam [Ruby] and I did not invent it. But I want to take you back to 2005, before Ajax, represented here by a modern Wikipedia page.

The way it works is you go to the browser, you hit the home page, you make an HTTP request, the server sends you some hypermedia, and you see a form rendered before you. You fill out the form, you click submit, make another HTTP request, there's a page refresh and the server sends you more hypermedia which is then rendered. The cycle repeats. Pretty simple.


But things have gotten a lot more complicated since 2005, due to Ajax. Now, when you hit the home page, you are served a combination of hypermedia (HTML) and code-on-demand (JavaScript).

You fill out the form and click submit as before, but instead of directly triggering an HTTP request through the HTML rules, when you click that submit button, it triggers a little bit of JavaScript instead.

The JavaScript does the work. It makes an HTTP request using XMLHttpRequest. There's no page refresh. The server sends JSON, usually, and the JavaScript runs again and fiddles around with the document's DOM.


This is a perfectly legitimate web service of the RESTful variety. I mentioned before that we didn't cover code-on-demand a lot in RESTful Web Services. That's just because I didn't see this coming. This is not very friendly for a human being who's not an employee of this company trying to figure out what's going on here, but architecturally it's very sound.

Here's the problem:

When I say this is a fine web service, I'm talking about this entire system. I'm talking about the browser client, which is being served HTML and JavaScript; and I'm talking about the XMLHttpRequest code inside the browser, which is being served JSON. Both parts of the system are needed.

This is why it's okay that JSON has no hypermedia affordances: because you're also getting HTML. That's where the hypermedia comes from. This system is serving a huge amount of hypermedia and code-on-demand.


I think people look at this design and say, hey, we already have an API. We've exposed our data model through HTTP so that the client running in our user's web browsers can get to it.

This is why the term "API" has bothered me for a while. If you look at this, yes, it's an Application Programming Interface, but it's not something you can present to users. The instructions, the hypermedia and the code-on-demand, are missing.


What do we do to replace the missing hypermedia, which wasn't suitable for showing to people? We write human-readable documentation! That's why I'm so down on human-readable documentation in this talk: frequently this documentation is a human-readable version of hypermedia documents that already exist.


Part Three: Why to Follow Instructions

In part three, I want to explore what we are giving up when we get rid of machine-readable instructions.

I want to explore what instructions do, and to that end, I'm going to show you two choices.


Here are two hypermedia controls. There is a red button and a blue button. These are your choices.


Which button do you push?

Do you cut the red wire or the blue wire?

Do you take the red pill or the blue pill?

Maybe you should leave the whole thing alone.

How do you know?

I can think of a few ways.


First, human-readable documentation! Again, you can read a doc that explains the red form and the blue form and every form in the system, and then you can write a client based on this knowledge.

And, again, all the ideas I'm going to present are ultimately backed up by human-readable docs. A library that understands HTML forms didn't come out of nowhere. A human being read the HTML standard, and used their interpretation of the standard to program a piece of software with knowledge of these hypermedia affordances.

But not every single control needs its own human-readable docs. That would get way out of hand. So, instead, we create standards. We create rules for what counts as an instruction and what instructions mean, and we agree to follow instructions when we see them. We signal our compliance with these standards in machine-readable ways, such as the Content-Type header that says which media type a document conforms to.

So again, I'm just going to say human-readable documentation is bad by fiat, because that's what you use when nothing else works.


Number two: HTTP's uniform interface.

The human-readable doc for this one is RFC 2616. [Now RFC 7230 et al.] The client tells the server it wants to GET some data, and the server sends data. The client could say PUT or DELETE instead, and the server could comply, because that's allowed for in the RFC.

This may not sound like much, but the idea that you should say GET if you want to get some data was enough to kill off a lot of early Internet protocols.
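
From the client's side, the entire uniform interface fits in the method name; a minimal sketch, with a hypothetical URL:

    import urllib.request

    # Three different intents against the same resource, expressed
    # entirely through HTTP's standard method names.
    url = "http://example.org/some/resource"
    get = urllib.request.Request(url, method="GET")
    put = urllib.request.Request(url, data=b"new state", method="PUT")
    delete = urllib.request.Request(url, method="DELETE")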


Number three: link relations.

The IANA has a human-readable document that defines a lot of common relationships that might obtain between two documents.

A link connects two documents, and the link relation says what the relationship between them is. You can program a client to understand these relationships, and then you can have it do things like go to the end of a chain.

You program that behavior by following next, next, next, next, until there's no more next link.
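
Here's a sketch of that client, assuming the "next" links show up in HTML; a client for another media type would look for the same relation in that type's own hypermedia controls:

    import urllib.request
    from html.parser import HTMLParser
    from urllib.parse import urljoin

    class NextLinkFinder(HTMLParser):
        # Remembers the first <a> or <link> whose rel includes "next".
        def __init__(self):
            super().__init__()
            self.href = None

        def handle_starttag(self, tag, attrs):
            attrs = dict(attrs)
            rels = (attrs.get("rel") or "").split()
            if tag in ("a", "link") and "next" in rels and self.href is None:
                self.href = attrs.get("href")

    def follow_to_the_end(url):
        # Keep following "next" until there's no more next link.
        while True:
            with urllib.request.urlopen(url) as response:
                finder = NextLinkFinder()
                finder.feed(response.read().decode("utf-8", "replace"))
            if finder.href is None:
                return url
            url = urljoin(url, finder.href)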


Number four is the big one: media types.

Media types don't just tell you how to parse a document. They tell you what the document means. The example I've chosen, RFC 5545, describes calendaring events. You can implement this standard, and then instead of writing a whole bunch of human-readable docs saying what a calendar is and what an event is, you just point to the RFC. Then your users can use a library somebody else wrote against the same RFC, like Python's icalendar library.

The interesting thing is that since RFC 5545 is a domain-specific standard, a client that understands this RFC doesn't just know how to parse this document. It knows what the document means. It knows where to find the location of an event. It can figure out whether an event has already happened by looking at the end date.

If you combine the text/calendar standard with the HTTP standard, a client also knows how to modify the location of an event. You modify the document according to the rules of RFC 5545, and then you PUT it back to its original location using the rules of RFC 2616.
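
A deliberately naive sketch of that combination; a real client would use an iCalendar library rather than rewriting lines, and the URL here is hypothetical:

    import urllib.request

    url = "http://example.org/calendars/conference.ics"
    with urllib.request.urlopen(url) as response:
        event = response.read().decode("utf-8")

    # Modify the document according to the rules of RFC 5545...
    event = "\r\n".join(
        "LOCATION:Room 204" if line.startswith("LOCATION:") else line
        for line in event.split("\r\n")
    )

    # ...and PUT it back where it came from, per the rules of HTTP.
    request = urllib.request.Request(
        url,
        data=event.encode("utf-8"),
        headers={"Content-Type": "text/calendar"},
        method="PUT",
    )
    urllib.request.urlopen(request)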

It's not that no human-readable documents are necessary; it's that those documents were already written, years ago. This scenario just calls them into effect.


Compare this to a JSON representation of the same data. It certainly looks nicer, and it has its advantages, but the JSON standard does not define the semantics of a calendaring event, nor should it. JSON is a data format.
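
Something like this; a reconstruction, since the slide itself isn't in the transcript:

    {
     "summary": "A talk about hypermedia",
     "location": "Room 204",
     "dtstart": "2012-06-20T10:00:00Z",
     "dtend": "2012-06-20T11:00:00Z"
    }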

That means a JSON library can parse this document but can't understand it. You would have to also write human-readable documentation saying the location is in location and the start date is in dtstart or start or start_time or whatever you decided to call it.

If you do that, you're rewriting RFC 5545. Again, JSON has some significant advantages over the text/calendar media type, but this is the price you pay for using a simple data format instead of a rich media type.


One more example: RFC 5023, the human-readable documentation for the Atom Publishing Protocol.

This is an AtomPub service document, and if you get this content type, application/atomsvc+xml, you know what's going on in the document. You know that if you see this collection tag and then a title tag, then you're looking at an Atom collection that has a certain title. But there's more! You know things about resources you haven't even requested yet!
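
Here's roughly what such a document looks like, trimmed from the example in RFC 5023:

    <?xml version="1.0" encoding="utf-8"?>
    <service xmlns="http://www.w3.org/2007/app"
             xmlns:atom="http://www.w3.org/2005/Atom">
     <workspace>
      <atom:title>Main Site</atom:title>
      <collection href="http://example.org/blog/main">
       <atom:title>My Blog Entries</atom:title>
       <categories href="http://example.com/cats/forMain.cats"/>
      </collection>
     </workspace>
    </service>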


You know that the resource at http://example.org/blog/main will behave like an AtomPub collection. You know that if you have the right credentials, you can POST to it. You can send it an Atom entry, and it will be added to this collection.

You know that the resource http://example.com/cats/forMain.cats is an AtomPub category document. You don't even have to GET it to know this. By definition, according to the RFC, the href attribute of a categories tag is the URL of a category document.


If a document supports any of the H-factors, the media type definition is where the hypermedia controls are defined. The hypermedia part of the definition describes what HTTP requests might spring out of a document if you read it and decide to take some action.


Number five: profiles.

Profiles add semantics on top of a media type. This is a JSON Schema profile for calendaring events.
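
A fragment of the idea; this is a sketch, not the actual schema from the slide:

    {
     "description": "A calendaring event",
     "type": "object",
     "properties": {
      "dtstart": {
       "type": "string",
       "description": "Event starting time"
      },
      "location": {
       "type": "string",
       "description": "Event location"
      }
     }
    }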

So if you had something like that JSON I showed you earlier, with ad hoc names for the parts of a calendaring event, and it conforms to this schema, you could lay the schema on top of it and make some kind of sense of it.

So now you know dtstart is the "Event starting time".

Profiles work like mix-ins. They are useful in the same places mix-ins are useful. You can add a profile to a general hypermedia type like HTML, just to take advantage of its H factors, without having to define a whole new media type. Or you can combine data from many domains into a single document.

And this is just one type of profile. This one happens to be written in JSON. You could write a profile in English, so long as you had a machine-readable way of signalling that you were using that particular profile. There are a lot of different possibilities.


Another summary slide.

Now I want to talk about the status quo. At my last programming job, I integrated our dataset with these three popular websites. As far as I was concerned at the time, all three of these websites do exactly the same thing: they let me take some text the user wants posted, and post it to the user's account on the other site.

These three web services are very, very different. They all use HTTP, they all use OAuth, and two of them use XML, but they're different. I could not reuse any code other than the HTTP library, the OAuth library, and the XML parser.

But as far as I cared at the time, all three of these sites were exactly the same.


I'm going to talk about an analogous situation. This is something I complained about in RESTful Web Services. Back in 2006, Flickr and Google had web services, and on top of that they had systems, effectively identical systems, for allowing the end user to delegate control of their account on Google or Flickr to a third-party app.

So, your camera phone may have an app that uploads pictures to Flickr. You want the phone to be able to upload pictures to Flickr, but you don't trust it with your Flickr password. So you delegate some authority.

Google's and Flickr's systems were technically identical, but superficially completely different. Google called theirs "AuthSub". I don't think Flickr called their system "get-frob", but that's the only thing I remember about it. Google would never call anything a 'frob'.

There was no compatibility. Well, who cares? Who uses both Flickr and Google? It's kind of a random use case. Just write the code twice. It's not a big deal.


But then some companies got together and came up with OAuth. OAuth is technically the same as AuthSub and get-frob. The advantage is that it's written down in human-readable documentation, and everyone has kind of agreed that it's a good solution and that's generally what we're going to use going forward.

Well-supported client libraries for OAuth happened because there was a standard. All these companies that came on the scene afterwards used OAuth, instead of inventing their own technically equivalent but differently named system, because there is general agreement that OAuth is good enough.


With that in mind, I want to show you some recent headlines from ProgrammableWeb. "71 eCommerce APIs", "78 Hosting APIs", "123 Database APIs", "53 microblogging APIs".

Now, in some sense, this is very good news. If there are 71 e-commerce companies, there should be 71 e-commerce APIs. That's good business. But I did some spot checks and I'm willing to bet these are literally 71 different APIs.

71 different internal data models exposed to the world; 71 companies talking about the same concepts of products and shopping carts and payments and payment methods in slightly different language; each API documented with human-readable documentation that somebody has to understand if they want to hook into that one particular service.

What a huge waste of everyone's time.

Do we need 123 database APIs? Outside the web, we have about two database APIs: SQL and NoSQL.

Do we need 53 ways of posting a tiny chunk of content to a little blog-like thing that keeps track of tiny chunks of content? No. Focus on what differentiates you from the competition.

Again, I only did spot checks, but I'm pretty sure this is largely accurate.


Imagine if instead of 123 database APIs, we had twenty. Try to think of twenty distinct ways of accessing a database over the web, twenty different techniques that could conceivably compete with each other.

Imagine if we could just get down to twenty standards, each with its own community and its own fans who will defend it to the death and its own well-supported set of client libraries.

That would be very wasteful, but it would be six times better than having 123 one-offs.

Think about how useful OAuth is, even though nobody uses every single API that uses OAuth. The fact that it exists, has a name, has documentation, and is generally agreed to be a good solution is enough to stop people from coming up with more stuff that is almost the same.


Now that we're in this situation, I'm going to suggest a few things we could conceivably do to get out.

This is a quote I gave in 2007:

The big question in my mind is whether architectures consciously designed with REST in mind will win over architectures that are simple but only intermittently RESTful.

Well, that happened. The winning architecture was not REST, it was XMLHttpRequest.


These are the lessons I've drawn from that fact. The big one is that JSON has unquestionably won the representation war. If you give the programming public at large a data format that's not JSON, they're not really going to be interested.

Neither JSON nor XML has native hypermedia affordances, but JSON without hypermedia means you get 123 database APIs, because everyone is just pushing out their internal data model.


So, things you can do. There are a lot of standards in the works that try to add some sort of linking capability to JSON. Basically they have the function of making JSON's hypermedia factors diagram look like this, which is not much, but it is literally better than nothing.

My idea, which you can take or leave, is to sort of copy the architecture of modern web applications. Serve HTML when you need to explain a complicated HTTP request someone might make in the future, and serve JSON with simple links otherwise.
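
A response in that style might look something like this; the names are hypothetical:

    {
     "title": "My Blog Entries",
     "entries": ["..."],
     "links": [
      {"rel": "next", "href": "http://example.org/blog/main?page=2"}
     ]
    }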


This one's kind of wacky: go all-in on code-on-demand. You want to write a native-language client library? Go ahead and write one, but don't have an install link anywhere. Make the client library the actual web service. Anyone who wants to use the web service downloads the library at runtime and runs it in a sandbox.

You could use JavaScript for this, of course. That's probably the best solution. You could also use something like Lua, which handles being in a sandbox well. You could use any interpreted language, as long as you can sandbox it to deal with the security concerns.

I think this is a stupid idea, but I have been wrong about this stuff before. Maybe it's actually a great idea.


Number three is the big-money solution: OData. This is basically AtomPub with a JSON representation. One problem with AtomPub, one reason it's not used much anymore, is that it only serves XML. OData gives you the choice of XML or JSON.

As I said, this has big money behind it. Microsoft and IBM are both working on it. I really want to like OData. The problem is, it's not just AtomPub. It comes with a whole lot of other stuff. Some of the other stuff is very cool-looking, like additional relationships between resources. Some of it looks very suspicious, like the schema definition language.

The latter makes me think OData is designed by people who, unlike the general programming public, kind of have this nostalgia for the days of SOAP. They understand REST and they like REST, they just wish it were as complicated as SOAP.

If I could offer some friendly advice to the OData developers, I would really like to see this split into a bunch of optional component standards, so that people can pick and choose. I would really like to see just a basic AtomPub that serves JSON.


I know that Mike's solution, Collection+JSON, is very similar to "AtomPub that serves JSON". It's probably close enough to what you want to do.

Of course, you want to deal with your specific domain, so you can write a little profile explaining the semantics of what you're putting in the JSON dictionaries. It's close enough.


Another unconventional technique is to copy someone else. Just pick a big company's API and copy it. Make your API look like Facebook Graph or GData. You won't make the problem worse. Your users will be able to reuse some code.


Here's another wild idea I have. This is actually kind of similar to what Subbu [Allamaraju] is doing: go freelance.

You can create an HTTP intermediary that takes advantage of the fact that you don't need all 53 of these microblogging APIs. You don't need 71 e-commerce APIs. You can provide three or four and just hide the useless differences between all these APIs.


Another summary slide. We're into the home stretch.


Epilogue: How to Agree on Instructions

One reason for the status quo is that standardization is very boring. It is also very hubristic. Someone who writes a standards document has to think that they have solved the problem for everybody.

Solving the problem for everybody means working with your competition. There's always the risk that you will accidentally commodify your own product and put yourself out of business. It's not easy. I want to just talk about some really simple first steps that won't threaten anybody's business.


Here's a bunch of things that are all the same kind of thing: an entity and a collection of entities. Each of these relationships could be modeled with an AtomPub collection, but clearly telling people to use AtomPub does not work.

Let's just get some little sub-standards for all this stuff that people can follow and feel like they're doing the right thing. One guy who's working on this is Duncan Cragg. He has a project called The Object Network, which is defining these little profiles for common things like events.


Here's another strategy. We're in this business for the long haul. We don't have to standardize everything at once. You're going to have multiple chances over the course of your career to iteratively make things a little better every time.

The process—and this is what I follow whenever I do programming for money—is I look for bits of the problem that other people have solved and I adopt their solutions. I try to solve what's left in a way that the people who come after me, possibly myself in the future, can reuse.

I write up instructions that look like a standards document.

Maybe they're not submitted to a standards body, but they look reusable. Somebody can come along, they can follow these instructions, and they can feel like somebody has been through here before and blazed the trail.


This is my advice: just start reusing code on the client side. Start copying people's good ideas. Start refactoring the differences that don't matter by creating wrappers. And above all, start putting links in your damn representations instead of not doing that.

There's this whole other discussion we could have about hypermedia forms. I honestly don't think we're in a place to offer good advice. I think we are still in the experimental stage when it comes to anything other than basic, "this thing has some relationship to this other thing," so let's get that nailed down and leave more complicated things to the land of experiment.

This is my personal motto from the great Johnny Cash: "I built it one piece at a time."


I want to close by talking about progress that has already been made. In 2007, RESTful Web Services came out, and there's a whole chapter in that book on Amazon S3. We chose Amazon S3 as an example because in 2006, that was the most RESTful publicly available web service that we could find. It used PUT and DELETE. That was a big deal.

In 2012, Amazon S3 is a terrible example because it doesn't have any hypermedia links.

In 2007, we were very insecure. SOAP had all the big money behind it. We were very defensive when we wrote RESTful Web Services. We had something to prove.

Things are a lot different now. People are still making bad design decisions, but these people want to make the right decisions. They are willing to listen.

Because of this, REST is a hot and sometimes undeserved buzzword. We have shifted the discussion enough in the past few years that we can now start demonstrating the benefits of hypermedia.

