<D <M <Y
Y> M> D>

[Comments] (8) Alternate ETag Validation Functions: Yes, months after driving away everyone who read this weblog hoping I would talk about RESTful topics, here's some REST stuff. This is an idea I got from my co-worker Björn Tillenius. I hope someone else has come up with the same idea and given it a better name.

Here's the problem, on a high level of abstraction. Consider a representation (#1):

<p id="1">Forklift</p>
<p class="read-only" id="2">Green</p>

And let's say the ETag of this representation is the string "x".

According to the protocol governing this media type, you can modify the text in any paragraph unless its class is "read-only". So maybe you can PUT a document like this (#2):

<p id="1">Hovercraft</p>
<p class="read-only" id="2">Green</p>

Or PATCH a document like this (#3):

<p id="1">Hovercraft</p>

OK, that's easy. Now suppose that the read-only text changes randomly according to conditions on the server. Let's say the read-only text suddenly changes from "Green" to "Red". If I were to GET the document again, I'd get this document (#4):

<p id="1">Forklift</p>
<p class="read-only" id="2">Red</p>

And let's say the ETag of this document is "y". If I sent a conditional GET with an If-None-Match of "x", I'd get 200 and a new representation instead of 304 ("Not Modified").

OK, but I don't send a conditional GET. I don't get the document again at all. Instead, I PUT document #2, with an If-Match of "x", and the request fails with 412 ("Precondition Failed"). Maybe it should fail anyway; maybe the server is very strict and thinks I'm trying to change a read-only paragraph from "Red" to "Green", which would probably be 400 ("Bad Request"). But we don't even get to that point because the ETags don't match.

The request also fails with 412 if I PATCH document #3 with an If-Match of "x". But there's nothing really wrong with that request. The point of If-Match in conditional writes is to avoid conflicts with other clients, and there are no other clients here. The ETag is different because a read-only paragraph changed on the server side.

One obvious solution is to calculate the ETag only from the read-write portion of the document. This fixes conditional writes, but it breaks conditional reads. A client that requests document 1 and then makes conditional requests will never get document 4. The ETag is no longer a strong validator (update: actually, it's not any kind of validator); the document might change significantly without the ETag changing. So that's no good.

The solution Björn came up with is to split the ETag into two parts. The first part is derived from the read-only portions of the document, and the second part is derived from the read-write portions. The ETag is a totally opaque string to the client, but the server knows what it means. On a conditional read, the server checks the entire ETag. On a conditional write, the server only checks the second half.

In this example, the ETag for document #1 might be "1.a" and the ETag for document #4 might be "2.a". A conditional GET of document #4 with If-None-Match="1.a" would fail, but a conditional write with If-Match="1.a" would succeed. When the write went through, the document's ETag would change to "2.b", and "1.a" would not be good for either conditional reads or writes.

From the client's perspective everything just works: your conditional read returns 200 iff the representation has changed, and your conditional write returns 412 iff someone else is messing with the resource. But is this okay from a standards perspective? Section 13.3.3 of RFC 2616 says "The only function that the HTTP/1.1 protocol defines on [ETags] is comparison." That doesn't seem to prohibit me from defining another one.

If "x" is a strong validator then so is "1.a", but the new comparison function ignores some of its information about the resource state, effectively treating it as a weak validator (update: or as something that's not a validator at all). Is that okay? Would you believe the following definition of a strong validation function? "In order to be considered equal, the second halves of both validators MUST be identical in every way, and both MUST NOT be weak." (cf 13.3.3 again)

I'm interested in your thoughts on this. Smartass comments like "you should have two resources" will not be dismissed out-of-hand but also will probably not convince me. If you're curious, here's the real-life bug that spawned this thinking.


Unless otherwise noted, all content licensed by Leonard Richardson
under a Creative Commons License.