Tuesday, March 16, 2010

Idempotence - Reliability that Works


When uncertainty in messaging is unacceptable (e.g., a timeout on a critical transaction request), the solution usually sought is "reliable message delivery". Quite apart from its theoretical impossibility, "reliable messaging" is expensive: it needs "strong queuing" products, which cost a lot of money.

I've been a fan of a far simpler approach - idempotence. We're really not after reliable message delivery in most cases. We'd often be happy to settle for certainty, i.e., we don't mind whether a requested operation succeeded, failed, or indeed timed out before it could be attempted, as long as we know what happened. What we don't want is to be stuck, not knowing what happened and unwilling to retry the operation for fear of unwitting duplication.

Idempotence is, of course, the property that attempting something more than once has exactly the same effect as attempting it once. Idempotent operations can be blindly retried in a situation of uncertainty, because one is guaranteed that the effect of the operation will never be duplicated, however many times the request is repeated.

The trick is to identify every transaction request with a unique identifier. Provided the receiver of the message is set up to check the identifier against previously-used ones, duplication of transactions can be avoided even if requests are sent multiple times. This is extremely powerful because it allows a requesting application to simply retry the request message in situations of uncertainty until a definite response is eventually received. There is no danger of the request being acted upon more than once.
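To make the receiver's side concrete, here is a minimal sketch in Python. The Receiver class, its handle method and the balance field are hypothetical names invented for illustration, and an in-memory dictionary stands in for whatever store a real receiver would use. The point is only that a repeated identifier returns the stored outcome instead of applying the operation again.

```python
import uuid


class Receiver:
    """Hypothetical receiver that remembers the outcome of every request ID."""

    def __init__(self):
        self.balance = 0
        self.outcomes = {}  # request_id -> response already returned

    def handle(self, request_id, amount):
        # A known ID means this is a retry: replay the stored response
        # and do not apply the operation a second time.
        if request_id in self.outcomes:
            return self.outcomes[request_id]
        self.balance += amount
        response = {"status": "ok", "balance": self.balance}
        self.outcomes[request_id] = response
        return response


receiver = Receiver()
request_id = str(uuid.uuid4())             # one identifier per transaction request
first = receiver.handle(request_id, 100)
second = receiver.handle(request_id, 100)  # the same request, sent again
```

Both calls return the same response, and the balance changes only once. A genuinely new transaction would carry a new identifier and be applied normally.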

Reliability is then reduced to an endpoint-based protocol. It does not require any special capabilities on the part of the transport. In fact, the transport can afford to be quite unreliable. Idempotence allows reliable messaging solutions to be built (and quite cheaply at that) on top of unreliable components!
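The following Python sketch shows the endpoint-based protocol under stated assumptions: LossyTransport is a made-up simulation that randomly drops requests or responses, and make_receiver and send_until_answered are hypothetical helpers, not part of any real library. The sender picks its identifier once and simply keeps resending until it gets a definite answer.

```python
import random
import uuid


class LossyTransport:
    """Simulated transport (an assumption) that drops requests or responses."""

    def __init__(self, receive, loss_rate, seed=0):
        self.receive = receive
        self.loss_rate = loss_rate
        self.rng = random.Random(seed)

    def send(self, request_id, amount):
        if self.rng.random() < self.loss_rate:
            raise TimeoutError("request lost")
        response = self.receive(request_id, amount)
        if self.rng.random() < self.loss_rate:
            # The work was done, but the sender cannot tell.
            raise TimeoutError("response lost")
        return response


def make_receiver():
    state = {"balance": 0, "seen": {}}

    def receive(request_id, amount):
        # Apply the operation only the first time this ID is seen.
        if request_id not in state["seen"]:
            state["balance"] += amount
            state["seen"][request_id] = state["balance"]
        return state["seen"][request_id]

    return receive, state


def send_until_answered(transport, amount, max_attempts=1000):
    request_id = str(uuid.uuid4())  # chosen once, reused on every retry
    for attempt in range(1, max_attempts + 1):
        try:
            return transport.send(request_id, amount), attempt
        except TimeoutError:
            continue  # uncertain outcome: retrying is safe
    raise RuntimeError("no definite response received")


receive, state = make_receiver()
transport = LossyTransport(receive, loss_rate=0.5)
result, attempts = send_until_answered(transport, 100)
```

However many attempts the lossy transport forces, the receiver's balance ends up at 100, because every retry carries the same identifier. A real sender would also wait between attempts and give up after some limit, which this sketch reduces to a simple attempt cap.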

Here's a one-page document that illustrates the concept.


This should make it clear that we don't need strong queuing or "reliable message delivery" to eliminate uncertainty. A plain web server, a database and a system of one-time tokens (UUIDs, for example) can solve the problem.
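As a rough sketch of the database part of that recipe, the Python below uses the standard sqlite3 module with an in-memory database. The payments table, its columns and the issue_token and apply_payment functions are assumptions made for illustration. The one-time token is the primary key, so the database itself refuses a second use of the same token.

```python
import sqlite3
import uuid

db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE payments (token TEXT PRIMARY KEY, amount INTEGER NOT NULL)"
)


def issue_token():
    # A UUID serves as the one-time token here (an assumption).
    return str(uuid.uuid4())


def apply_payment(token, amount):
    """Return True if applied now, False if the token was already used."""
    try:
        with db:  # commits on success, rolls back on error
            db.execute(
                "INSERT INTO payments (token, amount) VALUES (?, ?)",
                (token, amount),
            )
        return True
    except sqlite3.IntegrityError:
        # Primary-key violation: this token has been seen before.
        return False


token = issue_token()
results = [apply_payment(token, 100) for _ in range(3)]  # one send, two retries
total = db.execute("SELECT COALESCE(SUM(amount), 0) FROM payments").fetchone()[0]
```

Only the first attempt inserts a row. The retries are rejected by the uniqueness constraint, so the total stays at 100 no matter how often the request is repeated.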