Tuesday, December 22, 2009

The Coming Overthrow of XML - Orderly Makes Further Strides


My feeling that XML is due to be dethroned grows stronger by the month.

A quick recap of recent history:

First, JSON offered a simpler data structure than the angle bracketed format of Unicode XML. But that still lacked rigour around data definition, so even though XML suffered the confusion of having at least three competing schema definition languages (XML Schema, RelaxNG and Schematron), the world did have a way to specify data types, formats and constraints with XML that JSON could not match.

Then, thanks to Kris Zyp, JSON Schema appeared on the scene and plugged the rigour gap. JSON Schema parsers now exist for a number of languages including Java. One of the design decisions of JSON Schema was for a schema document itself to be valid JSON, much as XML Schema is itself valid XML. Unfortunately, this meant that brevity wasn't JSON Schema's strong point because the JSON way of expressing properties is necessarily long-winded.

The third shoe has dropped now (never mind the grotesque image that conjures up of the wearer). Orderly is a new schema language developed by Lloyd Hilaiel that is far more compact than JSON Schema and yet round-trips to JSON Schema quite effectively. [There's a really cool Ajax-y screen that converts back and forth between Orderly and JSON Schema before your eyes, so you can tweak either code to see how it looks in the other representation.]

Bottomline: SOA architects can recommend the use of the simpler JSON data format instead of XML without having to worry about the lack of rigour in data definition. Data architects, designers and developers can use Orderly to design schemas without bothering about JSON Schema's cumbersome syntax. JSON parsers can work with the equivalent JSON Schema to validate a piece of JSON data without the need to understand two different syntaxes.

A great solution, and it's all come together quite nicely in time for Christmas. Thanks to Lloyd (and Kris before him) for a wonderful Christmas present to all SOA practitioners, and ultimately, everyone wrestling with XML in any capacity.

The Meaning of Open - By Google


A friend sent me this link to a piece written by Jonathan Rosenberg of Google on the meaning of the term "open". This is old hat to those of us who have already seen the light, of course ;-). [Rosenberg's calisthenics when he then tries to justify the closed bits of the Google ecosystem are quite amusing.] But to people who have not given much thought to openness and tend to follow the herd on technology (the bigger the brand name, the better), this open letter may hold many eye-opening insights (all puns intended).

I would perhaps have said what Rosenberg did in half the length, but brevity isn't a necessary quality of openness, so I'll forgive him :-). The main danger of the unnecessary length is it may just cause some of the audience to stop reading before the end, when Rosenberg delivers his most inspired paragraph:

Open will win. It will win on the Internet and will then cascade across many walks of life: The future of government is transparency. The future of commerce is information symmetry. The future of culture is freedom. The future of science and medicine is collaboration. The future of entertainment is participation. Each of these futures depends on an open Internet.
Amen to that. In fact, that's worth repeating in a more structured form:

The future of government is transparency.
The future of commerce is information symmetry.
The future of culture is freedom.
The future of science and medicine is collaboration.
The future of entertainment is participation.

I would like to analyse these in greater detail and add to/modify the list, because I'm sure this is incomplete.

For now, this is a document that is worth circulating to our brand name-dazzled colleagues. After all, Google is one of the biggest brands out there, so if Google is endorsing openness, there must be something in it ;-).

Now if only IBM would come out with an OpenTM line of products, we would be willing to write a cheque...

Friday, December 11, 2009

Is Canonical Trying to Purge Ubuntu of the L-word?


I don't think much of revisionist history, and biting the hand that feeds isn't an endearing trait.

Visiting the Ubuntu site today after a while, I was unpleasantly surprised that I couldn't see the word "Linux" anywhere. After trawling the site exhaustively, I did find two or three references, and I leave it as an exercise for the reader to find more. Warning: You'll have to search really hard.

Under "What is Ubuntu?", the site says, "Ubuntu is a community developed operating system that is perfect for laptops, desktops and servers".

What's up, guys? Does it hurt a lot to use the phrase "based on Linux" somewhere in that sentence?

Canonical and Ubuntu, great as their contributions have been, would be nowhere without Linux, especially the Debian distribution. So why not acknowledge that debt? Why try to pass Ubuntu off to newbies as a completely original operating system with no ties to Linux?

On the same page, right at the bottom, there's a section titled "What does Ubuntu mean?" and it goes on to explain, "Ubuntu is an African word meaning 'Humanity to others', or 'I am what I am because of who we all are'."

How apt. Dear Canonical, why not show some Ubuntu (humanity to others) and acknowledge that you are who you are because of what Linux is?

Thursday, December 03, 2009

Advice To Organisations Embarking On SOA Today


I have been involved with SOA in various roles over the last two to three years, and my thinking has evolved a fair bit over this period. If I was asked to advise an organisation embarking on a major SOA initiative today, I would probably say this to them:

1. The End Goal: Remember that SOA is not about integration but about inherent interoperability. Think health, not medication. SOA is about raising the capability of your systems such that they can easily and inexpensively integrate with others, not about introducing a new, slick technology that will connect your systems together more easily. Simplifying the components of your enterprise and making them easy to understand and connect to will give you SOA. So keep simplicity at the back of your mind all the time, and don't confuse it with expediency, which is the path of least resistance. Simplification could take some effort.

2. Domain Models: Don't waste too much time searching for "the" canonical data model. Most off-the-shelf ones are too high-level and abstract to be useful. And building your own comprehensive dictionary is wasteful and time-consuming. Instead, identify logical owners of different elements of data and let them own the data dictionary for those elements. All services exposed out of these logical domains should use these definitions and it is the responsibility of service consumers from other domains to understand them. Processes that combine services crossing such domains should perform their own mapping between similar data elements. It's necessarily messy and plenty of out-of-band communication will be required at design time. After all, even similarly-named elements may suffer from subtle interpretation errors, so manual discussion and clarification will always be part of a service or process design.

This is not as bad as it sounds because only a subset of data elements managed by a domain is exposed through its service interface, and it's only these that may need translation to the context of their consumers. Don't look to do away with this manual effort through a single canonical data model. That's a wild goose chase, so don't even start.

3. Infrastructure and Connectivity: Try and avoid using message queues unless you're looking at low latency situations like a trading floor, or if there's simply no other way to interface to a legacy system. The queuing paradigm introduces various coordination issues into application design, and implementing message queues requires establishing more complex patterns to solve these gratuitous problems. [I have a larger, philosophical argument about the need to innovate an application protocol on top of an asynchronous, peer-to-peer transport, but let's not confuse the current set of recommendations with that idea.]

In today's world, HTTP-based communication patterns backed up by databases will often do the trick more simply than expensive message queues. Look beyond the apparent need for reliable message delivery. Often, an idempotent operation will suffice to meet the real requirement, and this is quite a standard pattern to implement. Often, queues are used in synchronous (blocking) patterns anyway (to avoid the coordination problems I talked about), so nothing is being gained in an architectural sense by the use of queues. And even asynchronous communications, where required, can be implemented in standard ways over HTTP, so HTTP is quite a universal protocol to use as the logical infrastructural element for your SOA.

ESBs, Service Directories and other "governance" components are often only required to manage the complexity that they themselves introduce. It's amazing what you can achieve with a farm of simple web servers and a database, and still keep things simple and understandable.

4. Service Enablement: Try and avoid the entire SOAP/WS-* stack of technologies if you can. There is a significant complexity overhead associated with this set of technologies, and you will need an expensive toolset to (partially) simplify its use. Look seriously at REST instead. Even though REST advocates don't make the case strongly enough (and sometimes see SOA as an antithetical philosophy), REST is in fact a valid way to do SOA and can usually help to deliver solutions at much lower cost and complexity. The hard part about doing REST is finding good people who can think that way. REST is subtly different from the SOAP/WS-* approach, even though they may just look like different kinds of plumbing to move XML documents around (and I confess that's the way I initially sell REST to corporate skeptics brought up on a diet of vendor-provided Web Services propaganda).

5. Data Contract: Consider alternatives to XML for the data contract. Though this sounds like heresy, XML is heavyweight and cumbersome, and XML manipulation tools in high-level languages (with the possible exception of E4X in JavaScript) are clumsy to use and suffer major impedance mismatches. You will spend more time wrestling with XML than on the service itself. Although many in the web world will immediately recommend JSON, raw JSON is not sufficient to ensure data integrity, because it has hitherto lacked a strong schema definition capability. Maintain a watching brief on the JSON Schema proposal, submitted for approval as an IETF standard. Already, there are JSON Schema libraries in many high-level languages such as Java. It should be possible to define data contracts with as much rigour as with XML, but at a much lower level of complexity. A newer and more compact JSON Schema representation called Orderly is also maturing, which makes this approach simple as well as easy.

So instead of going down the XML rabbit-hole, start with JSON anyway, and incorporate JSON Schema/Orderly as it matures. You will find this works especially well in combination with REST. A quick Proof-of-Concept may convince the skeptics in your organisation (although the opposite result may also occur, with many going away convinced by the speed of this approach that it's either simplistic or too good to be true!)

6. Web Service Implementation: If you're trapped by circumstances into an XML-and-SOAP/WS-* approach, look at the WSO2 suite of commercially-supported Open Source products. Especially look at the WSO2 Mashup Server. Don't be fooled by the name. It's more than just a mashup server. It's a service orchestration engine that (curiously) uses server-side JavaScript as its programming language. The major advantage of JavaScript is the ability to use the E4X library to perform extremely straightforward XML manipulation. Once you use E4X, you will never go back to JAXB or any other XML-processing library. WSO2 Mashup Server allows SOAP or REST services to be consumed, combined and orchestrated, and in turn exposes SOAP or REST services. It's a good way to hedge your bets if you're only half-convinced about REST. The WSO2 suite is also much less expensive than its proprietary rivals, although the real expense is in the heavyweight approach that it unfortunately shares with them.

7. The Paradox: SOA is really all about simplicity, but it's hard to find SOA architects who seek to simplify systems. Conventional SOA practice seems to be about making integration complex through heavyweight approaches, then introducing tools to manage that complexity, tools that require specialist skills to use properly. If done the conventional way as most SOA consultants seem to agree, your SOA initiative will only leave you with additional complexity to manage.

Of course, if you're politically inclined, you will bask in the prestige of a hefty budget and a large team, and can declare victory anyway on the basis of the number of services and processes you have delivered. But if you want to be really successful at delivering SOA (i.e., making your business more agile and able to operate on a sustainably lower cost basis) while keeping your burn rate low along that journey, you would do well to look at boring, unimpressive and even anticlimactic approaches and technologies such as the ones I've listed above. Give the big vendors a wide berth. You don't need to buy technology (beyond the web servers and databases you already have). You certainly don't need to buy complex technology, which is what the vendors all want to sell you.

And don't let the lack of grandeur in this approach worry you. Complexity impresses the novice, but results are what ultimately impress all.

Postscript: Vendor REST is coming. Beware.