We like our tools to be lightweight. So what’s with all the bloated XML in IBM’s Jazz?

In other industries, being “heavyweight” is a good thing. Heavyweight means stable, tested, grounded, and reliable. But in software, “lightweight” is the new goal—svelte, fast, efficient, easy-to-learn, and elegant. Products like Spring, GlassFish, and Adobe Air pride themselves on being lightweight. They offer just what you need and no more. Who wouldn’t want that?

One lightweight technology gaining converts is JavaScript Object Notation, or JSON. It’s a data language similar to XML: both are self-describing, human-readable, hierarchical formats. So an XML packet like this:

  <customer>
    <name>Grover Nixon</name>
    <address>
      <line1>512 Elm</line1>
      <line2>Apt. 3B</line2>
      <city>Kennebec</city>
      <state>SD</state>
    </address>
  </customer>

is equivalent to the JSON:

  {
    "customer": {
      "name": "Grover Nixon",
      "address": {
        "line1": "512 Elm",
        "line2": "Apt. 3B",
        "city": "Kennebec",
        "state": "SD"
      }
    }
  }

The JSON has less structural markup. And though modern web browsers understand both XML and JSON, they process JSON much faster because they use the already-optimized JavaScript engine. Passing JSON is like passing partially parsed data. Once I started using it in the browser and saw how fast it was, I dropped XML like a hot potato.
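Here is a rough sketch of what “partially parsed” means in practice. It is written in Python rather than browser JavaScript (my choice for illustration, not what the browser runs), and the two helper functions are hypothetical names. It shows only the shape of the work, not the speed: the JSON text becomes nested dictionaries in one call, while the XML text becomes an element tree that still has to be searched.

```python
import json
import xml.etree.ElementTree as ET

XML_PACKET = """
<customer>
  <name>Grover Nixon</name>
  <address>
    <line1>512 Elm</line1>
    <line2>Apt. 3B</line2>
    <city>Kennebec</city>
    <state>SD</state>
  </address>
</customer>
"""

JSON_PACKET = """
{
  "customer": {
    "name": "Grover Nixon",
    "address": {
      "line1": "512 Elm",
      "line2": "Apt. 3B",
      "city": "Kennebec",
      "state": "SD"
    }
  }
}
"""


def city_from_json(text):
    # One call yields nested dicts; fields are plain key lookups.
    data = json.loads(text)
    return data["customer"]["address"]["city"]


def city_from_xml(text):
    # The parser returns an element tree that must be searched by path.
    root = ET.fromstring(text.strip())
    return root.findtext("address/city")


print(city_from_json(JSON_PACKET))
print(city_from_xml(XML_PACKET))
```

Both functions return the same city; the difference is how much tree-walking the caller has to spell out.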

So I wasn’t surprised to see JSON used in IBM’s Jazz architecture. Jazz is an open source server-based platform providing integration between source control, task management, bug reporting, and communication. Think of it as Eclipse for the server. By building a solid base product, making it infinitely extensible, and open sourcing all the code, IBM creates an environment that users flock to, write extensions for, and embrace with gusto. 

I’m pretty persnickety about software. I’m not content to kick the tires—I need to dig in and read code and look at the protocols that flow over the wire. If you find elegance there, you find it everywhere in a product. So that’s how I approached Jazz. I fired up the server, opened the admin page in my Firefox browser, opened Firebug, and watched the net conversation:

 {
   "soapenv:Body": {
     "response": {
       ...
       "method": "getQuery",
       "interface":
         "com.ibm.team.workitem.common.internal.rest.IQueryRestService",
       "_eQualifiedClassName":
         "http://com/ibm/team/core/services.ecore:Response"
     },
     "_eQualifiedClassName":
       "http://schemas.xmlsoap.org/soap/envelope/:Body"
   },
   "_eQualifiedClassName":
     "http://schemas.xmlsoap.org/soap/envelope/:Envelope"
 }

If you’ve seen enough SOA stuff in your life, this should look somewhat familiar. It’s JSON all right, but it looks like a SOAP packet. And don’t these _eQualifiedClassName values look a lot like XML namespaces? Here was evidence that some heavyweight XML processing was going on in the Jazz server.
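To see the namespace resemblance, here is a small Python sketch that walks a trimmed copy of the packet above (the elided fields and the interface entry are left out) and splits each _eQualifiedClassName value into a namespace-like part and a local name. The split rule, everything before the last colon versus everything after it, is my guess from looking at these three values, not a documented Jazz format, and both function names are made up.

```python
import json

# Trimmed copy of the captured packet; elided fields are simply omitted.
PACKET = json.loads("""
{
  "soapenv:Body": {
    "response": {
      "method": "getQuery",
      "_eQualifiedClassName": "http://com/ibm/team/core/services.ecore:Response"
    },
    "_eQualifiedClassName": "http://schemas.xmlsoap.org/soap/envelope/:Body"
  },
  "_eQualifiedClassName": "http://schemas.xmlsoap.org/soap/envelope/:Envelope"
}
""")


def split_qualified_name(value):
    # Assumption: the text after the last colon is the local name and
    # everything before it plays the role of a namespace URI.
    namespace, _, local_name = value.rpartition(":")
    return namespace, local_name


def collect_class_names(node, found=None):
    # Walk the nested dicts depth-first and gather every marker value.
    if found is None:
        found = []
    if isinstance(node, dict):
        marker = node.get("_eQualifiedClassName")
        if marker is not None:
            found.append(split_qualified_name(marker))
        for value in node.values():
            collect_class_names(value, found)
    return found


for namespace, local_name in collect_class_names(PACKET):
    print(local_name, "in", namespace)
```

Read this way, the Envelope and Body entries share the SOAP envelope URI, just as they would share a namespace in an XML SOAP message.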

“Whoa!” I thought. Combining the heavyweight with the lightweight—isn’t that like sucking a Nerf ball through a straw? Rather than defining some svelte new JSON-based protocol, they simply borrowed web service protocols and converted them to JSON for the last 10 feet—that is, from the web server to the browser. The idea seemed crazy at first. We all know SOAP and XML are bloated, slow, and committee-driven, right? Wouldn’t it be faster, easier, better to use JSON across the entire data transfer spectrum? Since IBM built Jazz from the ground up, they could’ve easily done that, right?

Data Integration Is Hard

But here’s the thing. You might have a better architecture in the end, but not a lot better. It might be faster, but not a lot faster. When you go down the JSON route, you run into the same issues that XML faced 10 years ago:

  • Mixing data from two different sources into one JSON packet can cause element labels to bump into each other. Mix up a packing slip and an invoice, and suddenly the From address may mean something quite different. That’s why XML has namespaces.

  • Converting between different JSON structures would require writing mundane code. A more declarative way to map data would make the job easier. That’s why XML has XSLT.

  • Describing a JSON packet’s structure—its fields, data types, etc.—is necessary in order for people to hook into your services. It’s essential to have a metadata language for this. That’s why XML has Schemas.

  • Carrying on two simultaneous client-server conversations takes care. If you ask the server two questions and get one answer back, how do you know what question it answers? That’s why XML has WS-Addressing, whose message IDs tie each reply to its request.

...and so on. The bottom line is this: data integration is difficult. XML is a rich and complex format because it has to be. You can think of it as Robert’s Rules of Order for computer data. Because you can’t govern people’s values and upbringing, the rules must take into account different cultures, sources, and styles of communication that are all valid. So it is with XML. 
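The first of those problems is easy to reproduce. In this Python sketch the packing slip and invoice records are invented for illustration, and the source-prefix scheme is only a stand-in for what XML namespaces do, not a JSON standard. A naive merge silently loses one of the two From addresses; qualifying each key by its source keeps both.

```python
# Made-up records: both use the label "from", but mean different things.
packing_slip = {"from": "Warehouse 7, Kennebec SD", "items": 3}
invoice = {"from": "Accounts Receivable, Syracuse NY", "total": 120.50}


def naive_merge(a, b):
    # Later keys silently overwrite earlier ones with the same label.
    merged = dict(a)
    merged.update(b)
    return merged


def namespaced_merge(sources):
    # Prefix every key with the name of its source, much as an XML
    # namespace prefix qualifies an element name.
    merged = {}
    for prefix, record in sources.items():
        for key, value in record.items():
            merged[f"{prefix}:{key}"] = value
    return merged


print(naive_merge(packing_slip, invoice))
print(namespaced_merge({"slip": packing_slip, "inv": invoice}))
```

The second function is the beginning of a homegrown namespace mechanism, which is the point: you end up rebuilding what XML already specifies.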

Mixing data integration into your browser-based app makes it unwieldy. Slicing, dicing, combining, transforming and aggregating data... that’s just not something JavaScript does well. Plus it detracts from the real purpose of your browser app—display and input. So by delegating this job to an XML and SOA infrastructure, you can express your own apps more succinctly.

Take a concrete example: the Google Maps API. To use it, you must obtain an application key from Google and use it in each of your requests. You can do this directly in JavaScript, but then you end up coding this key into your application where people can see it. Furthermore, now you have the same key hardcoded in many different application scripts. This is a clear violation of the DRY rule: Don’t Repeat Yourself.

But an SOA server is good at that stuff. You write a proxy there for Google Maps, so your app can send its Google Maps-bound data as JSON to this proxy, which converts it from JSON to XML, adds the key to each request, and sends it off. In most SOA servers, you can do this with zero code. The magic is performed in XML and XSLT. Why write a JSON mechanism for the same thing?
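For readers who want to see the moving parts, here is what such a proxy step amounts to, written out by hand in Python. This is only a sketch of the idea: an SOA server would do it declaratively, the request and key element names are invented rather than taken from Google’s actual request format, and the key value is a placeholder. Nothing is sent over the network.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical: the key lives in one place on the server, never in page scripts.
API_KEY = "EXAMPLE-KEY-NOT-REAL"


def json_to_xml_request(json_text, api_key=API_KEY):
    # Parse the browser's JSON payload into a flat dict of parameters.
    params = json.loads(json_text)
    # Build the outgoing XML, adding the server-side key first.
    root = ET.Element("request")
    ET.SubElement(root, "key").text = api_key
    for name, value in params.items():
        ET.SubElement(root, name).text = str(value)
    return ET.tostring(root, encoding="unicode")


print(json_to_xml_request('{"address": "512 Elm, Kennebec, SD"}'))
```

The browser script never sees the key, and there is exactly one place to change it.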

Discovered Elegance

XML universality adds other benefits. In Jazz, for example, I can examine packets across the wire as I’m doing GUI-based tasks, then replay those web service calls from any web service language. Instant macros! And not just macros but macros that can be called from any language that speaks Web Services... which is to say, any language. If you need an extremely high-performance macro, you can just write it in C++. 

“That’s nice,” you might counter, “but it’s still heavyweight.”  But even that’s less true nowadays. XML processing is sped up with products like Intel’s SOA Expressway, which does parsing way down in machine language. On top of that, you can build a good SOA infrastructure, which doesn’t necessarily mean a large one. Of course there are huge SOA suites with ga-gillions of features and other-worldly price tags. But there are solid smaller suites too, and many open source options: WSO2, Mule, and GlassFish, to name a few. These products take away a lot of the mundane XML tasks you’d do in code.   

So in the final analysis, I think Jazz has the right idea. Use the right tool for the right job. Mold all your data with XML, and give it to the browser in easy-to-digest JSON. A well-balanced, harmonious assembly line... that’s a very elegant approach.

Craig Riecke is a Dojo committer and a writer and editor for the Book of Dojo, Dojo’s online documentation. Holding a BA in English and an MS in Computer Science from the University of Nebraska, he is currently Chief Software Architect for CXtec in Syracuse, NY. While programming he listens to old, scratchy blues music on his iPod. His motto is “I’d rather drink muddy water and sleep in a hollow log than write a redundant line of code.”