Node.js is a toolkit for writing high-performance network servers in JavaScript. And it’s events all the way down.

In case you haven’t heard, JavaScript is now an excellent language for writing extremely fast production-ready web servers. I know, I didn’t believe it at first, either. But two important projects were started in 2009 to make that possible: CommonJS and Node.

Even though JavaScript has been around for what feels like centuries and is widely used (arguably the most used programming language on the planet), it has been mostly relegated to the safe confines of running inside web browsers. Meanwhile, a few frameworks have attempted to bring JavaScript to the server side, such as Aptana Jaxer (powered by the SpiderMonkey JavaScript interpreter) and Helma (powered by Rhino). But several deficiencies have been holding these platforms back from wider adoption among developers.

The Ecosystem

When we choose a technology to write an application, we don’t just choose the language; we also choose the libraries available for it. If a language has many useful libraries with a vibrant community around them, it’s going to be easier to write your application in less time.

All modern languages have a standard library and a healthy ecosystem of third-party libraries. Python is well-known as a “batteries included” language. And there’s a healthy ecosystem of packages contributed to the Python Package Index (“PyPI”). The same holds true for Ruby and Perl. Not so with JavaScript.

Until recently, you could take JavaScript interpreters like SpiderMonkey, V8, or JavaScriptCore off the shelf and run server-side JavaScript code with them right away. But without libraries, you couldn’t do much of anything real with them without lots of work and time.

In 2009, however, the JavaScript community realized things needed to change. On his blog, Kevin Dangoor explained that JavaScript had a social problem, not a technical one. While JavaScript is a great language, there has been a general lack of agreement on a standard library API and no common way of packaging and using external libraries between frameworks. With no common APIs, each new server-side JavaScript project had to do things in its own way, to the detriment of creating a larger JavaScript ecosystem of cross-project libraries and tools.

So Dangoor started the “ServerJS” project. The goal was to specify the APIs needed for creating a large and compatible JavaScript library ecosystem. Within a week of its launch, the ServerJS group had 224 members and 653 messages posted to its mailing list. Clearly, Dangoor had struck a nerve with developers. The project was later renamed CommonJS, to better reflect its goal of uniting all JavaScript communities, browser-side and server-side, with common APIs.

Meanwhile, also in 2009, Ryan Dahl was working on a new JavaScript framework called Node. Node is also known by the far more searchable terms Node.js and Nodejs. Node takes Google’s V8 JavaScript interpreter engine, combines it with the CommonJS library APIs, and rolls it up into a complete environment for using JavaScript outside the browser.

A third noteworthy JavaScript event happened in 2009. JavaScript-focused conferences started to appear. Chris Williams and Iterative Designs created JSConf, the first professional conference for JavaScript developers. I had the honor of presenting on Selenium, the web testing tool, at the first JSConf in April 2009, in Washington, D.C.

The Breakthrough Presentation

Though Dahl started the Node project in early 2009, Node’s big day came when he gave a presentation on Node at JSConf Berlin in November. Significant attention from web developers for Node started after that conference, and has been growing since. Across the two JSConf conferences, Dahl was the only speaker to receive a standing ovation at the end of his talk. In a room of peers on the bleeding edge of their craft, that’s exciting.

(Slides and video of Dahl’s JSConf talk are available online.)

Since JavaScript on the server has been around for years, you may be wondering what the big deal is about Node. What makes it so special? The big idea about programming with Node is that it focuses on evented I/O all the way down.

There are arguably three main programming styles for implementing high-performance servers:

  1. Using multiple processes,

  2. multiple threads, or

  3. single-threaded asynchronous events.

Node is an event-based framework and enforces a “no blocking APIs” policy throughout.

The multi-process or multi-threaded procedural style is the more common, conventional way to program in most other languages, like Java, C#, Perl, Python, Ruby, or PHP. Although event-based programming in those languages is possible, it is not the cultural norm. (Event-based frameworks do exist there: Twisted and Tornado in Python, and EventMachine in Ruby.)

Doing It with Events

Although event-based programming is an uncommonly used style in other languages, it is the preferred common style for writing browser-based JavaScript code, and Node is inheriting that cultural norm. In the browser and now on the server with Node, event-based programming is the JavaScript Way to code.

For example, here’s how the jQuery documentation explains how to make an asynchronous (aka “Ajax”) request for data.

 $.get('ajax/test.html', function(data) {
  $('.result').html(data);
  alert('Load was performed.');
 });

What does an event-based program in Node look like? Here’s a snippet of code from Dahl’s JSConf presentation:

 db.query("select..", function (result) {
  // use result
 });

In this example, a database query is made, but a callback function is attached. When the database returns results, the callback will be executed. Blocks of code are linked by events. When no database events have been triggered, the program is free to process other code and respond to other events.
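
To watch that ordering happen, here is a minimal runnable sketch. It is not from Dahl’s presentation: the db object is a hypothetical stand-in that fakes a database with a 50 ms timer, and the log array exists only to record the order in which things run.

```javascript
// Hypothetical stand-in for a database client (an assumption for this
// sketch): it answers after a short delay, the way a real query would
// answer after a network round trip.
const db = {
  query(sql, callback) {
    setTimeout(() => callback([{ id: 1, name: 'example row' }]), 50);
  },
};

// Records the order in which the pieces of the program run.
const log = [];

db.query('select..', (result) => {
  // Runs later, once the simulated database has produced its result.
  log.push('callback ran with ' + result.length + ' row(s)');
  console.log(log.join('\n'));
});

// Runs immediately: the call above returned without waiting.
log.push('query issued, program keeps going');
```

The line after the query is logged first, even though it appears later in the source, because db.query returns before its result is ready.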

Compare this to the more common procedural way this would be coded:

 var result = db.query("select * from T");
 // use result

The problem with the traditional example is that the program is blocked from doing anything else while waiting for results from the database. The traditional fix for that problem is to wrap the database call in a separate thread or process. At JSConf, Dahl explained that the event-based model is far more efficient with CPU and memory, while also scaling better. Compared to multi-process or multi-threaded programs, event-based frameworks can do more with less.
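
One way to see the difference is to issue several slow queries at once. The sketch below is an illustration under stated assumptions, not a benchmark: query is a hypothetical function that takes about 100 ms to answer, simulated with a timer. It shows only that the waits overlap on a single thread; it says nothing about CPU or memory use.

```javascript
// Hypothetical query function (an assumption for this sketch): each call
// takes about 100 ms to answer, simulated with a timer.
function query(sql, callback) {
  setTimeout(() => callback('rows for ' + sql), 100);
}

const started = Date.now();
const results = [];
let elapsedMs = null;

// Issue all three queries up front; none of them blocks the others.
['select 1', 'select 2', 'select 3'].forEach((sql) => {
  query(sql, (result) => {
    results.push(result);
    if (results.length === 3) {
      elapsedMs = Date.now() - started;
      console.log('3 queries finished in about ' + elapsedMs + ' ms');
    }
  });
});
```

Run one after another in the blocking style, the same three queries would need roughly 300 ms. Here they finish in roughly the time of the slowest one.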

Getting Started with Node

Here’s the minimal path to start playing with the latest version of Node:

 $ git clone git://github.com/ry/node.git
 $ cd node
 $ ./configure
 $ make
 $ sudo make install
 $ node-repl

Out of the box, Node.js assumes it’s running in a POSIX environment—Linux or Mac OS X. If you’re on Windows, you’ll have to install MinGW to get a POSIX-like environment.

In Node, HTTP is a first-class citizen. Node is optimized for creating HTTP servers, so most of the examples and libraries you’ll see on the internet focus on web topics (HTTP frameworks, templating libraries, etc.).

Here’s a simple “hello world” web server:

 var sys = require('sys'),
  http = require('http');
 
 server = http.createServer(function (req, res) {
  res.writeHeader(200, {'Content-Type': 'text/plain'});
  res.write('Hello World');
  res.close();
 })
 server.listen(8000);
 sys.puts('Server running at http://127.0.0.1:8000/');

Let’s examine the parts separately:

 var sys = require('sys'),
  http = require('http');

This is where the CommonJS APIs come in. The require function is the standard way to import modules. Before CommonJS, JavaScript programmers had to roll their own way of importing packages of code similar to the import statement in Python or require in Ruby.
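
The other half of the module system is how a file offers something to be imported. As a minimal sketch, assuming a current Node release, the program below writes a tiny module to a temporary directory and then loads it. The greeter.js file name and its greet function are hypothetical, invented for this illustration.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Write a tiny module to a temporary directory. The file name and its
// contents are hypothetical, made up for this illustration.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'commonjs-demo-'));
const modulePath = path.join(dir, 'greeter.js');
fs.writeFileSync(
  modulePath,
  "exports.greet = function (name) { return 'Hello, ' + name; };\n"
);

// require() runs the file once and returns its exports object.
const greeter = require(modulePath);
console.log(greeter.greet('Node'));

// A second require() of the same path returns the same cached object.
const again = require(modulePath);
console.log(again === greeter);
```

In a real project, greeter.js would simply sit next to the program and be loaded with require('./greeter'). The temporary file is here only to keep the sketch in one piece.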

 server = http.createServer(function (req, res) {
  res.writeHeader(200, {'Content-Type': 'text/plain'});
  res.write('Hello World');
  res.close();
 })

The createServer function expects to be given a callback to run every time a new request comes in.
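
The once-per-request behavior can be checked with a small self-contained sketch. It assumes a current Node release, where the response methods are named writeHead and end rather than the writeHeader and close calls shown in this article. The requestCount counter and the fetchOnce helper are hypothetical names added for the illustration.

```javascript
const http = require('http');

let requestCount = 0;
const bodies = [];

// The callback passed to createServer runs once per incoming request.
const server = http.createServer((req, res) => {
  requestCount += 1;
  res.writeHead(200, { 'Content-Type': 'text/plain' });
  res.end('Hello World');
});

// Hypothetical helper: fetch "/" from the local server, then hand the
// response body to done(). agent: false uses a one-off connection.
function fetchOnce(port, done) {
  const options = { host: '127.0.0.1', port: port, path: '/', agent: false };
  http.get(options, (res) => {
    let body = '';
    res.setEncoding('utf8');
    res.on('data', (chunk) => { body += chunk; });
    res.on('end', () => done(body));
  });
}

// Port 0 asks the operating system for any free port.
server.listen(0, '127.0.0.1', () => {
  const port = server.address().port;
  fetchOnce(port, (first) => {
    bodies.push(first);
    fetchOnce(port, (second) => {
      bodies.push(second);
      console.log('callback ran ' + requestCount + ' times');
      server.close();
    });
  });
});
```

Two requests go in, and the callback runs twice. The server then shuts itself down so the script can exit.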

Here’s how to run the example web server:

 $ node hello-world.js
 Server running at http://127.0.0.1:8000/

Even with this small amount of code, Node is fast enough to deploy as a real server. To measure server performance, Apache Bench is a great little utility for quick ad-hoc load testing. On my MacBook Pro (3.0 GHz Intel Core 2 Duo with 4GB 1067 MHz RAM), here’s how the “hello world” server performs with 10,000 requests, 4 concurrent requests at a time.

 $ ab -c 4 -n 10000 http://127.0.0.1:8000/
 ...
 Requests per second: 6560.50 [#/sec] (mean)
 ...

Granted, most benchmarks are done wrong. This one is especially perilous since it was run on a laptop, and not via a real network. However, 6560 requests per second is nothing to laugh at. V8 is a mighty fast JavaScript interpreter and Node takes it even further, to make this a compelling platform for building servers.

Okay, So What’s the Catch?

Since Node is so new, there are many little features missing that Rails and Django developers take for granted. During development, one small, yet important, feature that Node lacks is the ability to auto-restart when a change is made to the server’s source code. Other features like a step-debugger and an interactive REPL prompt are available and useful, but still have some rough edges compared to equivalent tooling in Python and Ruby.

There’s a lot of churn going on in the Node codebase, with core APIs being modified between each release. Of course, these changes are making the APIs more consistent, so I’m not complaining. The pace of development is reminiscent of the early days of Ruby on Rails—there is a ton of innovation going on, but if you don’t pay attention for a week, many things are different. It’s the best and worst thing about the project. It’s great because more eyeballs means more bugs get fixed and new features get implemented faster. But it’s bad because you have to work hard to keep up with all the changes.

Until now, writing server software meant not coding in JavaScript. Before Node, there were many better and faster alternatives. Now the game has been changed. Node levels the playing field and is a serious contender for your next web server project.

Jason Huggins co-founded Sauce Labs and currently leads product direction. Prior to Sauce Labs, Jason was a Test Engineer at Google where he supported the grid-scale “Selenium Farm” for testing Google applications such as Gmail and Google Docs. Jason’s experience also includes time at ThoughtWorks in Chicago as a software developer. While at ThoughtWorks, Jason created the Selenium testing framework out of the need to cross-browser test a new in-house time and expense system. When not programming in Python or JavaScript, Jason enjoys hacking on Arduino-based electronics projects. Jason has spent time in New York City, LA, and the Bay Area, but Chicago is his kind of town.