The JavaScript blog.


Tags: node, unix, libuv

From fs.readFile to read(2)

Posted on .

Here's a statement you might hear used to describe Node:

Node uses asynchronous I/O to avoid using threads for fast filesystem and network operations.

This statement is false (in more ways than one), and understanding why will help you better understand Node.

First, consider the question: what is asynchronous?

var fs = require('fs');

fs.readFile('file.txt', function(err, data) {
  if (err) throw err;
  // `data` is a Buffer containing the file's contents
});

That is an asynchronous API. The program will continue running after the fs.readFile call, and the callback may be run at some point in the future. As JavaScript programmers we like this because JavaScript's scoping rules mean the callback gets a closure, so we find it easy to reason about which variables are available to the callback, while still being able to do other things while something potentially slow (I/O, web requests) responds.

Even if you're not a Node programmer, you're probably comfortable with the last example. It's not amazingly different to a jQuery Ajax request, where callbacks are passed to $.ajax.

Now, assuming you're not a C/C++ programmer and don't know too much about Node's internals, here's the amazing thing about fs.readFile -- it calls a (potentially) blocking system call.

How is that possible? We know our program continues when we try to read a file, so what's going on? The first step is to look at Node's fs source.


If you look at the source for fs.readFile in lib/fs.js, you'll see binding.read. Whenever you see binding in Node's core modules you're looking at a portal into the land of C++. This binding is made available using NODE_SET_METHOD(target, "read", Read). If you know any C, you might think this is a macro -- it was originally, but it's now a function. The reference to Read here is a C++ function in src/node_file.cc.
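Although it's unsupported, you can peek through that portal yourself. The sketch below is exploration only -- process.binding is an internal API, and the exact signatures here are assumptions that vary between Node versions:

// Exploration only: process.binding('fs') is the same internal object
// lib/fs.js uses; its `read` is the Read() function from src/node_file.cc
var binding = process.binding('fs');

// Synchronous forms, as used by fs.openSync/fs.readSync (assumed signatures)
var fd = binding.open('file.txt', 0 /* O_RDONLY */, 438 /* 0666 */);
var buf = new Buffer(64);
var bytesRead = binding.read(fd, buf, 0, buf.length, null);
console.log(buf.slice(0, bytesRead).toString());
binding.close(fd);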

Like other filesystem code in Node, Read defines a synchronous and asynchronous version using ASYNC_CALL and SYNC_CALL. These are macros used to bind to libuv. If you dig into libuv, which you can find in deps/uv or on GitHub at joyent / libuv, then you'll discover something interesting: the filesystem code actually uses libuv's own streams, buffers, and native filesystem wrapping code.

Filesystem Wrapping

Going back to ASYNC_CALL in Read, one of the arguments is read: the syscall read (help: man 2 read). But wait, doesn't this function block? Yes, but that's not the end of the story. As summarised in An Introduction to libuv:

The libuv filesystem operations are different from socket operations. Socket operations use the non-blocking operations provided by the operating system. Filesystem operations use blocking functions internally, but invoke these functions in a thread pool and notify watchers registered with the event loop when application interaction is required.

Let's look a little deeper. The ASYNC_CALL macro uses FSReqWrap and calls uv_fs_ methods. The one that gets bound to for reading is uv_fs_read:

int uv_fs_read(uv_loop_t* loop, uv_fs_t* req,
               uv_file file,
               void* buf,
               size_t len,
               int64_t off,
               uv_fs_cb cb) {
  INIT(READ);
  req->file = file;
  req->buf = buf;
  req->len = len;
  req->off = off;
  POST;
}

What's that POST macro at the end? It checks to see if a callback has been provided, and if so uses uv__work_submit from src/unix/threadpool.c to run the read on a thread pool. Note that I'm only talking about Unix here, but it's interesting that following fs.readFile all the way down to the syscall that does the work leads to uv__work_submit.
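You can observe the thread pool from JavaScript without touching any C. In this sketch (which assumes a file.txt exists in the current directory), each read is submitted to the pool, so the callbacks are free to complete out of order:

var fs = require('fs');

// Each readFile is handed to libuv's thread pool, where a worker
// thread makes the blocking read(2) call
for (var i = 0; i < 8; i++) {
  (function(n) {
    var start = Date.now();
    fs.readFile('file.txt', function(err, data) {
      if (err) throw err;
      console.log('read %d finished after %d ms', n, Date.now() - start);
    });
  })(i);
}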

In this example the Node API method fs.readFile is asynchronous, but that doesn't necessarily mean it's non-blocking underneath. As the libuv book points out, socket (network) code is non-blocking, but filesystems are more complicated. Some things are event-based (kqueue), others use threads, and I'm currently trying to work out what Windows is doing (I'm only a libuv tourist for now).

I'm collating my notes on libuv for a talk on Node's internals at The Great British Node Conference. The talk started more as a walkthrough of the core modules and binding, but I enjoyed looking at libuv in more detail so hopefully I can fit some of this stuff in there. It'd be good to see you there!


Tags: node, modules, unix

Unix: It's Alive!

Posted on .

On a philosophical level, Node developers love Unix. I like to think that's why Node's core modules are relatively lightweight compared to other standard libraries (is an FTP library really necessary?) -- Node's modules quietly get out of the way, allowing the community to provide solutions to higher-level problems.

As someone who sits inside tmux/Vim/ssh all day, I'm preoccupied with command-line tools and ways to work more efficiently in the shell. That's why I was intrigued to find bashful (GitHub: substack / bashful, License: MIT, npm: bashful) by substack. It allows Bash to be parsed and executed. To use it, hook it up with some streams:

var bash = require('bashful')(process.env);
bash.on('command', require('child_process').spawn);

var s = bash.createStream();
process.stdin.pipe(s).pipe(process.stdout);

After installing bashful, running this example with node sh.js will allow you to issue shell commands. Not all of Bash's built-in commands are supported yet (there's a list and to-do in the readme), but you should be able to execute commands and run true and false, then get the last exit status with echo $?.

How does this work? Well, the bashful module basically parses each line, character-by-character, to tokenise the input. It then checks anything that looks like a command against the list of built-in commands, and runs it. It mixes Node streams with a JavaScript bash parser to create a Bash-like layer that you can reuse with other streams.

This module depends on shell-quote, which correctly escapes those gnarly quotes in shell commands. I expect substack will make a few more shell-related modules as he continues work on bashful.

ShellJS (GitHub: arturadib / shelljs, npm: shelljs) by Artur Adib has been around for a while, but still receives regular updates. This module gives you shell-like commands in Node:


require('shelljs/global');

mkdir('-p', 'out/Release');
cp('-R', 'stuff/*', 'out/Release');

It can even mimic Make, so you could write your build scripts with it. This would make sense if you're sharing code with Windows-based developers.
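For instance, here's a rough sketch of a make-style build file -- the require('shelljs/make') entry point and the global target object come from ShellJS's readme, while the tasks themselves are made up for illustration:

// build.js -- `node build.js` runs target.all;
// `node build.js build` would run a single target
require('shelljs/make');

target.all = function() {
  target.build();
};

target.build = function() {
  mkdir('-p', 'out/Release');
  cp('-R', 'stuff/*', 'out/Release');
};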

There are plenty of other interesting Unix-related modules that are alive and regularly updated. One I was looking at recently is suppose (GitHub: jprichardson / node-suppose, License: MIT, npm: suppose) by JP Richardson, which is an expect(1) clone:

var suppose = require('suppose')  
  , fs = require('fs')
  , assert = require('assert')

suppose('npm', ['init'])  
  .on(/name\: \([\w|\-]+\)[\s]*/).respond('awesome_package\n')
  .on('version: (0.0.0) ').respond('0.0.1\n')
  // ...

It uses a chainable API to allow expect-like expressions to capture and react to the output of other programs.

Unix in the Node community is alive and well, but I'm sure there's also lots of Windows-related fun to be had -- assuming you can figure out how to use Windows 8 with a keyboard and mouse, that is...


Tags: testing, node, modules, time, async, promises, unix, daemons

Node Roundup: wish, Vow, shell-jobs

Posted on .

You can send in your Node projects for review through our contact form.

wish

wish (GitHub: EvanBurchard / wish, License: MIT, npm: wish) by Evan Burchard is an assertion module designed to raise meaningful, human-readable errors. When assertions fail, it parses the original source to generate a useful error message, which means the standard comparison operators can be used.

For example, if wish(a === 5) failed, an error like this would be displayed:

  Expected "a" to be equal(===) to "5".

If assert(a === 5) had been used instead, AssertionError: false == true would have been raised. A fairer comparison would be assert.equal, which would produce AssertionError: 4 == 5, but it's interesting that wish is able to introspect the variable name and add that to the error.
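A minimal sketch of that comparison, with made-up values:

var wish = require('wish');

var a = 4;
wish(a === 5);
// => Error: Expected "a" to be equal(===) to "5".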

Vow

Vow (GitHub: dfilatov / jspromise, License: MIT/GPL, npm: vow) by Dmitry Filatov is a Promises/A+ implementation. Promises can be created, fulfilled, and rejected -- you should be able to get the hang of it if you've used libraries with then methods elsewhere, but there are some differences from Promises/A that feel like they actually simplify some of the potentially messier parts of the original CommonJS specification.

Here's an example of the Vow API:

var Vow = require('vow');

var promise1 = Vow.promise(),
    promise2 = Vow.promise();

Vow.all([promise1, promise2, 3])
  .then(function(value) {
    // value is [1, 2, 3]
  });

promise1.fulfill(1);
promise2.fulfill(2);

The author has written some pretty solid looking tests, and benchmarks are included as well. The project performs favorably (comparing mean time and operations per second) with other popular promise libraries.

shell-jobs

I like seeing daemons made in Node, and Azer Koçulu recently sent in a cron-inspired daemon called shell-jobs (GitHub: azer / shell-jobs, License: MIT, npm: shell-jobs). It uses .jobs files that are intended to be human readable. All you need to do is write a shell command followed by a # => and then a time:

cowsay "Hello" > /tmp/jobs.log # => 2 minutes  

The shell-jobs script will then parse this file and output the following:

  jobs Starting "cowsay "Hello" > /tmp/jobs.log" [2 minutes] +2ms

After two minutes have passed, the job will be executed:

  exec 1. Running cowsay "Hello" > /tmp/jobs.log. +0ms


Tags: JSON, cli, node, modules, search, unix, sandbox

Node Roundup: 0.8.17, 0.9.6, gelf-node, jsong, Stuff.js

Posted on .

You can send in your Node projects for review through our contact form or @dailyjs.

Node 0.8.17, 0.9.6 (Unstable)

Node 0.8.17 was released last week with a security fix for TypedArrays, so you should upgrade if you're using them:

If user input can affect the size parameter in a TypedArray, an integer overflow vulnerability could allow an attacker to write to areas of memory outside the intended buffer.

The unstable branch also saw a new release with 0.9.6. The streams API has changed slightly again as it continues to be developed: Isaac Schlueter added the readable.push method, and this branch gets fixes for TypedArrays as well.
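If you haven't tried it yet, here's a quick sketch of the style readable.push enables: buffer some data, signal the end of the stream with null, then pipe as usual.

var Readable = require('stream').Readable;

var rs = new Readable();
rs.push('beep '); // buffer data for consumers
rs.push('boop\n');
rs.push(null);    // no more data

rs.pipe(process.stdout);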

gelf-node

I've had a lot of luck with ElasticSearch. The last time I used it was on a project that used Node HTTP crawlers to index thousands of sites, and it was all backed by ElasticSearch. It worked extremely well and I actually got paid! If you're also using ElasticSearch, then you might be interested in the gelf-node module (GitHub: robertkowalski / gelf-node, License: MIT, npm: gelf) by Robert Kowalski. It works with Graylog2, allowing messages to be sent from Node:

var Gelf = require('gelf');
var gelf = new Gelf({
  graylogPort: 12201,
  graylogHostname: '127.0.0.1',
  connection: 'wan',
  maxChunkSizeWan: 1420,
  maxChunkSizeLan: 8154
});

// The readme has an example message
gelf.emit('gelf.log', message);

Graylog2 itself is released under the GPL (version 3).

jsong

jsong (GitHub: textgoeshere / jsong, npm: jsong, License: MIT) by Dave Nolan is a CLI tool and module for filtering JSON. It's built with streamin and clarinet, and shows full paths to matches:

$ cat my.json | jsong -k 'z\wp'

foo.bar.zip: val1  
foo.bar.zap: val2  
quux.zip: val  

Because it's built using streams, it should handle large JSON files.

Stuff.js

Here's another project by Amjad Masad from Codecademy: Stuff.js (GitHub: Codecademy / stuff.js, License: MIT) -- an easy way to run arbitrary HTML and JavaScript in an iframe. It uses node-static and uglify-js to create a sandbox for securely running user-contributed code.

There's an example in Amjad's blog post that shows how to use it:

stuff(secureIframeUrl, function (context) {
  var html = CodeMirror.fromTextArea($('#html'), {
    onChange: reload
  , mode: 'text/html'
  });
  var js = CodeMirror.fromTextArea($('#js'), {
    onChange: reload
  , mode: 'javascript'
  });
  var css = CodeMirror.fromTextArea($('#css'), {
    onChange: reload
  , mode: 'css'
  });

  var t = null;
  function reload () {
    clearTimeout(t);
    t = setTimeout(function () {
      var code = '<!DOCTYPE html><html><head>';
      code += '<style>' + css.getValue() + '</style>';
      code += '</head><body>' + html.getValue();
      code += '<script>' + js.getValue() + '</script>';
      code += '</body></html>';
      // Hand the assembled document to the sandboxed iframe
      context.load(code);
    }, 50);
  }
});


Tags: node, unix, daemons

Node Daemon Architecture

Posted on .

I've been researching the architecture of application layer server implementations in Node. I'm talking SMTP, IMAP, NNTP, Telnet, XMPP, and all that good stuff.

Node has always seemed like the perfect way to write network-oriented daemons. If you're a competent JavaScript developer, it unlocks powerful asynchronous I/O features. In The Architecture of Open Source Applications: nginx by Andrew Alexeev, the author explains nginx's design in detail -- in case you don't know, nginx is an HTTP daemon that's famous for solid performance. Andrew's review states the following:

It was actually inspired by the ongoing development of advanced event-based mechanisms in a variety of operating systems. What resulted is a modular, event-driven, asynchronous, single-threaded, non-blocking architecture which became the foundation of nginx code.


Connections are processed in a highly efficient run-loop in a limited number of single-threaded processes called workers. Within each worker nginx can handle many thousands of concurrent connections and requests per second.

Highly efficient run-loop and event-based mechanisms? That sounds exactly like a Node program! In fact, Node comes with several built-in features that make dealing with such an architecture a snap.

EventEmitter

If you read DailyJS you probably know all about EventEmitter. If not, then know that it's the heart of Node's event-based APIs. Learn EventEmitter and the Stream API, and you'll be able to pick up Node's other APIs very quickly.

EventEmitter is the nexus of Node's APIs. You'll see it underlying the network APIs, including the HTTP and HTTPS servers. You can happily stuff it into your own classes with util.inherits -- and you should! At this point, many popular third-party Node modules use EventEmitter or one of its descendants as a base class.
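A minimal sketch of that inheritance-based approach, following the pattern in Node's documentation (the Daemon class is made up for illustration):

var EventEmitter = require('events').EventEmitter;
var util = require('util');

// A made-up class that inherits EventEmitter
function Daemon() {
  EventEmitter.call(this);
}

util.inherits(Daemon, EventEmitter);

var daemon = new Daemon();
daemon.on('connection', function(name) {
  console.log('new connection from', name);
});
daemon.emit('connection', 'client-1');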

If you're designing a server of some kind, it would be wise to consider basing it around EventEmitter. And once you realise how common this is, you'll find all kinds of ways to improve the design of everything from daemons to web applications. For example, if I need to notify disparate entities within an Express application that something has happened, knowing that Express mixes EventEmitter into the app object means I can do things like app.on and app.emit rather than requiring access to a global app object.
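For example (the event name and route here are hypothetical):

var express = require('express');
var app = express();
app.use(express.bodyParser());

// Somewhere central, register a listener on the app object:
app.on('user:created', function(user) {
  // send a welcome email, update caches, and so on
});

// Later, in any route or middleware with access to `app`:
app.post('/users', function(req, res) {
  var user = { name: req.body.name };
  app.emit('user:created', user);
  res.send(201);
});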

The process Object

Guess what else is an instance of EventEmitter? The process global object. It can be used to manage the current process -- including events for signals.
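A sketch of event-based signal handling -- what you do inside the handlers is up to you:

// `process` is an EventEmitter, so signals arrive as events
process.on('SIGTERM', function() {
  console.log('Got SIGTERM, shutting down gracefully');
  // close servers and flush logs here, then:
  process.exit(0);
});

process.on('SIGHUP', function() {
  console.log('Got SIGHUP, reloading configuration');
});

console.log('Running as pid %d; try kill -HUP %d', process.pid, process.pid);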

Domains

Domains can be used to group I/O operations -- that means working with errors in nested callbacks is a little bit less painful:

If any of the event emitters or callbacks registered to a domain emit an error event, or throw an error, then the domain object will be notified, rather than losing the context of the error in the process.on('uncaughtException') handler, or causing the program to exit with an error code.

Domains are currently experimental, but from my own experiences writing long-running daemons with Node, they definitely bring a level of sanity to my spaghetti code.
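Here's a minimal sketch of the API; the error thrown inside the callback is routed to the domain's error handler instead of crashing the process:

var domain = require('domain');
var fs = require('fs');

var d = domain.create();

d.on('error', function(err) {
  // The error keeps its context instead of becoming an uncaughtException
  console.error('Caught by the domain:', err.message);
});

d.run(function() {
  fs.readFile('does-not-exist.txt', function(err, data) {
    if (err) throw err; // routed to d's 'error' listener
  });
});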

Cluster

The Cluster module is also experimental, but makes it easier to spawn multiple Node processes that share server ports. These processes, or workers, can communicate using IPC (Inter-process communication) -- all using the EventEmitter-based API you know and love.
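The canonical pattern looks something like this sketch (the port number is arbitrary):

var cluster = require('cluster');
var http = require('http');
var numCPUs = require('os').cpus().length;

if (cluster.isMaster) {
  // Fork one worker per CPU; they all share the same port
  for (var i = 0; i < numCPUs; i++) {
    cluster.fork();
  }

  cluster.on('exit', function(worker) {
    console.log('worker ' + worker.process.pid + ' died');
  });
} else {
  http.createServer(function(req, res) {
    res.end('handled by worker ' + process.pid + '\n');
  }).listen(8000);
}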

In the Wild

I've already mentioned that Express "mixes in" EventEmitter. This is in contrast to the inheritance-based approach detailed in Node's documentation. Strictly speaking, it isn't Express that does this at all -- it's actually done by Connect, in connect.js:

function createServer() {
  function app(req, res){ app.handle(req, res); }
  utils.merge(app, proto);
  utils.merge(app, EventEmitter.prototype);
  app.route = '/';
  app.stack = [];
  for (var i = 0; i < arguments.length; ++i) {
    app.use(arguments[i]);
  }
  return app;
}
The utils.merge method copies properties from one object to another:

exports.merge = function(a, b){
  if (a && b) {
    for (var key in b) {
      a[key] = b[key];
    }
  }
  return a;
};

There's also a unit test that confirms that the authors intended to mix in EventEmitter.

An extremely popular way to daemonize (never demonize, which means to "portray as wicked and threatening") a program is to use the forever module. It can be used as a command-line script or as a module, and is built on some modules that are useful for creating Node daemons, like forever-monitor and winston.

However, what I'm really interested in is the architecture of modules that provide services rather than utility modules for managing daemons. One such example is statsd from Etsy. It's a network daemon for collecting statistics. The core server code, stats.js, uses net.createServer and a switch statement to execute commands based on the server's protocol. Notable uses of EventEmitter include backendEvents for asynchronously communicating with the data storage layer, and automatic configuration file reloading. I particularly like the fact the configuration file is reloaded -- it's a good use of Node's built-in features.
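As a simplified sketch of that net.createServer-plus-switch pattern -- loosely modelled on statsd's management interface rather than copied from it:

var net = require('net');

var counters = {}; // imagine this is populated by the collector

net.createServer(function(socket) {
  socket.setEncoding('utf8');
  socket.on('data', function(data) {
    var cmd = data.trim().split(' ')[0];
    switch (cmd) {
      case 'counters':
        socket.write(JSON.stringify(counters) + '\n');
        break;
      case 'help':
        socket.write('Commands: counters, help, quit\n');
        break;
      case 'quit':
        socket.end();
        break;
      default:
        socket.write('ERROR\n');
    }
  });
}).listen(8126);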

James Halliday's smtp-protocol can be used to implement SMTP servers (it isn't itself an SMTP server). The server part of the module is based around a protocol parser (ServerParser, a prototype class) and a class for representing clients (Client). Servers are created using net.createServer, much like the other projects I've already mentioned.

This module is useful because it demonstrates how to separate low-level implementation details from the high-level concerns of implementing a real production-ready server. Completely different SMTP servers could be built using smtp-protocol as the foundation. Real SMTP servers need to deal with things like relaying messages, logging, and managing settings, so James has separated that out whilst retaining a useful level of functionality for his module.
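To give a flavour of the module, here's a sketch based on my reading of its readme -- treat the event names and options as assumptions and check the project for the exact API:

var smtp = require('smtp-protocol');

// A toy server that accepts every message and dumps it to stdout
var server = smtp.createServer(function (req) {
  req.on('message', function (stream, ack) {
    console.log('from: ' + req.from);
    console.log('to: ' + req.to);
    stream.pipe(process.stdout, { end: false });
    ack.accept();
  });
});

server.listen(1025);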

I've also been reading through the telnet module, which like smtp-protocol can be used to implement a telnet server.

At the moment there seems to be a void between these reusable server modules and daemons that can be installed on production servers. Node makes asynchronous I/O more accessible, which will lead to novel server implementations like Etsy's stats server. If you've got an idea for a niche application layer server, then why not build it with Node?