DailyJS

DailyJS

The JavaScript blog.


Taglibuv
Featured

node unix libuv

From fs.readFile to read(2)

Posted on .

Here's a statement you might hear used to describe Node:

Node uses asynchronous I/O to avoid using threads for fast filesystem and network operations.

This statement is false (in more ways than one), and understanding why will help you better understand Node.

First, consider the question: what is asynchronous?

fs.readFile('file.txt', function(err, data) {  
  if (err) throw err;
  console.log(data);
});

That is an asynchronous API. The program will continue running after the fs.readFile. The callback may be run at some point in the future. As JavaScript programmers we like this because JavaScript's scoping rules mean the callback gets a closure, so we find it easy to reason about what variables are available to the callback, while being able to do thing while something potentially slow (I/O, web requests) responds.

Even if you're not a Node programmer, you're probably comfortable with the last example. It's not amazingly different to a jQuery Ajax request, where callbacks are passed to $.ajax.

Now, assuming you're not a C/C++ programmer and don't know too much about Node's internals, here's the amazing thing about fs.readFile -- it calls a (potentially) blocking system call.

How is that possible? We know our program continues when we try to read a file, so what's going on? The first step is to look at Node's fs source.

Bindings

If you look at the source for fs.readFile in lib/fs.js, you'll see binding.read. Whenever you see binding in Node's core modules you're looking at a portal into the land of C++. This binding is made available using NODE_SET_METHOD(target, "read", Read). If you know any C, you might think this is a macro -- it was originally, but it's now a function. The reference to Read here is a C++ function in src/node_file.cc.

Like other filesystem code in Node, Read defines a synchronous and asynchronous version using ASYNC_CALL and SYNC_CALL. These are macros used to bind to libuv. If you dig into libuv, which you can find in deps/uv or on GitHub at joyent / libuv, then you'll discover something interesting: the filesystem code actually uses libuv's own streams, buffers, and native filesystem wrapping code.

Filesystem Wrapping

Going back to ASYNC_CALL in Read, one of the arguments is read: the syscall read (help: man 2 read). But wait, doesn't this function block? Yes, but that's not the end of the story. As summarised in An Introduction to libuv:

The libuv filesystem operations are different from socket operations. Socket operations use the non-blocking operations provided by the operating system. Filesystem operations use blocking functions internally, but invoke these functions in a thread pool and notify watchers registered with the event loop when application interaction is required.

Let's look a little deeper. The ASYNC_CALL macro uses FSReqWrap and calls uv_fs_ methods. The one that gets bound to for reading is uv_fs_read:

int uv_fs_read(uv_loop_t* loop, uv_fs_t* req,  
               uv_file file,
               void* buf,
               size_t len,
               int64_t off,
               uv_fs_cb cb) {
  INIT(READ);
  req->file = file;
  req->buf = buf;
  req->len = len;
  req->off = off;
  POST;
}

What's that POST macro at the end? It checks to see if a callback has been provided, and if so uses uv__work_submit from src/unix/threadpool.c to read the file in a thread queue. Notice I'm just talking about Unix here, but in following fs.readFile down to the syscall that does the work it's interesting to find uv__work_submit.

In this example the Node API method fs.readFile is asynchronous, but that doesn't necessarily mean it's non-blocking underneath. As the libuv book points out, socket (network) code is non-blocking, but filesystems are more complicated. Some things are event-based (kqueue), others use threads, and I'm currently trying to work out what Windows is doing (I'm only a libuv tourist for now).

I'm collating my notes on libuv for a talk on Node's internals at The Great British Node Conference. The talk started more as a walkthrough of the core modules and binding, but I enjoyed looking at libuv in more detail so hopefully I can fit some of this stuff in there. It'd be good to see you there!