DailyJS

The JavaScript blog.

LevelDB and Node: Getting Up and Running

This is the second article in a three-part series on LevelDB and how it can be used in Node.

Our first article covered the basics of LevelDB and its internals. If you haven't already read it, you're encouraged to do so, as we'll be building on that material as we introduce the Node interface in this article.

There are two primary libraries for using LevelDB in Node, LevelDOWN and LevelUP.

LevelDOWN is a pure C++ interface between Node.js and LevelDB. Its API provides limited sugar and is mostly a straightforward mapping of LevelDB's operations into JavaScript. All I/O operations in LevelDOWN are asynchronous and take advantage of LevelDB's thread-safe nature to parallelise reads and writes.

LevelUP is the library that the majority of people will use to interface with LevelDB in Node. It wraps LevelDOWN to provide a more Node.js-style interface. Its API provides more sugar than LevelDOWN, with features such as optional arguments and deferred-till-open operations (i.e. if you begin operating on a database that is in the process of being opened, the operations will be queued until the open is complete).

LevelUP exposes iterators as Node.js-style object streams. A LevelUP ReadStream can be used to read sequential entries, forward or reverse, to and from any key.

LevelUP handles JSON and other encoding types for you. For example, when operating on a LevelUP instance with JSON value-encoding, you simply pass in your objects for writes and they are serialised for you. Likewise, when you read them, they are deserialised and passed back in their original form.

A simple LevelUP example

var levelup = require('levelup')

// open a data store
var db = levelup('/tmp/dprk.db')

// a simple Put operation
db.put('name', 'Kim Jong-un', function (err) {

  // a Batch operation made up of 3 Puts
  db.batch([
      { type: 'put', key: 'spouse', value: 'Ri Sol-ju' }
    , { type: 'put', key: 'dob', value: '8 January 1983' }
    , { type: 'put', key: 'occupation', value: 'Clown' }
  ], function (err) {

    // read the whole store as a stream and print each entry to stdout
    db.createReadStream()
      .on('data', console.log)
      .on('close', function () {
        db.close()
      })
  })
})

Execute this application and you'll end up with this output:

{ key: 'dob', value: '8 January 1983' }
{ key: 'name', value: 'Kim Jong-un' }
{ key: 'occupation', value: 'Clown' }
{ key: 'spouse', value: 'Ri Sol-ju' }

Basic operations

Open

There are two ways to create a new LevelDB store, or open an existing one:

levelup('/path/to/database', function (err, db) {  
  /* use `db` */
})

// or

var db = levelup('/path/to/database')  
/* use `db` */

The first version is a more standard Node-style async instantiation. You only start using the db when LevelDB is set up and ready.

The second version is a little more opaque. It looks like a synchronous operation, but the actual open call is still asynchronous; you simply get a LevelUP object back immediately to use. Any calls you make on that object that need to operate on the underlying LevelDB store are queued until the store is ready to accept them. The actual open operation is very quick, so the initial delay is generally not noticeable.

Close

To close a LevelDB store, simply call close() and your callback will be called when the underlying store is completely closed:

// close to clean up
db.close(function (err) { /* ... */ })  

Read, write and delete

Reading and writing are what you would expect for asynchronous Node methods:

db.put('key', 'value', function (err) { /* ... */ })

db.del('key', function (err) { /* ... */ })

db.get('key', function (err, value) { /* ... */ })  
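
One thing to note: a get() for a key that doesn't exist produces an error rather than an undefined value. A minimal sketch of distinguishing a missing key from a genuine failure, using the notFound property that LevelUP attaches to such errors (the 'dictator' key is just a hypothetical example):

db.get('dictator', function (err, value) {
  if (err) {
    if (err.notFound) {
      // the key simply isn't in the store
      return console.log('no such key')
    }
    // a genuine I/O or internal error
    throw err
  }
  console.log('dictator =', value)
})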

Batch

As mentioned in the first article, LevelDB has a batch operation that performs atomic writes. These writes can be either put or delete operations.

LevelUP takes an array of operations to perform a batch; each element of the array is either a 'put' or a 'del':

var operations = [  
    { type: 'put', key: 'Franciscus', value: 'Jorge Mario Bergoglio' }
  , { type: 'del', key: 'Benedictus XVI' }
]

db.batch(operations, function (err) { /* ... */ })  

Streams!

LevelUP turns LevelDB's Iterators into Node's readable streams, making them surprisingly powerful as a query mechanism.

LevelUP's ReadStreams share all the same characteristics as standard Node readable object streams, such as being able to pipe() to other streams. They also emit all of the expected events.

var rs = db.createReadStream()

// our new stream will emit a 'data' event for every entry in the store

rs.on('data' , function (data) { /* data.key & data.value */ })  
rs.on('error', function (err) { /* handle err */ })  
rs.on('close', function () { /* stream finished & cleaned up */ })  

But it's the various options for createReadStream(), combined with the fact that LevelDB sorts entries by key, that make it such a powerful abstraction:

db.createReadStream({  
    start     : 'somewheretostart'
  , end       : 'endkey'
  , limit     : 100           // maximum number of entries to read
  , reverse   : true          // flip direction
  , keys      : true          // see db.createKeyStream()
  , values    : true          // see db.createValueStream()
})

'start' and 'end' point to keys in the store. These don't even need to exist as actual keys, because LevelDB will simply jump to the next existing key in lexicographical order. We'll see later why this is helpful when we explore namespacing and range queries.

LevelUP also provides a WriteStream which maps write() operations to Puts or Batches.
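
A rough sketch of using a WriteStream directly, where each write() takes an entry in { key: ..., value: ... } form and end() signals that no more writes are coming:

var ws = db.createWriteStream()

ws.on('error', function (err) { /* handle err */ })
ws.on('close', function () { /* all writes have been flushed to the store */ })

ws.write({ key: 'name', value: 'Kim Jong-un' })
ws.write({ key: 'spouse', value: 'Ri Sol-ju' })
ws.end()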

Since ReadStream and WriteStream follow standard Node.js stream patterns, a copy database operation is simply a pipe() call:

function copy (srcdb, destdb, callback) {  
  srcdb.createReadStream()
    .pipe(destdb.createWriteStream())
    .on('error', callback)
    .on('close', callback)
}

Encoding

LevelUP will accept most kinds of JavaScript objects, including Node's Buffers, as both keys and values for all its operations. LevelDB stores everything as simple byte arrays so most objects need to be encoded and decoded as they go in and come out of the store.

You can specify the encoding of a LevelUP instance and you can also specify the encoding of individual operations. This means that you can easily store text and binary data in the same store.

'utf8' is the default encoding but you can change that to any of the standard Node Buffer encodings. You can also use the special 'json' encoding:

var db = levelup('/tmp/dprk.db', { valueEncoding: 'json' })

db.put(  
    'dprk'
  , {
        name       : 'Kim Jong-un'
      , spouse     : 'Ri Sol-ju'
      , dob        : '8 January 1983'
      , occupation : 'Clown'
    }
  , function (err) {
      db.get('dprk', function (err, value) {
        console.log('dprk:', value)
        db.close()
      })
    }
)

This gives you the following output:

dprk: { name: 'Kim Jong-un',  
  spouse: 'Ri Sol-ju',
  dob: '8 January 1983',
  occupation: 'Clown' }
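
As mentioned earlier, encodings can also be specified per operation rather than for the whole store. A minimal sketch of mixing encodings this way, keeping the store-wide default of 'utf8' but writing and reading one value as JSON (the '/tmp/example.db' path is just a placeholder):

var db = levelup('/tmp/example.db') // default 'utf8' encoding

// override the value encoding for this particular put and get
db.put('dprk', { name: 'Kim Jong-un' }, { valueEncoding: 'json' }, function (err) {
  db.get('dprk', { valueEncoding: 'json' }, function (err, value) {
    console.log(value.name) // 'Kim Jong-un'
    db.close()
  })
})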

Advanced example

In this example we assume the data store contains numeric data, where ranges of data are stored with prefixes on the keys. Our example function takes a LevelUP instance and a range key prefix and uses a ReadStream to calculate the variance of the values in that range using an online algorithm:

function variance (db, prefix, callback) {  
  var n = 0, m2 = 0, mean = 0

  db.createReadStream({
        start : prefix          // jump to first key with the prefix
      , end   : prefix + '\xFF' // stop at the last key with the prefix
    })
    .on('data', function (data) {
      var delta = data.value - mean
      mean += delta / ++n
      m2 = m2 + delta * (data.value - mean)
    })
    .on('error', callback)
    .on('close', function () {
      callback(null, m2 / (n - 1))
    })
}

Let's say you're collecting temperature data and storing your keys in the form location~timestamp. Sampling approximately every 5 seconds and recording temperatures in Celsius, we might end up with data that looks like this:

au_nsw_southcoast~1367487282112 = 18.23  
au_nsw_southcoast~1367487287114 = 18.22  
au_nsw_southcoast~1367487292118 = 18.23  
au_nsw_southcoast~1367487297120 = 18.23  
au_nsw_southcoast~1367487302124 = 18.24  
au_nsw_southcoast~1367487307127 = 18.24  
...

To calculate the variance for that range, efficiently streaming values from our store, we simply call our function:

variance(db, 'au_nsw_southcoast~', function (err, v) {  
  /* v = variance */
})

Namespacing

The concept of namespacing keys will probably be familiar if you're used to using a key/value store of some kind. By separating keys by prefixes we create discrete buckets, much like a table in a traditional relational database is used to separate different kinds of data.

It may be tempting to create separate LevelDB stores for different buckets of data but you can take better advantage of LevelDB's caching mechanisms if you can keep the data organised in a single store.

Because LevelDB is sorted, choosing a namespace separator character can have an impact on the order of your entries. A separator commonly used in NoSQL databases is ':'. However, this character lands in the middle of the range of printable ASCII characters (character code 58), so your entries may not end up being sorted in a useful order.

Imagine you're implementing a web server session store with LevelDB and you're prefixing keys with usernames. You may have entries that look like this:

rod.vagg:last_login    = 1367487479499  
rod.vagg:default_theme = psychedelic  
rod1977:last_login     = 1367434022300  
rod1977:default_theme  = disco  
rod:last_login         = 1367488445080  
rod:default_theme      = funky  
roderick:last_login    = 1367400900133  
roderick:default_theme = whoa  

Note that these entries are sorted and that '.' (character code 46) and '1' (character code 49) come before ':'. This may or may not matter for your particular application, but there are better ways to approach namespacing.

Recommended delimiters

At the beginning of the printable ASCII character range is '!' (character code 33), and at the end we find '~' (character code 126). Using these characters as a delimiter we find the following sorting for our keys:

rod!...  
rod.vagg!...  
rod1977!...  
roderick!...  

or

rod.vagg~...  
rod1977~...  
roderick~...  
rod~...  

But why stick to the printable range? We can go right to the edges of the single-byte character range and use '\x00' (null) or '\xff' (ÿ).

For best sorting of your entries, choose '\x00' (or '!' if you really can't stomach it). But whatever delimiter you choose, you still need to control the characters that are allowed in your keys. Letting user input determine your keys without stripping out the delimiter character could result in the NoSQL equivalent of an SQL injection attack: consider the unintended consequences that could arise in the dataset above if the delimiter were '!' and a user were allowed to include that character in their username.
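
One way to guard against this, sketched below with a hypothetical makeKey() helper, is to build every key through a single function that strips the delimiter out of user-supplied parts before joining them:

var SEP = '\x00'

// hypothetical helper: remove the delimiter from each part, then join
function makeKey (namespace, field) {
  namespace = String(namespace).split(SEP).join('')
  field = String(field).split(SEP).join('')
  return namespace + SEP + field
}

// even a malicious username can't escape its own namespace
var username = 'rod\x00last_login'
db.put(makeKey(username, 'default_theme'), 'funky', function (err) { /* ... */ })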

Range queries

LevelUP's ReadStream is the perfect range query mechanism. By combining 'start' and 'end', which only need to be approximations of actual keys, you can pluck out exactly the entries you want.

Using our namespaced dataset above, with '\x00' as the delimiter, we can fetch all entries for just a single user by crafting a ReadStream range query:

var entries = []

db.createReadStream({ start: 'rod\x00', end: 'rod\x00\xff' })  
  .on('data', function (entry) { entries.push(entry) })
  .on('close', function () { console.log(entries) })

Would give us:

[ { key: 'rod\x00last_login', value: '1367488445080' },
  { key: 'rod\x00default_theme', value: 'funky' } ]

The '\xff' comes in handy here because it sorts after every other single-byte character, so the range includes every key beginning with 'rod\x00', as long as the rest of the key doesn't itself start with '\xff'. So again, you need to control the allowable characters in your keys in order to avoid surprises.

Namespacing and range queries are heavily used by many of the libraries that extend LevelUP. In the final article in this series we'll be exploring some of the amazing ways that developers are extending LevelUP to provide additional features, applications and complete databases.

If you want to jump ahead, visit the Modules page on the LevelUP wiki.

LevelDB and Node: What is LevelDB Anyway?

This is the first article in a three-part series on LevelDB and how it can be used in Node.

This article will cover the LevelDB basics and internals to provide a foundation for the next two articles. The second and third articles will cover the core LevelDB Node libraries: LevelUP, LevelDOWN and the rest of the LevelDB ecosystem that's appearing in Node-land.

What is LevelDB?

LevelDB is an open-source, dependency-free, embedded key/value data store. It was developed in 2011 by Jeff Dean and Sanjay Ghemawat, researchers from Google. It's written in C++, although it has third-party bindings for most common programming languages, including JavaScript / Node.js of course.

LevelDB is based on ideas in Google's BigTable but does not share code with BigTable, which allows it to be licensed for open source release. Dean and Ghemawat developed LevelDB as a replacement for SQLite as the backing-store for Chrome's IndexedDB implementation.

It has since seen very wide adoption across the industry, serving as the back-end for a number of new databases, and is now the recommended storage back-end for Riak.

Features

  • Arbitrary byte arrays: both keys and values are treated as simple arrays of bytes, so content can be anything from ASCII strings to binary blobs.
  • Sorted by keys: by default, LevelDB stores entries lexicographically sorted by keys. The sorting is one of the main distinguishing features of LevelDB amongst similar embedded data storage libraries and comes in very useful for querying as we'll see later.
  • Compressed storage: Google's Snappy compression library is an optional dependency that can decrease the on-disk size of LevelDB stores with minimal sacrifice of speed. Snappy is highly optimised for fast compression and therefore does not provide particularly high compression ratios on common data.
  • Basic operations: Get(), Put(), Del(), Batch() (see the sketch below)
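
A minimal sketch of what those basic operations look like from Node via LevelUP, which the next article covers in detail (the '/tmp/example.db' path is just a placeholder):

var levelup = require('levelup')
var db = levelup('/tmp/example.db')

db.put('key', 'value', function (err) {   // Put
  db.get('key', function (err, value) {   // Get
    console.log(value)                    // prints 'value'
    db.del('key', function (err) {        // Del
      db.close()
    })
  })
})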

Basic architecture

Log Structured Merge (LSM) tree

All writes to a LevelDB store go straight into a log and a "memtable". The log is regularly flushed into sorted string table files (SST) where the data has a more permanent home.

Reads on a data store merge these two distinct data structures, the log and the SST files. The SST files represent mature data and the log represents new data, including delete-operations.

A configurable cache is used to speed up common reads. The cache can potentially be large enough to fit an entire active working set in memory, depending on the application.

Sorted String Table files (SST)

Each SST file is limited to ~2MB, so a large LevelDB store will have many of these files. Each SST file is divided internally into 4K blocks, each of which can be read in a single operation. The final block is an index that points to the start of each data block, along with the key of the entry at the start of that block. A Bloom filter is used to speed up lookups, allowing a quick scan of the index to find the block that may contain the desired entry.

Keys can have shared prefixes within blocks. Any common prefix for keys within a block will be stored once, with subsequent entries storing just the unique suffix. After a fixed number of entries within a block, the shared prefix is "reset", much like a keyframe in a video codec. Shared prefixes mean that verbose namespacing of keys does not lead to excessive storage requirements.

Table file hierarchy

The table files are not stored in a simple sequence; rather, they are organised into a series of levels. This is the "Level" in LevelDB.

Entries that come straight from the log are organised into Level 0, a set of up to 4 files. When additional entries force Level 0 above its maximum of 4 files, one of the SST files is chosen and merged with the SST files that make up Level 1, which is a set of up to 10MB of files. This process continues, with levels overflowing and one file at a time being merged with the (up to 3) overlapping SST files in the next level. Each level beyond Level 1 is 10 times the size of the previous level.

Log: Max size of 4MB (configurable), then flushed into a set of Level 0 SST files
Level 0: Max of 4 SST files, then one file compacted into Level 1
Level 1: Max total size of 10MB, then one file compacted into Level 2
Level 2: Max total size of 100MB, then one file compacted into Level 3
Level 3+: Max total size of 10 x previous level, then one file compacted into next level

0 ↠ 4 SST, 1 ↠ 10M, 2 ↠ 100M, 3 ↠ 1G, 4 ↠ 10G, 5 ↠ 100G, 6 ↠ 1T, 7 ↠ 10T

This organisation into levels minimises the reorganisation that must take place as new entries are inserted into the middle of a range of keys. Each reorganisation, or "compaction", is restricted to just a small section of the data store. The hierarchical structure generally leads to data in the higher levels being the most mature, with fresher data being stored in the log and the initial levels. Since the initial levels are relatively small, overwriting and removing entries incurs less cost there than it does in the higher levels. This matches the typical database, where you have a large set of mature data and a more volatile set of fresh data (of course this is not always the case, so performance will vary with different write and retrieval patterns).

A lookup operation must also traverse the levels to find the required entry. A read that requests a given key must first look in the log; if the key is not found there, it looks in Level 0, then Level 1, and so forth. In this way, a lookup incurs a minimum of one read per level that must be searched before finding the required entry. A lookup for a key that does not exist must search every level before a definitive "NotFound" can be returned (unless a Del operation for that key is recorded in the log).

Advanced features

  • Batch operations: provide a collection of Put and/or Del operations that are atomic; that is, the whole collection of operations succeeds or fails as a single Batch operation.
  • Bi-directional iterators: iterators can start at any key in a LevelDB store (even if that key does not exist, it will simply jump to the next lexical key) and can move forward and backwards through the store.
  • Snapshots: a snapshot provides a reference to the state of the database at a point in time. Read-queries (Get and iterators) can be made against specific snapshots to retrieve entries as they existed at the time the snapshot was created. Each iterator creates an implicit snapshot (unless it is requested against an explicitly created snapshot). This means that regardless of how long an iterator is alive and active, the data set it operates upon will always be the same as at the time the iterator was created.

Some details on these advanced features will be covered in the next two articles, when we turn to look at how LevelDB can be used to simplify data management in your Node application.

If you're keen to learn more and can't wait for the next article, see the LevelUP project on GitHub as this is the focus of much of the LevelDB activity in the Node community at the moment.