Node Roundup

01 Dec 2010 | By Alex Young | Tags node server scraping services testing

Welcome to the Node Roundup. Send in your apps and libraries using our contact form or @dailyjs.

Node.io

node.io (MIT License) by Chris O’Hara is a distributed data scraping and processing framework. It uses node-htmlparser, node-soupselect, and multi-node to provide flexible and friendly tools for building scrapers.

There’s a node.io command-line script that runs jobs. Jobs look a bit like standard CommonJS modules:

var nodeio = require('node.io'),
    options = {},
    methods = {};

exports.job = new nodeio.Job(options, methods);

The methods object holds the run and fail methods, which are executed on each worker. Jobs can be linked together through STDIN and STDOUT.

I actually have a freelance client for whom I write a lot of scrapers (permission is given by the scraped sites), and I kept thinking Node would be a good choice for scraping. In particular, writing selectors with a jQuery-like API makes scraping pretty easy. I haven’t looked at how deep the HTTP manipulation can go, though: some sites I’ve scraped needed access to cookies and the full range of HTTP methods.

JsApp.US

JsApp.US by Matthew Francis-Landau (sent by @jefkoslowski) is a hosting platform for Node apps. It’s possible to run an app without registering, but user accounts can be created for features like sharing apps or accessing a virtual file system.

There’s a database API, and some sample apps.

I think the project might be open sourced at some point, because the GitHub repository that contains the wiki says “When the source becomes public, this is where it will be”.

Should

should.js (MIT License) by that stalwart JavaScript hacker TJ Holowaychuk is a test-framework-agnostic assertion library for Node. The syntax is possibly inspired by Thoughtbot’s shoulda, but it’s a little bit different:

var user = {
    name: 'tj'
  , pets: ['tobi', 'loki', 'jane', 'bandit']
};

user.should.have.property('name', 'tj');
user.should.have.property('pets').with.lengthOf(4);

It sounds like TJ has a lot of pets!

As you might have realised, this library extends Object.prototype with a getter. By using a single getter shared by every object, TJ has managed to cut down a lot of the line noise that function calls would typically create. This might put you off, but he addresses this decision in his README:

“OMG IT EXTENDS OBJECT???!?!@ Yes, yes it does, with a single getter should, and no it wont break your code, because it does this properly with a non-enumerable property.”

If any of this seems confusing, look at the documentation for defineProperty and read through lib/should.js:

Object.defineProperty(Object.prototype, 'should', {
  set: function(){},
  get: function(){
    return new Assertion(this);
  }
});
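
To see why a non-enumerable getter is harmless, here is a minimal sketch of the same technique. The Assertion constructor and its property method are simplified stand-ins written for this example, not the real should.js code, and the getter is attached to a local base object instead of Object.prototype so nothing global is touched:

```javascript
// Hypothetical stand-in for should.js's Assertion; not the library's real code.
function Assertion(obj) {
  this.obj = obj;
}

// Throws unless the wrapped object has the named property
// (and, when a second argument is given, that exact value).
Assertion.prototype.property = function (name, value) {
  if (!(name in Object(this.obj))) {
    throw new Error('expected object to have property ' + name);
  }
  if (arguments.length > 1 && this.obj[name] !== value) {
    throw new Error('expected ' + name + ' to equal ' + value);
  }
  return this; // returning the assertion allows chaining
};

// Assumption: a local prototype is used here rather than Object.prototype.
var base = {};
Object.defineProperty(base, 'should', {
  // 'enumerable' is omitted, so it defaults to false
  get: function () {
    return new Assertion(this);
  }
});

var user = Object.create(base);
user.name = 'tj';

// for...in walks inherited enumerable properties, so 'should' is skipped.
var keys = [];
for (var key in user) {
  keys.push(key);
}

user.should.property('name', 'tj'); // passes silently
console.log(keys); // → [ 'name' ]
```

Because defineProperty leaves enumerable set to false unless told otherwise, existing for...in loops never see the extra property, which is the point the README makes.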
