DailyJS

Node Roundup

Alex R. Young

server testing services node scraping

Node Roundup

Posted by Alex R. Young on .

Welcome to the Node Roundup. Send in your apps and libraries using our
contact form or @dailyjs.

Node.io

node.io (MIT License) by Chris O'Hara is a distributed data scraping and processing framework. It uses
node-htmlparser, node-soupselect, and multi-node to provide flexible and friendly tools for building scrapers.

There's a node.io command-line script that runs jobs. Jobs look a bit
like standard CommonJS modules:

var nodeio = require('node.io'),
    options = {},
    methods = {};

exports.job = new nodeio.Job(options, methods);

The methods object contains run and fail methods, which are run on
each worker. Jobs can be linked together through STDIN and STDOUT.

I actually have a freelance client for whom I write a lot of scrapers
(permission is given by the scraped sites), and I kept thinking Node would be a good choice for scraping. In particular, writing selectors
with a jQuery-like API makes scraping pretty easy. I haven't looked at
how deep the HTTP manipulation can get; I've needed access to cookies
and the full range of HTTP methods to scrape some sites.

JsApp.US

JsApp.US by Matthew Francis-Landau (sent by @jefkoslowski) is a hosting platform for Node apps. It's possible to run an app without
registering, but user accounts can be created for features like sharing
apps or accessing a virtual file system.

There's a database API, and some sample apps.

I think the project might be open sourced at some point, because the
GitHub repository that contains the wiki says "When the source becomes
public, this is where it will be".

Should

should.js (MIT License) by that stalwart JavaScript hacker TJ Holowaychuk is a test-framework-agnostic
assertion library for Node. The syntax is possibly inspired by
Thoughtbot's shoulda, but it's a little bit different:

var user = {
    name: 'tj'
  , pets: ['tobi', 'loki', 'jane', 'bandit']
};

user.should.have.property('name', 'tj');
user.should.have.property('pets').with.lengthOf(4);

It sounds like TJ has a lot of pets!

As you might have realised, this library extends Object.prototype
with a getter. By using a single getter on Object.prototype, TJ has
managed to cut down a lot of the line noise that would typically be created
by using function calls. This might put you off, but he addresses this
decision in his README:

OMG IT EXTENDS OBJECT???@ Yes, yes it does, with a single getter should, and no it wont break your code, because it does this properly with a non-enumerable property.

If any of this seems confusing, look at the documentation for
defineProperty and read through
lib/should.js:

Object.defineProperty(Object.prototype, 'should', {
  set: function(){},
  get: function(){
    return new Assertion(this);
  }
});
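To see why the non-enumerable part matters, here's a minimal, self-contained sketch of the same pattern. The Assertion constructor and its property method below are simplified stand-ins I've made up for illustration; the real ones in lib/should.js do a lot more:

```javascript
// Hypothetical, stripped-down Assertion; not the real should.js class.
function Assertion(obj) {
  this.obj = obj;
}

// Illustrative helper: check that a property exists and, optionally, its value.
Assertion.prototype.property = function (name, expected) {
  if (!(name in this.obj)) {
    throw new Error('expected object to have a property ' + name);
  }
  if (arguments.length > 1 && this.obj[name] !== expected) {
    throw new Error('expected property ' + name + ' to be ' + expected);
  }
  return this;
};

// Same shape as the snippet from lib/should.js. Properties created with
// defineProperty are non-enumerable unless enumerable: true is passed.
Object.defineProperty(Object.prototype, 'should', {
  // The empty setter means assignments to .should are silently ignored.
  set: function () {},
  get: function () {
    return new Assertion(this);
  }
});

var user = { name: 'tj' };

// Every object now inherits the getter.
user.should.property('name', 'tj');

// for...in walks inherited enumerable properties, but 'should' is not one.
var keys = [];
for (var key in user) {
  keys.push(key);
}
console.log(keys); // 'should' is not listed
```

Because the getter never shows up in for...in loops or Object.keys, existing code that iterates over objects carries on as before.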