DailyJS

Code Review: CoffeeScript

Alex R. Young

language code-review

Posted by Alex R. Young on .
*Code Review* is a series on DailyJS where I take a look at an open source project to see how it's built. Along the way we'll learn patterns and techniques used by JavaScript masters. If you're looking for tips to write better apps, or just want to see how they're structured in established projects, then this is the tutorial series for you.

This week's code review is on
CoffeeScript.

Why? Well, I really wanted to compare the source of CoffeeScript to
Jade because I know TJ Holowaychuk wrote Jade pretty much from scratch whereas CoffeeScript is
built with Jison. I find Jade's source
very readable and clean, so I wanted to see what value Jison brought to
CoffeeScript.

About CoffeeScript

CoffeeScript (GitHub: jashkenas / coffee-script, npm:
coffee-script, License: MIT License) by Jeremy Ashkenas is a language that compiles to JavaScript ahead of time, so what actually runs is plain
JavaScript and there's no interpretation at runtime.

The compiled output is readable and pretty-printed, passes through JavaScript Lint without warnings, will work in every JavaScript implementation, and tends to run as fast or faster than the equivalent handwritten JavaScript.

Installation

Either check out the source from GitHub, download a tarball, or use npm:

npm install -g coffee-script

Usage

CoffeeScript is basically a command-line tool, which can be run with
coffee:

alex@b ~$ coffee -h

Usage: coffee [options] path/to/script.coffee

  -c, --compile      compile to JavaScript and save as .js files
  -i, --interactive  run an interactive CoffeeScript REPL
  -o, --output       set the directory for compiled JavaScript
  -j, --join         concatenate the scripts before compiling
  -w, --watch        watch scripts for changes, and recompile
  -p, --print        print the compiled JavaScript to stdout
  -l, --lint         pipe the compiled JavaScript through JavaScript Lint
  -s, --stdio        listen for and compile scripts over stdio
  -e, --eval         compile a string from the command line
  -r, --require      require a library before executing your script
  -b, --bare         compile without the top-level function wrapper
  -t, --tokens       print the tokens that the lexer produces
  -n, --nodes        print the parse tree that Jison produces
      --nodejs       pass options through to the "node" binary
  -v, --version      display CoffeeScript version
  -h, --help         display this help message

Structure

The JavaScript in lib/ is actually generated from the CoffeeScript
source found in src/, so the compiler is written in CoffeeScript itself.

Everything stems from bin/coffee which loads
lib/command.js to parse command-line options and then uses
lib/coffee-script.js to do the real work. This file loads
lib/parser.js and lib/lexer.js. The lexer
generates a sequence of tokens from the CoffeeScript input, then the
Jison parser parses these tokens:

return (parser.parse(lexer.tokenize(code))).compile(options);
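
To make the shape of that one-liner concrete, here's a minimal sketch of the same three stages for a toy language that only knows integer addition. The names tokenize, parse, Num, and Add are invented for this illustration; none of this is CoffeeScript's real implementation.

```javascript
// Stage 1: a toy lexer. It only recognises integers and "+";
// anything else is silently ignored to keep the sketch short.
function tokenize(code) {
  const tokens = [];
  for (const text of code.match(/\d+|\+/g) || []) {
    tokens.push(/\d/.test(text) ? ['NUMBER', text] : ['+', text]);
  }
  return tokens;
}

// Node types: each one knows how to compile itself to JavaScript.
class Num {
  constructor(value) { this.value = value; }
  compile() { return this.value; }
}

class Add {
  constructor(left, right) { this.left = left; this.right = right; }
  compile() { return this.left.compile() + ' + ' + this.right.compile(); }
}

// Stage 2: a toy parser for NUMBER ('+' NUMBER)*, building a tree of nodes.
function parse(tokens) {
  let pos = 0;
  function number() {
    const token = tokens[pos++];
    if (!token || token[0] !== 'NUMBER') throw new SyntaxError('expected a number');
    return new Num(token[1]);
  }
  let node = number();
  while (pos < tokens.length) {
    if (tokens[pos++][0] !== '+') throw new SyntaxError('expected +');
    node = new Add(node, number());
  }
  return node;
}

// Stage 3: ask the root node to compile itself.
function compile(code) {
  return parse(tokenize(code)).compile() + ';';
}
```

Calling compile('1 +2+ 3') walks through all three stages: the lexer produces five tokens, the parser nests them into Add nodes, and the root node emits the final string.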

The parser is generated by Jison
(GitHub: zaach / jison, npm: jison, License: MIT X), a JavaScript parser
generator by Zach Carter. Jison is modelled on Bison, which generates parsers from
annotated context-free grammars.

Why not just use Bison? Well, Jison provides several things that make it
easier for those fluent in JavaScript to work with. Consider this
example from Zach's documentation:

var Parser = require('jison').Parser;
var grammar = {
  "lex": {
    "rules": [
     ["\\s+", "/* skip whitespace */"],
     ["[a-f0-9]+", "return 'HEX';"]
    ]
  },

  "bnf": {
    "hex_strings" :[ "hex_strings HEX", "HEX" ]
  }
};

var parser = new Parser(grammar);

// generate source, ready to be written to disk
var parserSource = parser.generate();

// you can also use the parser directly from memory

// returns true
parser.parse("adfe34bc e82a");

This example builds a parser that checks whether the input consists of
one or more whitespace-separated hex strings. BNF refers to Backus-Naur
Form, a notation for describing context-free grammars, which makes it
ideal for describing the syntax of the input language. Notice that the
grammar is written as a JSON-style JavaScript object, so it's almost
trivial for us to read and write.
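
To see what that grammar describes without involving Jison at all, here's a hand-written recognizer for the same two lexer rules and the hex_strings rule. It's a sketch for comparison only: recognizeHexStrings is a made-up name, it returns false where a real parser would raise an error, and a generated parser is table-driven rather than hand-coded like this.

```javascript
function recognizeHexStrings(input) {
  const whitespace = /\s+/y;  // lexer rule 1: skip whitespace
  const hex = /[a-f0-9]+/y;   // lexer rule 2: emit a HEX token
  let pos = 0;
  let count = 0;              // number of HEX tokens seen so far

  while (pos < input.length) {
    whitespace.lastIndex = pos;
    if (whitespace.test(input)) {
      pos = whitespace.lastIndex;
      continue;
    }
    hex.lastIndex = pos;
    if (!hex.test(input)) return false; // no lexer rule matches here
    pos = hex.lastIndex;
    count += 1;
  }

  // hex_strings: hex_strings HEX | HEX, i.e. one or more HEX tokens
  return count >= 1;
}
```

The grammar version says the same thing in two lines of rules, which is the main appeal of a parser generator once the language gets bigger than this.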

CoffeeScript's own grammar is documented in
docs/grammar.html. The grammar.coffee file is the real core of the project.
Once the grammar has been described, Jison can generate a parser from it.

Grammar Examples

The grammar file provides a top-down view of how CoffeeScript works
internally. One of the interesting features of CoffeeScript is the
function arrow, which reads like this:

square = (x) -> x * x

race = (winner, runners...) ->
  print winner, runners
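
For comparison, here's roughly what those two functions mean in plain JavaScript. This is a hand-written equivalent using modern rest parameters, not the compiler's literal output, and print is defined here only so the sketch runs on its own.

```javascript
// Stand-in for the `print` used in the CoffeeScript snippet (an assumption).
function print(...args) {
  console.log(...args);
  return args;
}

// square = (x) -> x * x
// The last expression of a CoffeeScript function is its return value.
var square = function (x) {
  return x * x;
};

// race = (winner, runners...) -> print winner, runners
// The splat collects the remaining arguments into an array.
var race = function (winner, ...runners) {
  return print(winner, runners);
};
```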

It's defined in the grammar like this:

  # The **Code** node is the function literal. It's defined by an indented block
  # of **Block** preceded by a function arrow, with an optional parameter
  # list.
  Code: [
    o 'PARAM_START ParamList PARAM_END FuncGlyph Block', -> new Code $2, $5, $4
    o 'FuncGlyph Block',                        -> new Code [], $2, $1
  ]

  # CoffeeScript has two different symbols for functions. `->` is for ordinary
  # functions, and `=>` is for functions bound to the current value of *this*.
  FuncGlyph: [
    o '->',                                     -> 'func'
    o '=>',                                     -> 'boundfunc'
  ]

  # An optional, trailing comma.
  OptComma: [
    o ''
    o ','
  ]

  # The list of parameters that a function accepts can be of any length.
  ParamList: [
    o '',                                       -> []
    o 'Param',                                  -> [$1]
    o 'ParamList , Param',                      -> $1.concat $3
  ]

  # A single parameter in a function definition can be ordinary, or a splat
  # that hoovers up the remaining arguments.
  Param: [
    o 'ParamVar',                               -> new Param $1
    o 'ParamVar ...',                           -> new Param $1, null, on
    o 'ParamVar = Expression',                  -> new Param $1, $3
  ]

  ParamVar: [
    o 'Identifier'
    o 'ThisProperty'
    o 'Array'
    o 'Object'
  ]
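
Each rule pairs a pattern with an action, and $1, $2 and so on refer to the values matched at each position in the pattern. As a rough illustration of how the left-recursive ParamList rule accumulates an array, the sketch below applies the same three actions by hand. reduceParamList is a hypothetical helper; in the real thing the reductions are driven by the generated parser.

```javascript
// The three ParamList actions from the grammar, written as plain functions.
const actions = {
  empty: () => [],                        // o ''
  single: ($1) => [$1],                   // o 'Param'
  append: ($1, $2, $3) => $1.concat($3),  // o 'ParamList , Param'
};

// Hypothetical driver: folds already-parsed params the way a left-recursive
// rule would, one `ParamList , Param` step at a time.
function reduceParamList(params) {
  if (params.length === 0) return actions.empty();
  let list = actions.single(params[0]);
  for (const param of params.slice(1)) {
    list = actions.append(list, ',', param);
  }
  return list;
}
```

So for (winner, runners...) the list starts life as [winner] and grows by one Param per comma.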

These grammar actions are used to generate nodes, found in
src/nodes.coffee:

# A function definition. This is the only node that creates a new Scope.
# When for the purposes of walking the contents of a function body, the Code
# has no *children* -- they're within the inner scope.
exports.Code = class Code extends Base
  constructor: (params, body, tag) ->
    @params  = params or []
    @body    = body or new Block
    @bound   = tag is 'boundfunc'
    @context = 'this' if @bound

All nodes in the syntax tree descend from a class called
Base. The most important method in this class is
compile:

compile: (o, lvl) ->
  o        = extend {}, o
  o.level  = lvl if lvl
  node     = @unfoldSoak(o) or this
  node.tab = o.indent
  if o.level is LEVEL_TOP or not node.isStatement(o)
    node.compileNode o
  else
    node.compileClosure o

Each subclass implements compileNode, which compiles that kind of
node to JavaScript. For the function literal example, this excerpt from
Code's compileNode gets run (vars is assembled earlier in the same method):

code  = 'function'
code  += ' ' + @name if @ctor
code  += '(' + vars.join(', ') + ') {'
code  += "\n#{ @body.compileWithDeclarations o }\n#{@tab}" unless @body.isEmpty()
code  += '}'

This should look familiar!
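
The relationship between compile and compileNode is essentially a template method: the base class handles the shared bookkeeping, and each node type only supplies its own code generation. Here's a heavily simplified model of that idea. The Raw and Func classes and the way indent is handled are invented for illustration and don't mirror the real classes in nodes.coffee.

```javascript
class Base {
  // Shared entry point: copy the options so callers never see mutations,
  // record the indentation, then delegate to the subclass.
  compile(options = {}) {
    const o = Object.assign({ indent: '' }, options);
    this.tab = o.indent;
    return this.compileNode(o);
  }

  compileNode() {
    throw new Error('subclasses must implement compileNode');
  }
}

// A chunk of JavaScript emitted verbatim at the current indentation.
class Raw extends Base {
  constructor(text) { super(); this.text = text; }
  compileNode() { return this.tab + this.text; }
}

// A function literal with parameter names and a list of body nodes.
class Func extends Base {
  constructor(params, body) { super(); this.params = params; this.body = body; }
  compileNode(o) {
    let code = 'function(' + this.params.join(', ') + ') {';
    if (this.body.length > 0) {
      // Children compile one indentation level deeper.
      const inner = Object.assign({}, o, { indent: o.indent + '  ' });
      const lines = this.body.map((node) => node.compile(inner));
      code += '\n' + lines.join('\n') + '\n' + this.tab;
    }
    return code + '}';
  }
}
```

Compiling new Func(['x'], [new Raw('return x * x;')]) produces a three-line function expression, built up with the same string concatenation style as the excerpt above.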

Conclusion

Back in the Let's Make a Framework series, I created a mini CSS parser
which included a lexer. I was reminded of this here, and wondered if
using Jison for CSS parsing might be an interesting exercise.

The basic process at work in CoffeeScript is very simple, although it
can look daunting at first:

  • The lexer uses regular expressions to generate tokens that can be fed into a parser
  • The parser is generated from a Jison grammar
  • The parser then builds a tree of nodes from the tokens
  • Each node compiles itself to JavaScript through compileNode

If you wanted to build your own JavaScript-powered language using Jison,
you'd need to implement your own lexer, Jison grammar, and something to
interpret the output of the Jison parser.
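
As a starting point for the first of those pieces, a regex-driven lexer can be sketched in a few lines. The token names and rules here are invented for a toy language, and lex is a made-up function name. It's far simpler than CoffeeScript's lexer, which also has to track indentation.

```javascript
// Each rule pairs a token type with a sticky regex; rules are tried in order,
// so ARROW must come before OPERATOR or "->" would lex as "-" then fail on ">".
const RULES = [
  ['WHITESPACE', /\s+/y],
  ['NUMBER', /\d+(?:\.\d+)?/y],
  ['IDENTIFIER', /[A-Za-z_$][\w$]*/y],
  ['ARROW', /->/y],
  ['OPERATOR', /[-+*\/=(),]/y],
];

function lex(source) {
  const tokens = [];
  let pos = 0;
  while (pos < source.length) {
    let matched = false;
    for (const [type, regex] of RULES) {
      regex.lastIndex = pos;          // sticky: match must start exactly here
      const match = regex.exec(source);
      if (match === null) continue;
      if (type !== 'WHITESPACE') tokens.push([type, match[0]]);
      pos = regex.lastIndex;
      matched = true;
      break;
    }
    if (!matched) {
      throw new SyntaxError('Unexpected character "' + source[pos] + '" at ' + pos);
    }
  }
  return tokens;
}
```

The output is an array of [type, text] pairs, which is the kind of flat token stream a parser can then consume.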

If you want to dig further into the CoffeeScript source, look at the
CoffeeScript annotated source in the menu on the
homepage. As a whole the project has extremely readable comments.

Next week I'll have a look at Jade so we can see what a parser written
in JavaScript without Jison looks like.