Let's Make a Framework: DOM Manipulation

10 Mar 2011 | By Alex Young | Tags frameworks tutorials lmaf documentation dom

Welcome to part 53 of Let’s Make a Framework, the ongoing series about building a JavaScript framework.

If you haven’t been following along, these articles are tagged with lmaf. The project we’re creating is called Turing. Documentation is available at turingjs.com.

The Style and innerHTML Properties

After years of blindly manipulating the style and innerHTML properties, I noticed more modern frameworks advocate against this. If you think back to when I wrote about our animation module, you’ll remember that working with style attributes can be less than user friendly — it required a decent amount of helper methods just to do things like work with colours. That’s partially why frameworks like jQuery provide a .css() method — to provide a consistent interface.

It’s slightly harder to appreciate why working with innerHTML is bad. It’s fast, cross-browser, and easy to use. What’s not to like? Well, it’s a proprietary property. When XHTML was all the rage, using innerHTML caused problems when documents were served with an XML mime-type. It’s also inconsistently implemented; IE can treat it as read-only on tables, and there are other IE-related problems too.

Hopefully I’ve convinced you why methods like jQuery’s .css and .html are a good idea. But how do they work?

jQuery’s .html() Implementation

You’re probably wondering how exactly .html() works. After all, isn’t it just going to defer to innerHTML at some point?

The basic usage is with a string that contains HTML:

$('div.demo-container')
  .html('<p>All new content.</p>');

It can also accept a function, but let’s just consider the string case to keep things focused.

The implementation is in manipulation.js. Basically, html will use innerHTML if possible:

if ( typeof value === "string" && !rnocache.test( value ) &&
    ( jQuery.support.leadingWhitespace || !rleadingWhitespace.test( value )) &&
      !wrapMap[ (rtagName.exec( value ) || ["", ""])[1].toLowerCase() ] ) {

value = value.replace(rxhtmlTag, "<$1></$2>");

try {
  for ( var i = 0, l = this.length; i < l; i++ ) {
    // Remove element nodes and prevent memory leaks
    if ( this[i].nodeType === 1 ) {
      jQuery.cleanData( this[i].getElementsByTagName("*") );
      this[i].innerHTML = value;
    }
  }
} catch(e) {
  // If using innerHTML throws an exception, use the fallback method
  this.empty().append( value );
}

The first two lines are the most confusing part of this code. Let’s look at each part in sequence:

  • jQuery.support.leadingWhitespace is set to true in browsers that preserve whitespace when inserting content with innerHTML
  • rleadingWhitespace.test(value) checks to see if the HTML fragment has leading whitespace
  • wrapMap is an object that in this case helps look for tags that can’t be inserted normally
  • rxhtmlTag is a regex that expands self-closing tags
  • A loop removes each existing Node to prevent memory leaks
  • If innerHTML raises an exception, fall back to append

If the value is a string, and the whitespace/wrapMap expressions return true, then inserting with innerHTML might be possible. Else use append.

wrapMap

I’ll need to explain append separately, but first let’s look at wrapMap:

wrapMap = {
  option: [ 1, "<select multiple='multiple'>", "</select>" ],
  legend: [ 1, "<fieldset>", "</fieldset>" ],
  thead: [ 1, "<table>", "</table>" ],
  tr: [ 2, "<table><tbody>", "</tbody></table>" ],
  td: [ 3, "<table><tbody><tr>", "</tr></tbody></table>" ],
  col: [ 2, "<table><tbody></tbody><colgroup>", "</colgroup></table>" ],
  area: [ 1, "<map>", "</map>" ],
  _default: [ 0, "", "" ]
};

wrapMap.optgroup = wrapMap.option;
wrapMap.tbody = wrapMap.tfoot = wrapMap.colgroup = wrapMap.caption = wrapMap.thead;
wrapMap.th = wrapMap.td;

// IE can't serialize <link> and <script> tags normally
if ( !jQuery.support.htmlSerialize ) {
  wrapMap._default = [ 1, "div<div>", "</div>" ];
}

Running it returns responses like this:

'<tr><td>Example</td></tr>'
> !wrapMap[ (rtagName.exec( value ) || ["", ""])[1].toLowerCase() ] 
false

'<div><p>Example content</p></div>'
> !wrapMap[ (rtagName.exec( value ) || ["", ""])[1].toLowerCase() ] 
true

append

In jQuery, append and many other methods rely on domManip. This method accepts a list of elements to create and insert, a confusing table argument, and a callback. The callback is used to actually manipulate the DOM. In the case of append it looks like this:

function( elem ) {
  if (this.nodeType === 1) {
    this.appendChild(elem);
  }
}

The nodeType is checked to ensure it’s an element node, then appendChild is used to insert the content.

domManip

The append method is simple because domManip does the real work. Let’s take a high-level look (I’ve added some extra comments):

domManip: function( args, table, callback ) {
  var results, first, fragment, parent,
    value = args[0],
    scripts = [];

  // We can't cloneNode fragments that contain checked, in WebKit
  if ( !jQuery.support.checkClone && arguments.length === 3 && typeof value === "string" && rchecked.test( value ) ) {
    // run domManip on each element, but parse the element with jQuery() first
  }

  // If there's already an element
  if ( this[0] ) {
    parent = value && value.parentNode;

    // If we're in a fragment, just use that instead of building a new one
    if ( jQuery.support.parentNode && parent && parent.nodeType === 11 && parent.childNodes.length === this.length ) {
      results = { fragment: parent };

    } else {
      results = jQuery.buildFragment( args, this, scripts );
    }

    fragment = results.fragment;

    if ( fragment.childNodes.length === 1 ) {
      first = fragment = fragment.firstChild;
    } else {
      first = fragment.firstChild;
    }

    if ( first ) {
      table = table && jQuery.nodeName( first, "tr" );

      // Call the callback with each element
      for ( var i = 0, l = this.length, lastIndex = l - 1; i < l; i++ ) {
        callback.call(
          table ?
            root(this[i], first) :
            this[i],
          // Make sure that we do not leak memory by inadvertently discarding
          // the original fragment (which might have attached data) instead of
          // using it; in addition, use the original fragment object for the last
          // item instead of first because it can end up being emptied incorrectly
          // in certain situations (Bug #8070).
          // Fragments from the fragment cache must always be cloned and never used
          // in place.
          results.cacheable || (l > 1 && i < lastIndex) ?
            jQuery.clone( fragment, true, true ) :
            fragment
        );
      }
    }

    if ( scripts.length ) {
      jQuery.each( scripts, evalScript );
    }
  }

  return this;
}

As you might have noticed, jQuery.buildFragment seems to be doing something important here. The reality is that buildFragment manages caching and hands off the real work to jQuery.clean.

jQuery.clean

Bored yet? We’re nearly at the best part!

The middle of jQuery.clean has the magic we’ve been searching for:

if ( typeof elem === "string" && !rhtml.test( elem ) ) {
  elem = context.createTextNode( elem );

} else if ( typeof elem === "string" ) {
  // Fix "XHTML"-style tags in all browsers
  elem = elem.replace(rxhtmlTag, "<$1></$2>");

  // Trim whitespace, otherwise indexOf won't work as expected
  var tag = (rtagName.exec( elem ) || ["", ""])[1].toLowerCase(),
    wrap = wrapMap[ tag ] || wrapMap._default,
    depth = wrap[0],
    div = context.createElement("div");

  // Go to html and back, then peel off extra wrappers
  div.innerHTML = wrap[1] + elem + wrap[2];

  // Move to the right depth
  while ( depth-- ) {
    div = div.lastChild;
  }

  // Remove IE's autoinserted <tbody> from table fragments
  if ( !jQuery.support.tbody ) {

If the element is a string and doesn’t have any tags, it’s a text node. Otherwise, expand self-closing tags, trim whitespace, create a shim div to extract some delicious DOM nodes, then handle IE’s table weirdness.

Conclusion

Explaining how jQuery implements html demonstrates just how much work is required to provide a consistent API for accessing innerHTML. However, implementing this stack of functionality makes many interesting DOM manipulation possible, beyond append.

Incidentally, if you want to see code that does this without dealing with as many browser headaches, try looking at Zepto’s source. Zepto only targets WebKit, which means it’s a great way to learn the fundamental techniques without worrying about legacy IE issues.

Next week I’ll explain how css works.

References


blog comments powered by Disqus