Essential JavaScript

Parsing and DOM-Tree Building With JavaScript


Remember the first time you heard about the DOM? Do you still get this shiver running down your back at the mere mention of it?

You may call it a revolution, especially compared to the former pseudo-standard interfaces. However, as important as the DOM is, there are utilities missing in the standard implementation that we would like to have.

For example, for many years the only way to insert dynamic content into your documents was to use the document's write() or writeln() method. It's still there and still has the old disadvantage: calling it after the document has been closed (that is, after it has finished loading) wipes out the existing document. So what we need is a standards-compliant method to output any text, interspersed with HTML markup, at any position of a document.

I'm going to shy away from using proprietary extensions to solve this problem. Instead, I'll create a simple parser and talk about the process of building a DOM tree from a plain string. In case you're totally new to the DOM, you may want to read the last Essential JavaScript column, Document Mathematics: Count Your Words, for a brief introduction. Also, before we start, I want to spend a few minutes explaining why I think taking the standards approach is so important.

Standards: to be or not to be?

I have a philosophical question: Is it best to enhance your pages with a tempting proprietary extension (remember those weird filters in IE) or stick with the standards? On one hand, proprietary extensions are convenient and do offer handy functionality for those in your audience who have the right configuration to use them. On the other hand, standards open your pages to the greatest possible audience, but with the drawback of possibly limited functionality and probably more work.

On one level, this question is much more than pure philosophy. You probably wouldn't even be able to read this article without many of the critical standards that serve as the foundation for the Web. Viewed from this perspective, it's hard to argue against standards-compliant authoring.

Astonishingly enough, however, many developers have wandered away from standards-compliant authoring. Possibly they've been tempted by the ease of using proprietary extensions that save time and effort. Yet by using proprietary extensions, those developers are sending all the efforts of the people who created standardized layers of communication and transport to the /dev/null device. That's because proprietary presentation on a higher level blocks the platform-independent communication that those lower layers make possible.

From my point of view, the solution is simple: Just stick to the standards and use them whenever possible. That way, you're ready for new browsers or other applications that support the standards you're already using.

OK, that's enough of that for now -- let's get technical again!

Creating elements

To interact with the DOM, you have to think in terms of nodes and elements. A string containing markup can't be inserted into the tree without transforming it. Even a pure plain-text string can't be inserted into the tree without creating a text node first. To create a text node you use a method of the document object:

var text_node = document.createTextNode(a_string);

Now you have a text node object. It implements the Text interface, which inherits from CharacterData, which in turn inherits from Node; all three are defined in the DOM Level 1 spec.
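As a minimal sketch of this step, the helper below wraps a plain string in a text node and appends it to a parent node. The name appendText and the doc parameter are assumptions made for illustration, not part of the DOM; in a browser you would pass the global document. Note that any markup in the string is not parsed; it ends up in the page as literal text.

```javascript
// Hypothetical helper (not part of the DOM): wrap a plain string in a
// text node and append it to a parent node. `doc` is any DOM Document.
function appendText(doc, parent, a_string) {
  var text_node = doc.createTextNode(a_string);
  parent.appendChild(text_node);
  return text_node;
}

// Usage in a browser, once the document body exists:
if (typeof document !== "undefined" && document.body) {
  var node = appendText(document, document.body, "Hello, <b>DOM</b>!");
  // nodeType 3 means TEXT_NODE; `data` holds the string exactly as given,
  // so the <b> tags are displayed as characters rather than as markup.
  console.log(node.nodeType, node.data);
}
```

Passing the document in as a parameter, rather than reaching for the global, keeps the helper usable with any document object, such as one belonging to another frame.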

Creating an HTML element is just as easy:

var html_element = document.createElement(element_name);

When working with HTML documents, the type of the returned object depends on the given element name. The DOM Level 1 HTML specification defines special objects for all HTML elements. These objects were designed to address two requirements: backwards compatibility with older, non-standard DOM implementations, and a more convenient view of HTML elements, since you can access the properties directly on the objects rather than dealing with special attribute nodes.
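The difference between the two views can be sketched as follows. The helper name makeLink and its parameters are hypothetical and chosen only for illustration. It sets one attribute through the HTML-specific property on the element object and another through the generic setAttribute() method from the DOM Level 1 Core.

```javascript
// Hypothetical helper: build a link element, showing both ways of
// setting attributes. `doc` is any DOM Document for an HTML page.
function makeLink(doc, url, label) {
  var a = doc.createElement("a");

  // HTML-specific view: the attribute is a plain property on the object.
  a.href = url;

  // Generic Core view: the attribute is set by name.
  a.setAttribute("title", label);

  // The visible link text still has to be a text node.
  a.appendChild(doc.createTextNode(label));
  return a;
}

// Usage in a browser, once the document body exists:
if (typeof document !== "undefined" && document.body) {
  document.body.appendChild(
    makeLink(document, "http://www.w3.org/DOM/", "W3C DOM")
  );
}
```

Both styles end up setting an attribute on the same element; the property form is simply shorter to read and write when you know you are dealing with an HTML document.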
