XHTML: The Clean Code Solution

by Peter Wiggin

XML continues to be a hot topic among web developers. Why? Because it delivers a standardized markup that separates display and layout code from content and structure, making the creation, maintenance, and parsing of documents much easier for all involved.

But that's just one example of how a strict, standardized markup language can make programming easier. As portable web-enabled devices multiply, it is becoming clear that they require only small subsets of the bloated HTML code we are sending to desktop browsers, and multiple output formats are exactly what XML and standardized markup languages were designed for. Getting to that point, however, will require some work.

Whether your site has 10 pages or 10,000, it's likely that the HTML code is a mix of standard HTML and browser-specific, proprietary markup. If you've been thinking about making the transition to XML, or even just standardizing your HTML code, the W3C, the Web's standards body, has provided the solution: XHTML (Extensible Hypertext Markup Language), the latest version of HTML.

XML + HTML = XHTML (sort of)

Let's take a quick look at how these markup languages fit together.

  • HTML is a markup language described in SGML (Standard Generalized Markup Language).
  • XML is a restricted form of SGML, removing many of SGML's more complex features, but preserving most of SGML's power and commonly used features.
  • XHTML is the reformulation of HTML 4.0 as an application of XML. It is the W3C's new version of HTML.

The W3C (World Wide Web Consortium) has taken the logical step of expressing the HTML 4.0 standard in XML instead of using the more complicated SGML.

The minute details aren't important for the average web coder; the main difference is found in the document type definitions (DTDs) used by HTML and XHTML. A DTD, according to the W3C, is "a collection of declarations that, as a collection, defines the legal structure, elements, and attributes that are available for use in a document that complies to the DTD."

In other words, it's a definition of what is legal syntax in XHTML and what isn't. The DTD for XHTML is more restrictive than the DTD for HTML because XML is more restrictive than SGML.
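To make the stricter syntax concrete, here is a minimal sketch in Python (our choice of language for illustration, not something the W3C prescribes). It uses the standard library's non-validating XML parser, so it checks only whether markup is well-formed XML, which is a weaker test than validating against the XHTML DTD. The helper name is_well_formed and the sample strings are our own.

```python
import xml.etree.ElementTree as ET


def is_well_formed(markup: str) -> bool:
    """Return True if the markup parses as XML (hypothetical helper)."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


# Typical HTML habits: an unclosed <br> and a missing </p>.
html_style = "<p>First line<br>Second line"

# The same content with every element closed, as XML requires.
xhtml_style = "<p>First line<br />Second line</p>"

# XML is case-sensitive, so <P> and </p> do not match.
uppercase_mismatch = "<P>text</p>"

for sample in (html_style, xhtml_style, uppercase_mismatch):
    print(repr(sample), is_well_formed(sample))
```

Only the second sample passes the check. The first and third are the kind of markup that HTML browsers tolerate but an XML parser rejects.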

The W3C gives two main reasons for recommending XHTML as the next step from HTML 4.0. First, XHTML, since it's an XML application, is designed to be extensible -- that's the "X" in all the acronyms. This means that new tags (or "elements," in the official W3C jargon) can be added without altering the entire DTD that the document is based on. Granted, if you add tags that aren't in the DTD, the document won't validate, but if you keep it well-formed, it will still parse.
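The difference between validating and parsing can be sketched in a few lines of Python. The example below is an illustration under our own assumptions: the toast element and its level attribute are invented, and the parser used (the standard library's ElementTree) never consults a DTD, so it shows only that a well-formed document with an unknown element still parses.

```python
import xml.etree.ElementTree as ET

# A well-formed XHTML-style document with one element, <toast>,
# that is not defined in any XHTML DTD (invented for this sketch).
doc = """<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>Extended document</title></head>
  <body>
    <p>Ordinary paragraph.</p>
    <toast level="dark">Not in the DTD, but well-formed.</toast>
  </body>
</html>"""

# A non-validating parser accepts the document because every
# element is properly nested and closed.
root = ET.fromstring(doc)

# ElementTree reports tags as "{namespace}name"; keep the local name only.
local_names = [element.tag.split("}", 1)[1] for element in root.iter()]
print(local_names)
```

A validating parser working from the XHTML 1.0 DTD would reject the same document, which is the trade-off described above.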

Work is already underway on XHTML 1.1, which is designed to accommodate extensions through existing XHTML modules and techniques for developing new modules. These modules will permit the combination of existing and new feature sets when developing content and when designing new client software, so developers can choose among subsets of XHTML, and don't need to support the entire language.

The second reason is a follow-on to that: XHTML is designed for portability. Desktop web browsers have become behemoths of code bloat. You name it, there's code in the newest browsers to do it. But according to some estimates cited by the W3C, by 2002, 75 percent of web document viewing will be through non-desktop devices like palm computers, televisions, toasters, and other alternative platforms, not through browsers on PCs. Your web-enabled toaster won't need or want to accommodate the same subset of XHTML that your PC browser does. Through a new client and document profiling mechanism, servers, proxies, and clients will be able to perform content transformation so that eventually it will be possible to develop XHTML-conforming content that is usable by any XHTML-conforming client. The server, the client, or a proxy service will decide on the subset of XHTML that is received.

But much of that is still down the road, in a number of XHTML 1.1 draft specifications that are still being written. The only spec that the W3C has made a recommendation so far is XHTML 1.0.

Some more specific reasons for moving to XHTML 1.0 stem from the fact that XHTML documents conform to XML. This means that they are readily viewed, edited, and validated with standard XML tools, and that they can use applications (such as scripts and applets) that rely upon either the HTML Document Object Model or the XML Document Object Model (DOM).
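As a rough illustration of what "standard XML tools" buys you, the following Python sketch reads and edits an XHTML page with a general-purpose XML library that knows nothing about HTML. The page content, the link paths, and the example.com prefix are made up for the example.

```python
import xml.etree.ElementTree as ET

# Prefix map for the XHTML namespace; "x" is an arbitrary label.
NS = {"x": "http://www.w3.org/1999/xhtml"}

page = """<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>Links</title></head>
  <body>
    <p><a href="/svg.html">SVG</a> and <a href="/xul.html">XUL</a></p>
  </body>
</html>"""

root = ET.fromstring(page)

# View: collect every link target with an ordinary XML path query.
links = root.findall(".//x:a", NS)
hrefs = [a.get("href") for a in links]

# Edit: turn the relative links into absolute ones.
for a in links:
    a.set("href", "http://example.com" + a.get("href"))

edited = [a.get("href") for a in root.findall(".//x:a", NS)]
print(hrefs)
print(edited)
```

Nothing here is specific to XHTML. The same calls would work on any well-formed XML document, which is the point of the reformulation.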

To help you see the main differences between HTML and XHTML, we've included a number of examples in the following section, "Differences." You'll see that most of the variances are simply stricter definitions of common HTML tags.

There are, however, some new features, which we cover in the third section, "What's New."
