ONJava.com -- The Independent Source for Enterprise Java

Using DOM to Traverse XML

by Stephanie Fesler
02/08/2001

Introduction

XML is now used frequently to model business data in large-scale applications. A common Java application task is to parse XML to retrieve its data. The Document Object Model (DOM) defines a set of interfaces for navigating and manipulating the content and structure of XML and HTML documents.

Objective

After reading this article, you will be able to create a representation of an XML document in your Java program and traverse that representation in two ways: as a flat (horizontal) sequence of nodes, and as a tree (hierarchical) structure.

The Document Object Model (DOM) in Detail

The DOM defines interfaces that allow programmers to navigate XML and HTML documents and to manipulate their content and structure. The DOM is a specification of interfaces, not an implementation; each vendor supplies its own implementation. Sun Microsystems provides DOM support in its Java API for XML Processing (JAXP). Other vendors that provide support include IBM, Oracle, and the Apache Software Foundation.
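The split between interface and implementation can be seen in a few lines of code. The following is a minimal sketch, assuming the JAXP classes bundled with a current JDK; the class name DomInterfacesSketch and its helper methods are hypothetical and exist only for illustration. The code is written purely against the org.w3c.dom interfaces, while the concrete class behind them comes from whichever parser JAXP finds at run time.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical illustration class; not part of the original article.
class DomInterfacesSketch {

    // Parses XML text with whichever DOM implementation JAXP locates at run time.
    static Document parse(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        DocumentBuilder builder = factory.newDocumentBuilder();
        return builder.parse(new InputSource(new StringReader(xml)));
    }

    // Reads the root element's tag name through the standard DOM interfaces only.
    static String rootName(String xml) throws Exception {
        return parse(xml).getDocumentElement().getTagName();
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse("<bank><branchID>78963</branchID></bank>");
        System.out.println("Root element: " + doc.getDocumentElement().getTagName());
        // The concrete class belongs to the vendor, so its name varies by parser.
        System.out.println("Implementation class: " + doc.getClass().getName());
    }
}
```

Swapping in a different parser changes the implementation class that is printed, but none of the code that uses Document needs to change.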

The DOM has several levels. Level 0 is not a formal specification; the term refers to functionality equivalent to that found in Netscape Navigator 3.0 and Internet Explorer 3.0. The DOM Level 1 Recommendation, released on October 1, 1998, provides functionality to navigate and manipulate the structure and content of XML and HTML documents. DOM Level 2 is a set of specifications that add to the functionality defined in DOM Level 1. The following list describes the different specifications that make up DOM Level 2.

  • DOM Level 2 Core Recommendation further defines ways to navigate and manipulate the content and structure of XML and HTML documents. This recommendation is built on the DOM Level 1 Recommendation, filling in some of the gaps.

  • DOM Level 2 Views Recommendation specifies an interface to provide programmers the functionality to view alternate presentations of the XML or HTML document. The interfaces defined in this recommendation are optional, but if a vendor chooses to implement them, they must also implement the DOM Level 2 Core Recommendation.

  • DOM Level 2 Style Recommendation specifies an interface to provide programmers the ability to dynamically access and manipulate style sheets. The interfaces defined in this recommendation are optional, but if a vendor chooses to implement the interfaces, they must also implement the DOM Level 2 Core Recommendation.

  • DOM Level 2 Events Recommendation specifies a set of interfaces to provide a generic event system to programmers. The interfaces defined in this recommendation are optional, but if a vendor chooses to implement them, they must also provide support for the DOM Level 2 Core Recommendation.

  • DOM Level 2 Traversal-Range Recommendation specifies a set of interfaces that allow programmers to traverse a representation of the XML document. There is also a set of interfaces defined to manipulate ranges of the XML document. The interfaces defined in this recommendation are optional, but if a vendor chooses to implement them, they must also implement the DOM Level 2 Core Recommendation.

  • DOM Level 2 HTML Working Draft provides a set of interfaces that allow programmers to work on HTML documents. The interfaces defined in this working draft are optional, but if a vendor chooses to implement them, they must also implement the DOM Level 2 Core Recommendation.
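Because most of these modules are optional, a program can ask the implementation which ones it supports before relying on them. The sketch below assumes the parser bundled with the JDK and uses the standard DOMImplementation.hasFeature method with the feature names from the DOM Level 2 specifications; the class name DomFeatureProbe and its helpers are hypothetical. The answers depend on the parser in use, so no particular output is assumed here.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.DOMImplementation;

// Hypothetical illustration class; not part of the original article.
class DomFeatureProbe {

    // Feature names as defined by the DOM Level 2 specifications.
    static final String[] MODULES = {
        "Core", "XML", "Views", "StyleSheets", "CSS",
        "Events", "Traversal", "Range", "HTML"
    };

    // Obtains the DOMImplementation of whichever parser JAXP selects.
    static DOMImplementation implementation() throws Exception {
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .getDOMImplementation();
    }

    // Asks whether the given module is supported at Level 2.
    static boolean supports(String module) throws Exception {
        return implementation().hasFeature(module, "2.0");
    }

    public static void main(String[] args) throws Exception {
        DOMImplementation impl = implementation();
        for (String module : MODULES) {
            System.out.println(module + " 2.0 supported: " + impl.hasFeature(module, "2.0"));
        }
    }
}
```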

Getting Started

This article explores the traversal of DOM representations of XML documents from within Java applications. The Apache Software Foundation has implemented the optional interfaces of the DOM Level 2 Traversal-Range Recommendation in its Xerces project. You can download the Xerces JAR, which contains the files you'll need. Xerces also supports the optional interfaces defined in the DOM Level 2 Events Recommendation. Make sure to place the xerces.jar file in your system CLASSPATH so that both the Java compiler and the Java runtime can locate the appropriate classes.
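A quick way to confirm that the parser on your CLASSPATH provides the traversal interfaces is to test whether its Document object implements DocumentTraversal. The following is a rough sketch rather than a definitive setup check: it assumes the parser that JAXP selects (on a current JDK, the bundled Xerces-derived parser rather than a separately downloaded xerces.jar), and the class name TraversalCheck is hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.traversal.DocumentTraversal;
import org.w3c.dom.traversal.NodeFilter;
import org.w3c.dom.traversal.NodeIterator;
import org.xml.sax.InputSource;

// Hypothetical illustration class; not part of the original article.
class TraversalCheck {

    // Returns the element names in document order, separated by spaces,
    // or null if the parser's Document does not support DOM Level 2 Traversal.
    static String elementNames(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        if (!(doc instanceof DocumentTraversal)) {
            return null;
        }
        // Visit only element nodes, starting at the root element.
        NodeIterator iterator = ((DocumentTraversal) doc).createNodeIterator(
                doc.getDocumentElement(), NodeFilter.SHOW_ELEMENT, null, true);
        StringBuilder names = new StringBuilder();
        for (Node node = iterator.nextNode(); node != null; node = iterator.nextNode()) {
            if (names.length() > 0) {
                names.append(' ');
            }
            names.append(node.getNodeName());
        }
        iterator.detach(); // release the iterator once finished
        return names.toString();
    }

    public static void main(String[] args) throws Exception {
        String names = elementNames(
                "<bank><client><clientName>A</clientName></client><branchID>1</branchID></bank>");
        System.out.println(names == null ? "Traversal not supported" : names);
    }
}
```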

You also need a JDK to compile and run your Java programs. You can download a JDK from Sun Microsystems.

An Example XML Document

To learn to parse and traverse XML you'll need an example XML document. Listing 1 is the DTD for the example XML document. It's a simple representation of a bank. In this example a bank has clients, employees, and a branch identification number.


Listing 1: bank.dtd -- A DTD that defines the parts of a bank.

<!ELEMENT bank (client+, employee+, branchID)>
<!ELEMENT client (clientName, homeAddress, homePhone, account+)>
<!ELEMENT branchID (#PCDATA)>
<!ELEMENT clientName (#PCDATA)>
<!ELEMENT homeAddress (#PCDATA)>
<!ELEMENT homePhone (#PCDATA)>
<!ELEMENT account (type, accountNumber)>
<!ELEMENT type (#PCDATA)>
<!ELEMENT accountNumber (#PCDATA)>
<!ELEMENT employee (empID, empName, workAddress, workPhone, salary)>
<!ELEMENT empID (#PCDATA)>
<!ELEMENT empName (#PCDATA)>
<!ELEMENT workAddress (#PCDATA)>
<!ELEMENT workPhone (#PCDATA)>
<!ELEMENT salary (#PCDATA)>

Listing 2, an instance of our DTD, represents a bank with two clients and two employees.

Listing 2: bank.xml -- A simple XML document representing a view of a bank.

<?xml version="1.0"?>
<!DOCTYPE bank SYSTEM "bank.dtd" >

<bank>
  <client>
    <clientName>Bill Clinton</clientName>
    <homeAddress>Nashua, NH</homeAddress>
    <homePhone>555/555-8975</homePhone>
    <account>
      <type>Checking</type>
      <accountNumber>111222333</accountNumber>
    </account>
    <account>
      <type>Savings</type>
      <accountNumber>777888999</accountNumber>
    </account>
  </client>

  <client>
    <clientName>Al Gore</clientName>
    <homeAddress>Washington, DC</homeAddress>
    <homePhone>555/555-4256</homePhone>
    <account>
      <type>Savings</type>
      <accountNumber>444777888</accountNumber>
    </account>
  </client>

  <employee>
    <empID>2105</empID>
    <empName>Ronald Reagan</empName>
    <workAddress>Nashua, NH</workAddress>
    <workPhone>555/555-1245</workPhone>
    <salary>60000</salary>
  </employee>

  <employee>
    <empID>77</empID>
    <empName>Jimmy Carter</empName>
    <workAddress>Denver, CO</workAddress>
    <workPhone>555/555-1235</workPhone>
    <salary>250000</salary>
  </employee>

  <branchID>78963</branchID>
</bank>
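Before moving on to traversal, it can help to confirm that the document parses and contains what you expect. The sketch below is one way to do that with the standard getElementsByTagName method. To stay self-contained it parses a condensed copy of Listing 2 held in a string, with the DOCTYPE line and most leaf elements omitted; that condensed copy, the class name BankSummary, and its helper methods are assumptions made for this illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical illustration class; not part of the original article.
class BankSummary {

    // Condensed copy of Listing 2 (assumption: DOCTYPE and most leaf elements omitted,
    // so this string is not valid against bank.dtd).
    static final String SAMPLE =
          "<bank>"
        + "<client><clientName>Bill Clinton</clientName>"
        +   "<account><type>Checking</type></account>"
        +   "<account><type>Savings</type></account></client>"
        + "<client><clientName>Al Gore</clientName>"
        +   "<account><type>Savings</type></account></client>"
        + "<employee><empName>Ronald Reagan</empName></employee>"
        + "<employee><empName>Jimmy Carter</empName></employee>"
        + "<branchID>78963</branchID>"
        + "</bank>";

    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    // Counts every element with the given tag name anywhere in the document.
    static int count(String xml, String tag) throws Exception {
        return parse(xml).getElementsByTagName(tag).getLength();
    }

    // Reads the text of the first branchID element via its text-node child.
    static String branchId(String xml) throws Exception {
        return parse(xml).getElementsByTagName("branchID")
                .item(0).getFirstChild().getNodeValue();
    }

    public static void main(String[] args) throws Exception {
        System.out.println("Clients:   " + count(SAMPLE, "client"));
        System.out.println("Employees: " + count(SAMPLE, "employee"));
        System.out.println("Accounts:  " + count(SAMPLE, "account"));
        System.out.println("Branch ID: " + branchId(SAMPLE));
    }
}
```

To run the same checks against the real file, pass a java.io.File for bank.xml to DocumentBuilder.parse instead of the string; bank.dtd then needs to sit in the same directory, because the parser resolves the DOCTYPE reference relative to the document.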
