The Java(TM) Web Services Tutorial

Reading XML Data into a DOM

In this section of the tutorial, you'll construct a Document Object Model (DOM) by reading in an existing XML file. In the following sections, you'll see how to display the XML in a Swing tree component and practice manipulating the DOM.

Note: In the next part of the tutorial, XML Stylesheet Language for Transformations (page 221), you'll see how to write out a DOM as an XML file. (You'll also see how to convert an existing data file into XML with relative ease.)

Creating the Program

The Document Object Model (DOM) provides APIs that let you create nodes, modify them, and delete and rearrange them. So it is relatively easy to create a DOM, as you'll see later in section 5 of this tutorial, Creating and Manipulating a DOM.

Before you try to create a DOM, however, it is helpful to understand how a DOM is structured. This series of exercises will make DOM internals visible by displaying them in a Swing JTree.

Create the Skeleton

Now that you've had a quick overview of how to create a DOM, let's build a simple program to read an XML document into a DOM and then write it back out again.

Note: The code discussed in this section is in DomEcho01.java. The file it operates on is slideSample01.xml. (The browsable version is slideSample01-xml.html.)

Start with the basic logic for an app, and check that an argument has been supplied on the command line:
public class DomEcho {	
   public static void main(String argv[])	
   {	
      if (argv.length != 1) {	
         System.err.println("Usage: java DomEcho filename");
         System.exit(1);	
      }	
   }// main	
}// DomEcho
 
Import the Required Classes

In this section, you're going to see all the classes individually named. That's so you can see where each class comes from when you want to reference the API documentation. In your own apps, you may well want to replace import statements like those below with the shorter form: javax.xml.parsers.*.

Add these lines to import the JAXP APIs you'll be using:
import javax.xml.parsers.DocumentBuilder; 	
import javax.xml.parsers.DocumentBuilderFactory; 	
import javax.xml.parsers.FactoryConfigurationError; 	
import javax.xml.parsers.ParserConfigurationException;
 
Add these lines for the exceptions that can be thrown when the XML document is parsed:
import org.xml.sax.SAXException; 	
import org.xml.sax.SAXParseException;
 
Add these lines to read the sample XML file and identify errors:
import java.io.File;	
import java.io.IOException;
 
Finally, import the W3C definition for a DOM and DOM exceptions:
import org.w3c.dom.Document;	
import org.w3c.dom.DOMException;
 
Note: A DOMException is only thrown when traversing or manipulating a DOM. Errors that occur during parsing are reported using a different mechanism that is covered below.
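
To make the note above concrete, here is a minimal sketch of a manipulation that raises a DOMException. The class name DomExceptionSketch and the element names are made up for illustration and are not part of DomEcho. The sketch builds an empty document in memory, gives it a root element, and then tries to append a second root, which the DOM does not allow.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.DOMException;
import org.w3c.dom.Document;

// Hypothetical class name; not part of the tutorial's DomEcho program.
class DomExceptionSketch {

    // Returns the DOMException code raised by an illegal manipulation,
    // or -1 if no exception was thrown.
    static short appendSecondRoot() throws ParserConfigurationException {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .newDocument();
        doc.appendChild(doc.createElement("slideshow"));
        try {
            // A document may have only one root element, so this is illegal.
            doc.appendChild(doc.createElement("extra"));
            return -1;
        } catch (DOMException e) {
            return e.code;
        }
    }

    public static void main(String[] args) throws ParserConfigurationException {
        short code = appendSecondRoot();
        System.out.println(code == DOMException.HIERARCHY_REQUEST_ERR
                ? "HIERARCHY_REQUEST_ERR"
                : "code " + code);
    }
}
```

Note that no parsing takes place here at all: the exception comes from the manipulation call itself, which is why it is handled separately from the parse-time errors discussed later.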

Declare the DOM

The org.w3c.dom.Document interface is the W3C name for a Document Object Model (DOM). Whether you parse an XML document or create one, a Document instance will result. We'll want to reference that object from another method later on in the tutorial, so define it as a class-level (static) field here:
public class DomEcho	
{ 	
   static Document document;	
	
   public static void main(String argv[])	
   {
 
It needs to be static, because you're going to generate its contents from the main method in a few minutes.

Handle Errors

Next, put in the error-handling logic. This code is very similar to the logic you saw in Handling Errors with the Nonvalidating Parser in the SAX tutorial, so we won't go into it in detail here. The major point worth noting is that a JAXP-conformant document builder is required to report SAX exceptions when it has trouble parsing the XML document. The DOM parser does not have to actually use a SAX parser internally, but since the SAX standard was already there, it seemed to make sense to use it for reporting errors. As a result, the error-handling code for DOM and SAX applications is very similar:
public static void main(String argv[])	
{	
   if (argv.length != 1) {	
      ...	
   }	
	
   try {	
	
   } catch (SAXException sxe) {
      // Error generated during parsing
      Exception x = sxe;
      if (sxe.getException() != null)
         x = sxe.getException();
      x.printStackTrace();

   } catch (ParserConfigurationException pce) {
      // Parser with specified options can't be built
      pce.printStackTrace();

   } catch (IOException ioe) {
      // I/O error
      ioe.printStackTrace();
   }
}// main 
 
The major difference between this code and the SAX error-handling code is that the DOM parser does not throw SAXParseExceptions, but only SAXExceptions.
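
The following sketch shows the same three catch clauses in action against text held in memory rather than a file. It is an illustration under assumptions, not the tutorial's code: the class name ParseErrorSketch and the helper describeFailure are hypothetical, and the input is read through an InputSource wrapped around a StringReader so that the example needs no sample file.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical class name; not part of the tutorial's DomEcho program.
class ParseErrorSketch {

    // Returns null when the text parses cleanly,
    // otherwise a short description of what went wrong.
    static String describeFailure(String xml) {
        try {
            DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return null;
        } catch (SAXException sxe) {
            // Error generated during parsing; unwrap a nested exception if present
            Exception x = sxe;
            if (sxe.getException() != null)
                x = sxe.getException();
            return "parse error: " + x.getMessage();
        } catch (ParserConfigurationException pce) {
            // Parser with specified options can't be built
            return "configuration error: " + pce.getMessage();
        } catch (IOException ioe) {
            // I/O error
            return "I/O error: " + ioe.getMessage();
        }
    }

    public static void main(String[] args) {
        // Well-formed input
        System.out.println(describeFailure("<slideshow><slide/></slideshow>"));
        // Missing </slide> end tag
        System.out.println(describeFailure("<slideshow><slide></slideshow>"));
    }
}
```

The exact wording of the parser's message depends on the parser implementation, so the sketch only distinguishes between the three kinds of failure.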

Instantiate the Factory

Next, add the DocumentBuilderFactory lines shown below to obtain an instance of a factory that can give us a document builder:
public static void main(String argv[])	
{	
   if (argv.length != 1) {	
      ...	
   }	
   DocumentBuilderFactory factory =	
      DocumentBuilderFactory.newInstance();	
   try {
 
Get a Parser and Parse the File

Now, add the two lines shown inside the try block below to get an instance of a builder, and use it to parse the specified file:
try {	
   DocumentBuilder builder = factory.newDocumentBuilder();	
   document = builder.parse( new File(argv[0]) );	
} catch (SAXException sxe) {
 
Save This File!

By now, you should be getting the idea that every JAXP application starts pretty much the same way. You're right! Save this version of the file as a template. You'll use it later on as the basis for an XSLT transformation app.
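
For reference, the pieces assembled so far might fit together as in the sketch below. This is not the DomEcho01.java file itself: the class name DomTemplateSketch and the helpers parse and rootName are assumptions introduced so that the parsing step can also be exercised on a string, and the document is kept in a local variable rather than the static field.

```java
import java.io.File;
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical class name; a condensed stand-in for the tutorial's DomEcho.
class DomTemplateSketch {

    // Factory -> builder -> parse: the steps every JAXP DOM app starts with.
    static Document parse(InputSource source)
            throws ParserConfigurationException, SAXException, IOException {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        DocumentBuilder builder = factory.newDocumentBuilder();
        return builder.parse(source);
    }

    // Convenience helper (an assumption, not in the tutorial): parse XML text
    // held in memory and report the name of its root element.
    static String rootName(String xml)
            throws ParserConfigurationException, SAXException, IOException {
        return parse(new InputSource(new StringReader(xml)))
                .getDocumentElement().getNodeName();
    }

    public static void main(String[] argv) {
        if (argv.length != 1) {
            System.err.println("Usage: java DomTemplateSketch filename");
            System.exit(1);
        }
        try {
            // Identify the file by its URI so relative references resolve
            String systemId = new File(argv[0]).toURI().toString();
            Document document = parse(new InputSource(systemId));
            System.out.println("Root element: "
                    + document.getDocumentElement().getNodeName());
        } catch (SAXException sxe) {
            // Error generated during parsing
            Exception x = sxe;
            if (sxe.getException() != null)
                x = sxe.getException();
            x.printStackTrace();
        } catch (ParserConfigurationException pce) {
            // Parser with specified options can't be built
            pce.printStackTrace();
        } catch (IOException ioe) {
            // I/O error
            ioe.printStackTrace();
        }
    }
}
```

Printing the root element name is only there to give some visible confirmation that a DOM was built; the tutorial's own program displays the tree in later sections.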

Run the Program

Throughout most of the DOM tutorial, you'll be using the sample slideshows you created in the SAX section. In particular, you'll use slideSample01.xml, a simple XML file with nothing much in it, and slideSample10.xml, a more complex example that includes a DTD, processing instructions, entity references, and a CDATA section.

For instructions on how to compile and run your program, see Compiling and Running the Program and Run the Program, from the SAX tutorial. Substitute "DomEcho" for "Echo" as the name of the program, and you're ready to roll.

For now, just run the program on slideSample01.xml. If it ran without error, you have successfully parsed an XML document and constructed a DOM. Congratulations!

Note: You'll have to take my word for it, for the moment, because at this point you don't have any way to display the results. But that feature is coming shortly.

Additional Information

Now that you have successfully read in a DOM, there are one or two more things you need to know in order to use DocumentBuilder effectively. Namely, you need to know about:

- Configuring the Factory
- Handling Validation Errors

Configuring the Factory

By default, the factory returns a nonvalidating parser that knows nothing about namespaces. To get a validating parser, and/or one that understands namespaces, you configure the factory to set either or both of those options using the setValidating and setNamespaceAware calls shown below:
public static void main(String argv[])	
{	
   if (argv.length != 1) {	
      ...	
   }	
   DocumentBuilderFactory factory =	
      DocumentBuilderFactory.newInstance();	
   factory.setValidating(true);	
   factory.setNamespaceAware(true);	
   try {	
      ...
 
Note: JAXP-conformant parsers are not required to support all combinations of those options, even though the reference parser does. If you specify an invalid combination of options, the factory generates a ParserConfigurationException when you attempt to obtain a parser instance.
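
As a rough illustration of what the namespace option changes, the sketch below parses the same one-element document twice, once with the default factory settings and once with setNamespaceAware(true). The class name NamespaceConfigSketch, the prefix p, and the URI urn:example:slides are all made up for the example.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical class name; not part of the tutorial's DomEcho program.
class NamespaceConfigSketch {

    // Sample document with a made-up namespace URI
    static final String XML = "<p:slideshow xmlns:p=\"urn:example:slides\"/>";

    // Parses XML with the given namespace setting and returns the root element.
    static Element parseRoot(boolean namespaceAware) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(namespaceAware);
        return factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(XML)))
                .getDocumentElement();
    }

    public static void main(String[] args) throws Exception {
        Element plain = parseRoot(false);
        Element aware = parseRoot(true);
        // Without namespace support the prefix is just part of the name
        System.out.println("default: name=" + plain.getNodeName()
                + " uri=" + plain.getNamespaceURI());
        // With namespace support the local name and URI are available
        System.out.println("aware:   local=" + aware.getLocalName()
                + " uri=" + aware.getNamespaceURI());
    }
}
```

In both cases getNodeName() returns the qualified name as written in the document; only the namespace-aware parse is expected to fill in the local name and namespace URI.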

You'll be learning more about how to use namespaces in the last section of the DOM tutorial, Using Namespaces. To complete this section, though, you'll want to learn something about...

Handling Validation Errors

Remember when you were wading through the SAX tutorial, and all you really wanted to do was construct a DOM? Well, here's when that information begins to pay off.

Recall that the default response to a validation error, as dictated by the SAX standard, is to do nothing. The JAXP standard requires throwing SAX exceptions, so you use exactly the same error-handling mechanisms as you used for a SAX app. In particular, you need to use the DocumentBuilder's setErrorHandler method to supply it with an object that implements the SAX ErrorHandler interface.

Note: DocumentBuilder also has a setEntityResolver method you can use.

The code below uses an anonymous inner class adapter to provide that ErrorHandler. The error method is the part that makes sure validation errors generate an exception.
builder.setErrorHandler(	
   new org.xml.sax.ErrorHandler() {	
      // ignore fatal errors (an exception is guaranteed)	
      public void fatalError(SAXParseException exception)	
      throws SAXException {	
      }	
      // treat validation errors as fatal	
      public void error(SAXParseException e)	
      throws SAXParseException	
      {	
         throw e;	
      }	
	
       // dump warnings too	
      public void warning(SAXParseException err)	
      throws SAXParseException	
      {	
         System.out.println("** Warning"	
            + ", line " + err.getLineNumber()	
            + ", uri " + err.getSystemId());	
         System.out.println("   " + err.getMessage());	
      }
   }
);
 
This code uses an anonymous inner class to generate an instance of an object that implements the ErrorHandler interface. Since it has no class name, it's "anonymous". You can think of it as an "ErrorHandler" instance, although technically it's a no-name instance that implements the specified interface. The code is substantially the same as that described in the Handling Errors with the Nonvalidating Parser section of the SAX tutorial. For more background on validation issues, refer to Using the Validating Parser in that part of the tutorial.
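
To see the handler take effect, the sketch below validates two small documents against a DTD embedded in the text itself. Everything specific here is an assumption made for illustration: the class name ValidationSketch, the helper isValid, and the two-element DTD are not taken from the tutorial's sample files.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

// Hypothetical class name; not part of the tutorial's DomEcho program.
class ValidationSketch {

    // A tiny made-up DTD: a slideshow holds one or more text-only slides.
    static final String DTD =
          "<!DOCTYPE slideshow [\n"
        + "  <!ELEMENT slideshow (slide+)>\n"
        + "  <!ELEMENT slide (#PCDATA)>\n"
        + "]>\n";

    // Returns true if the body parses and validates against DTD.
    static boolean isValid(String body) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setValidating(true);
        DocumentBuilder builder = factory.newDocumentBuilder();
        builder.setErrorHandler(new ErrorHandler() {
            // ignore fatal errors (the parser throws an exception anyway)
            @Override
            public void fatalError(SAXParseException e) throws SAXException {
            }
            // treat validation errors as fatal
            @Override
            public void error(SAXParseException e) throws SAXParseException {
                throw e;
            }
            // report warnings and carry on
            @Override
            public void warning(SAXParseException e) throws SAXParseException {
                System.out.println("** Warning, line " + e.getLineNumber()
                        + ": " + e.getMessage());
            }
        });
        try {
            builder.parse(new InputSource(new StringReader(DTD + body)));
            return true;
        } catch (SAXParseException spe) {
            System.err.println("Invalid, line " + spe.getLineNumber()
                    + ": " + spe.getMessage());
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(isValid("<slideshow><slide>Overview</slide></slideshow>"));
        System.out.println(isValid("<slideshow><bogus/></slideshow>"));
    }
}
```

The second document is well formed but uses an element the DTD does not declare, so it is the error method, rather than the parser's own well-formedness checking, that stops the parse.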

Looking Ahead

In the next section, you'll display the DOM structure in a JTree and begin to explore its structure. For example, you'll see how entity references and CDATA sections appear in the DOM. And perhaps most importantly, you'll see how text nodes (which contain the actual data) reside under element nodes in a DOM.
