The Simple API for XML (SAX) APIs
The basic outline of the SAX parsing APIs is shown at right. To start the process, an instance of the SAXParserFactory class is used to generate an instance of the parser. The parser wraps an XMLReader object. When the parser's parse() method is invoked, the reader invokes one of several callback methods implemented in the application. Those methods are defined by the interfaces ContentHandler, ErrorHandler, DTDHandler, and EntityResolver.

Here is a summary of the key SAX APIs:
- A SAXParserFactory object creates an instance of the parser determined by the system property javax.xml.parsers.SAXParserFactory.
- The SAXParser class defines several kinds of parse() methods. In general, you pass an XML data source and a DefaultHandler object to the parser, which processes the XML and invokes the appropriate methods in the handler object.
- The SAXParser wraps an XMLReader. Typically, you don't care about that, but every once in a while you need to get hold of it using SAXParser's getXMLReader() so you can configure it. It is the XMLReader that carries on the conversation with the SAX event handlers you define.
- Not shown in the diagram, a DefaultHandler implements the ContentHandler, ErrorHandler, DTDHandler, and EntityResolver interfaces (with null methods), so you can override only the ones you're interested in.
- The ContentHandler interface defines methods like startDocument, endDocument, startElement, and endElement, which are invoked when the parser recognizes the start or end of the document or of an element. This interface also defines the methods characters and processingInstruction, which are invoked when the parser encounters the text in an XML element or an inline processing instruction, respectively.
- The ErrorHandler methods error, fatalError, and warning are invoked in response to various parsing errors. The default error handler throws an exception for fatal errors and ignores other errors (including validation errors). That's one reason you need to know something about the SAX parser, even if you are using the DOM. Sometimes, the application may be able to recover from a validation error. Other times, it may need to generate an exception. To ensure the correct handling, you'll need to supply your own error handler to the parser.
- The DTDHandler interface defines methods you will generally never be called upon to use. They are used when processing a DTD to recognize and act on declarations for an unparsed entity.
- The EntityResolver interface's resolveEntity method is invoked when the parser must locate data identified by a URI. In most cases, a URI is simply a URL, which specifies the location of a document, but in some cases the document may be identified by a URN: a public identifier, or name, that is unique in the Web space. The public identifier may be specified in addition to the URL. The EntityResolver can then use the public identifier instead of the URL to find the document, for example to access a local copy of the document if one exists.

A typical application implements most of the ContentHandler methods, at a minimum. Since the default implementations of the interfaces ignore all inputs except for fatal errors, a robust implementation may want to implement the ErrorHandler methods as well.

The SAX Packages
The SAX parser is defined in the packages listed in Table 1.
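
The flow summarized above (factory, parser, handler callbacks) can be sketched roughly as follows. This is a minimal illustration rather than the tutorial's own code: the names SaxOutline, RecordingHandler, and parse are hypothetical, and the handler simply records the events it receives. The imports also give a rough idea of which packages the main classes live in.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

class SaxOutline {

    /** Hypothetical handler: overrides only the callbacks it cares about. */
    static class RecordingHandler extends DefaultHandler {
        final List<String> events = new ArrayList<>();
        private final StringBuilder text = new StringBuilder();

        // characters() may deliver one run of text in several chunks,
        // so buffer it and flush when the next tag event arrives.
        private void flushText() {
            String s = text.toString().trim();
            if (!s.isEmpty()) {
                events.add("text:" + s);
            }
            text.setLength(0);
        }

        @Override
        public void startElement(String uri, String localName, String qName, Attributes attrs) {
            flushText();
            events.add("start:" + qName);
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            flushText();
            events.add("end:" + qName);
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            text.append(ch, start, length);
        }

        // ErrorHandler methods: DefaultHandler ignores these two by default.
        // Here (an assumption about what the application wants) warnings are
        // recorded and recoverable errors are turned into exceptions.
        @Override
        public void warning(SAXParseException e) {
            events.add("warning:" + e.getMessage());
        }

        @Override
        public void error(SAXParseException e) throws SAXException {
            throw e;
        }
    }

    /** Parses the given XML text and returns the recorded events. */
    static List<String> parse(String xml) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        SAXParser parser = factory.newSAXParser();
        RecordingHandler handler = new RecordingHandler();
        parser.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)), handler);
        return handler.events;
    }

    public static void main(String[] args) throws Exception {
        for (String event : parse("<greeting><to>World</to></greeting>")) {
            System.out.println(event);
        }
    }
}
```

Because fatalError is not overridden, the DefaultHandler behavior described earlier still applies: input that is not well formed causes parse() to throw a SAXParseException.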