Defining Attributes and Entities in the DTD
The DTD you've defined so far is fine for use with the nonvalidating parser. It tells where text is expected and where it isn't, which is all the nonvalidating parser is going to pay attention to. But for use with the validating parser, the DTD needs to specify the valid attributes for the different elements. You'll do that in this section, after which you'll define one internal entity and one external entity that you can reference in your XML file.
Defining Attributes in the DTD
Let's start by defining the attributes for the elements in the slide presentation.
Note: The XML written in this section is contained in slideshow1b.dtd. (The browsable version is slideshow1b-dtd.html.)
Add the ATTLIST declaration shown below to define the attributes for the slideshow element:

    <!ELEMENT slideshow (slide+)>
    <!ATTLIST slideshow
        title  CDATA #REQUIRED
        date   CDATA #IMPLIED
        author CDATA "unknown"
    >
    <!ELEMENT slide (title, item*)>

The DTD tag ATTLIST begins the series of attribute definitions. The name that follows ATTLIST specifies the element for which the attributes are being defined. In this case, the element is the slideshow element. (Note once again the lack of hierarchy in DTD specifications.)
Each attribute is defined by a series of three space-separated values. Commas and other separators are not allowed, so formatting the definitions as shown above is helpful for readability. The first element in each line is the name of the attribute: title, date, or author, in this case. The second element indicates the type of the data: CDATA is character data, a plain string that is never interpreted as markup. (A literal left-angle bracket (<) is not allowed in an attribute value and must be written as &lt;.) Table 4 presents the valid choices for the attribute type.
*This is a rapidly obsolescing specification that is discussed at greater length toward the end of this section.
When the attribute type consists of a parenthesized list of choices separated by vertical bars, the attribute must use one of the specified values. For an example, add the ATTLIST declaration shown below to the DTD:

    <!ELEMENT slide (title, item*)>
    <!ATTLIST slide
        type (tech | exec | all) #IMPLIED
    >
    <!ELEMENT title (#PCDATA)>
    <!ELEMENT item (#PCDATA | item)* >

This specification says that the slide element's type attribute must be given as type="tech", type="exec", or type="all". No other values are acceptable. (DTD-aware XML editors can use such specifications to present a pop-up list of choices.)
The last entry in the attribute specification determines the attribute's default value, if any, and tells whether or not the attribute is required. Table 5 shows the possible choices.
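To see these declarations at work, the following is a minimal sketch rather than part of the tutorial's own sample code. It assumes the declarations are copied into an internal DTD subset so that no files are needed, and the class name AttlistCheck and its Collector helper are hypothetical names introduced only for illustration. It runs a validating SAX parser, records validity errors, and reads back the author attribute so the default value is visible.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

class AttlistCheck {
    // The declarations from this section, placed in an internal subset
    // (an assumption of this sketch, so that it needs no external files).
    static final String DTD =
        "<!DOCTYPE slideshow [\n"
      + "<!ELEMENT slideshow (slide+)>\n"
      + "<!ATTLIST slideshow\n"
      + "    title  CDATA #REQUIRED\n"
      + "    date   CDATA #IMPLIED\n"
      + "    author CDATA \"unknown\">\n"
      + "<!ELEMENT slide (title, item*)>\n"
      + "<!ATTLIST slide type (tech | exec | all) #IMPLIED>\n"
      + "<!ELEMENT title (#PCDATA)>\n"
      + "<!ELEMENT item (#PCDATA | item)*>\n"
      + "]>\n";

    /** Collects validity errors and the author value reported for slideshow. */
    static class Collector extends DefaultHandler {
        final List<String> errors = new ArrayList<>();
        String author;

        @Override
        public void startElement(String uri, String local, String qName, Attributes atts) {
            if ("slideshow".equals(qName)) {
                author = atts.getValue("author"); // filled in from the DTD default if omitted
            }
        }

        @Override
        public void error(SAXParseException e) {
            errors.add(e.getMessage()); // validity errors are recoverable, so just record them
        }
    }

    /** Parses DTD + body with validation switched on and returns what was collected. */
    static Collector validate(String body) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setValidating(true);
        SAXParser parser = factory.newSAXParser();
        Collector collector = new Collector();
        parser.parse(new InputSource(new StringReader(DTD + body)), collector);
        return collector;
    }

    public static void main(String[] args) throws Exception {
        // Valid: title is present, type is one of the listed choices, author is omitted.
        Collector ok = validate(
            "<slideshow title=\"Demo\"><slide type=\"tech\"><title>T</title></slide></slideshow>");
        System.out.println("errors: " + ok.errors.size() + ", author: " + ok.author);

        // Invalid: the required title is missing and type is not in the list.
        Collector bad = validate(
            "<slideshow><slide type=\"bogus\"><title>T</title></slide></slideshow>");
        bad.errors.forEach(System.out::println);
    }
}
```

The first document should produce no errors and report the author as "unknown", because the parser supplies the default from the ATTLIST. The second should be rejected on two counts: the missing #REQUIRED title and the type value outside the enumerated list. The exact wording of the error messages depends on the parser implementation.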
Defining Entities in the DTD
So far, you've seen predefined entities like &amp; and you've seen that an attribute can reference an entity. It's time now for you to learn how to define entities of your own.
Note: The XML defined here is contained in slideSample06.xml. (The browsable version is slideSample06-xml.html.) The output is shown in Echo09-06.
Add the bracketed entity declarations shown below to the DOCTYPE tag in your XML file:

    <!DOCTYPE slideshow SYSTEM "slideshow1.dtd" [
      <!ENTITY product  "WonderWidget">
      <!ENTITY products "WonderWidgets">
    ]>

The ENTITY tag name says that you are defining an entity. Next comes the name of the entity and its definition. In this case, you are defining an entity named "product" that will take the place of the product name. Later, when the product name changes (as it most certainly will), you will only have to change the name in one place, and all your slides will reflect the new value.
The last part is the substitution string that replaces the entity name whenever it is referenced in the XML document. The substitution string is defined in quotes, which are not included when the text is inserted into the document.
Just for good measure, we defined two versions, one singular and one plural, so that when the marketing mavens come up with "Wally" for a product name, you will be prepared to enter the plural as "Wallies" and have it substituted correctly.
Note: Truth be told, this is the kind of thing that really belongs in an external DTD. That way, all your documents can reference the new name when it changes. But, hey, this is an example...
Now that you have the entities defined, the next step is to reference them in the slide show. Make the changes highlighted below to do that:

    <slideshow title="&product; Slide Show" ...
      <!-- TITLE SLIDE -->
      <slide type="all">
        <title>Wake up to &products;!</title>
      </slide>
      <!-- OVERVIEW -->
      <slide type="all">
        <title>Overview</title>
        <item>Why <em>&products;</em> are great</item>
        <item/>
        <item>Who <em>buys</em> &products;</item>
      </slide>

Each entity reference replaces the literal product name that stood in that position. The points to notice here are that entities you define are referenced with the same syntax (&entityName;) that you use for predefined entities, and that the entity can be referenced in an attribute value as well as in an element's contents.

Echoing the Entity References
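As a rough stand-in for the Echo program, the sketch below parses a cut-down slide show with a SAX handler and records the expanded text. It is an illustration under stated assumptions, not the tutorial's Echo code: the class name EntityEcho and its Recorder helper are hypothetical, and the document keeps only the internal subset so that no external DTD file is needed.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class EntityEcho {
    // Cut-down slide show (an assumption of this sketch): the entities live in an
    // internal subset and no external DTD is referenced.
    static final String DOC =
        "<!DOCTYPE slideshow [\n"
      + "  <!ENTITY product  \"WonderWidget\">\n"
      + "  <!ENTITY products \"WonderWidgets\">\n"
      + "]>\n"
      + "<slideshow title=\"&product; Slide Show\">"
      + "<slide type=\"all\"><title>Wake up to &products;!</title></slide>"
      + "</slideshow>";

    /** Records the slideshow title attribute and all character data. */
    static class Recorder extends DefaultHandler {
        String title;
        final StringBuilder text = new StringBuilder();
        int charCalls;

        @Override
        public void startElement(String uri, String local, String qName, Attributes atts) {
            if ("slideshow".equals(qName)) {
                title = atts.getValue("title"); // entity is already expanded here
            }
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            charCalls++;                        // a parser may deliver text in several chunks
            text.append(ch, start, length);
        }
    }

    /** Parses the given document with a nonvalidating SAX parser. */
    static Recorder parse(String xml) throws Exception {
        Recorder recorder = new Recorder();
        SAXParserFactory.newInstance().newSAXParser()
            .parse(new InputSource(new StringReader(xml)), recorder);
        return recorder;
    }

    public static void main(String[] args) throws Exception {
        Recorder r = parse(DOC);
        System.out.println("title attribute:    " + r.title);
        System.out.println("element text:       " + r.text);
        System.out.println("characters() calls: " + r.charCalls);
    }
}
```

The handler never sees &product; or &products; themselves, only the substituted text, both in the attribute value and in the element content. How many characters calls the text is split into is up to the parser, which is why the sketch accumulates the chunks instead of relying on a particular count.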
When you run the Echo program on this version of the file, here is the kind of thing you see:

    ELEMENT: <title>
    CHARS:   Wake up to
    CHARS:   WonderWidgets
    CHARS:   !
    END_ELM: </title>

Note that the existence of the entity reference generates an extra call to the characters method, and that the text you see is what results from the substitution.

Additional Useful Entities
Here are several other examples for entity definitions that you might find useful when you write an XML document:

    <!ENTITY ldquo  "“"> <!-- Left Double Quote -->
    <!ENTITY rdquo  "”"> <!-- Right Double Quote -->
    <!ENTITY trade  "™"> <!-- Trademark Symbol (TM) -->
    <!ENTITY rtrade "®"> <!-- Registered Trademark (R) -->
    <!ENTITY copyr  "©"> <!-- Copyright Symbol -->

Referencing External Entities
You can also use the SYSTEM or PUBLIC identifier to name an entity that is defined in an external file. You'll do that now.
Note: The XML defined here is contained in slideSample07.xml and in copyright.xml. (The browsable versions are slideSample07-xml.html and copyright-xml.html.) The Echo output is shown in Echo09-07.
To reference an external entity, add the copyright entity declaration shown below to the DOCTYPE statement in your XML file:

    <!DOCTYPE slideshow SYSTEM "slideshow.dtd" [
      <!ENTITY product  "WonderWidget">
      <!ENTITY products "WonderWidgets">
      <!ENTITY copyright SYSTEM "copyright.xml">
    ]>

This definition references a copyright message contained in a file named copyright.xml. Create that file and put some interesting text in it, perhaps something like this:

    <!-- A SAMPLE copyright -->
    This is the standard copyright message that our lawyers make us put everywhere so we don't have to shell out a million bucks every time someone spills hot coffee in their lap...

Finally, add the copyright slide shown below to your slideSample.xml file to reference the external entity:

    <!-- TITLE SLIDE -->
    ...
    </slide>
    <!-- COPYRIGHT SLIDE -->
    <slide type="all">
      <item>&copyright;</item>
    </slide>

You could also use an external entity declaration to access a servlet that produces the current date using a definition something like this:

    <!ENTITY currentDate SYSTEM "http://www.example.com/servlet/CurrentDate?fmt=dd-MMM-yyyy">

You would then reference that entity the same as any other entity:

    Today's date is &currentDate;.

Echoing the External Entity
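The following sketch shows one way to watch an external entity being pulled into the content stream without creating any files. It is an illustration under stated assumptions, not the tutorial's Echo program: the class name ExternalEntityEcho is hypothetical, the slide show is cut down to a single slide, and the contents of copyright.xml are served from a string by overriding resolveEntity, which SAX calls before it opens an external entity.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class ExternalEntityEcho {
    // Stand-in for the contents of copyright.xml (shortened for this sketch).
    static final String COPYRIGHT =
        "<!-- A SAMPLE copyright -->\n"
      + "This is the standard copyright message.";

    // Cut-down slide show that declares and references the external entity.
    static final String DOC =
        "<!DOCTYPE slideshow [\n"
      + "  <!ENTITY copyright SYSTEM \"copyright.xml\">\n"
      + "]>\n"
      + "<slideshow><slide type=\"all\"><item>&copyright;</item></slide></slideshow>";

    /** Accumulates character data and serves copyright.xml from memory. */
    static class Recorder extends DefaultHandler {
        final StringBuilder text = new StringBuilder();

        @Override
        public InputSource resolveEntity(String publicId, String systemId) {
            // The parser may pass an absolute form of the system identifier,
            // so match on the file name only.
            if (systemId != null && systemId.endsWith("copyright.xml")) {
                return new InputSource(new StringReader(COPYRIGHT));
            }
            return null; // let the parser resolve anything else itself
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            text.append(ch, start, length);
        }
    }

    /** Parses the given document with a nonvalidating SAX parser. */
    static Recorder parse(String xml) throws Exception {
        Recorder recorder = new Recorder();
        SAXParserFactory.newInstance().newSAXParser()
            .parse(new InputSource(new StringReader(xml)), recorder);
        return recorder;
    }

    public static void main(String[] args) throws Exception {
        Recorder r = parse(DOC);
        // Brackets make the leading newline visible in the printed result.
        System.out.println("[" + r.text + "]");
    }
}
```

Because the handler implements only the content callbacks, the comment at the top of the entity never reaches it, while the newline that follows the comment arrives as ordinary character data.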
When you run the Echo program on your latest version of the slide presentation, here is what you see:

    ...
    END_ELM: </slide>
    ELEMENT: <slide
       ATTR: type "all"
    >
       ELEMENT: <item>
       CHARS:
    This is the standard copyright message that our lawyers make us put everywhere so we don't have to shell out a million bucks every time someone spills hot coffee in their lap...
       END_ELM: </item>
    END_ELM: </slide>
    ...

Note that the newline which follows the comment in the file is echoed as a character, but that the comment itself is ignored. That is the reason that the copyright message appears to start on the next line after the CHARS: label, instead of immediately after the label: the first character echoed is actually the newline that follows the comment.

Summarizing Entities
An entity that is referenced in the document content, whether internal or external, is termed a general entity. An entity that contains DTD specifications that are referenced from within the DTD is termed a parameter entity. (More on that later.)
An entity which contains XML (text and markup), and which is therefore parsed, is known as a parsed entity. An entity which contains binary data (like images) is known as an unparsed entity. (By its very nature, it must be external.) We'll be discussing references to unparsed entities in the next section of this tutorial.