Altova Mailing List Archives


DTDs and XML: another "not well formed" question

From: "seven.reeds" <seven.reeds@-----.--->
To: NULL
Date: 7/1/2007 5:23:00 PM

Hi,

I'm new to parsing/using XML, but this project seemed reasonable to cut
my teeth on.  I have a few dozen "articles" that are local
announcements of interest for my group's customers.  They have a
simple format of a title, zero or more static or hyper-linked images
and one or more paragraphs of text.

A "Title" will hold plain text.  The "Text"s will hold plain or mixed
content.  The "Image"s will need to know about the hyper-link URL (if
any); the image source URL and possibly "height" and "width"
attributes.

I have made a stab at making a DTD:

<!ELEMENT article (title, image*, text+)>
<!ELEMENT title   (#PCDATA)>
<!ELEMENT image   (src, width?, height?, link?)>
    <!ATTLIST src    CDATA #REQUIRED>
    <!ATTLIST link   CDATA #IMPLIED>
    <!ATTLIST width  PCDATA #IMPLIED>
    <!ATTLIST height PCDATA #IMPLIED>
<!ELEMENT text    (#CDATA)>

A sample XML doc looks like:

<?xml version="1.0" ?>
<!DOCTYPE article SYSTEM "http://www.itg.uiuc.edu/publications/news/
news.dtd">
<article>
  <title> Applied Physics Letters Features ITG Image on Cover </title>
  <image link="http://scitation.aip.org/dbt/dbt.jsp?
KEY=APPLAB&Volume=90&Issue=21"
    src="/images/apl_cover-130.jpg" />
  <text> The cover for the <a href="http://scitation.aip.org/dbt/
dbt.jsp?KEY=APPLAB&Volume=90&Issue=21">May
21, 2007 edition of Applied Physics Letters</a>features an image
produced in the ...
</text>
</article>

Now, I have spent time searching this group and a couple of others
related to the scripting language and the XML parser I am using.  I
*know* what my problem is... what I don't know is why I have it.

My XML parser chokes on the first "&" (ampersand) in the "link"
attribute of the "image" tag.  I know that being "well-formed" means
the amps should be "quoted", but I thought that the "CDATA" bits in the
DTD meant that *ALL* characters are accepted in this context.
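
The failure can be reproduced in isolation. The sketch below is only an
illustration: it assumes Python's standard xml.etree.ElementTree as a
stand-in for the parser (the post does not name one), and the internal
DTD subset is a hypothetical, cleaned-up ATTLIST, not the DTD above. It
suggests that declaring an attribute as CDATA does not lift the
well-formedness rule: a bare "&" in the attribute value is still
rejected, while the "&amp;" form parses and comes back as a plain "&".

```python
import xml.etree.ElementTree as ET

# Hypothetical internal DTD subset that declares both attributes as CDATA.
# This is an illustrative stand-in, not the DTD from the post.
DOCTYPE = """<!DOCTYPE image [
<!ELEMENT image EMPTY>
<!ATTLIST image
    src  CDATA #REQUIRED
    link CDATA #IMPLIED>
]>
"""

# URL taken from the sample document in the post.
URL = "http://scitation.aip.org/dbt/dbt.jsp?KEY=APPLAB&Volume=90&Issue=21"
TEMPLATE = '<image link="{}" src="/images/apl_cover-130.jpg" />'


def is_well_formed(doc):
    """Return True if the parser accepts doc, False on a parse error."""
    try:
        ET.fromstring(doc)
        return True
    except ET.ParseError:
        return False


raw_doc = DOCTYPE + TEMPLATE.format(URL)                          # bare "&"
escaped_doc = DOCTYPE + TEMPLATE.format(URL.replace("&", "&amp;"))  # "&amp;"

raw_ok = is_well_formed(raw_doc)
escaped_ok = is_well_formed(escaped_doc)

# The parser turns "&amp;" back into "&" in the attribute value it reports.
link = ET.fromstring(escaped_doc).get("link")
```

In this sketch raw_ok ends up False and escaped_ok True, and link holds
the original URL with its plain ampersands, so nothing downstream has to
undo the escaping.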

Is my DTD wrong for the XML I have?  Is my parser/validator not
picking up on the DTD?

I know that I can pre-process the incoming XML file and change the
amps to the html entity version, but that feels wasteful if CDATA is
doing what I thought it should do.
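
For reference, the pre-processing step mentioned above could look
roughly like this. It is a heuristic sketch in Python, not a robust
fix: the function name escape_bare_ampersands and the regular expression
are assumptions made for illustration. It rewrites only ampersands that
do not already start something shaped like an entity or character
reference, so running it twice changes nothing, but it would also
rewrite ampersands inside CDATA sections and would leave alone any text
that merely looks like a reference.

```python
import re

# Match "&" unless it already begins a named entity (&name;), a decimal
# character reference (&#38;) or a hexadecimal one (&#x26;).
BARE_AMP = re.compile(r"&(?!(?:[A-Za-z_][\w.-]*|#[0-9]+|#x[0-9A-Fa-f]+);)")


def escape_bare_ampersands(text):
    """Replace each bare '&' in text with '&amp;' (hypothetical helper)."""
    return BARE_AMP.sub("&amp;", text)


# Fragment modelled on the sample document in the post.
sample = '<image link="http://scitation.aip.org/dbt/dbt.jsp?KEY=APPLAB&Volume=90&Issue=21" />'
fixed = escape_bare_ampersands(sample)
```

Escaping the ampersands where the articles are written or generated is
usually the cleaner option; a pass like this is more of a stopgap for
files that already exist.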

Other than a clue :), what am I missing?


