DTDs and XML: another "not well formed" question
To: NULL
Date: 7/1/2007 5:23:00 PM

Hi,

I'm new to parsing/using XML but this project seemed reasonable to cut my teeth on.

I have a few dozen "articles" that are local announcements of interest for my group's customers. They have a simple format of a title, zero or more static or hyper-linked images and one or more paragraphs of text.

A "Title" will hold plain text. The "Text"s will hold plain or mixed content. The "Image"s will need to know about the hyper-link URL (if any), the image source URL and possibly "height" and "width" attributes.

I have made a stab at making a DTD:

    <!ELEMENT article (title, image*, text+)>
    <!ELEMENT title (#PCDATA)>
    <!ELEMENT image (src, width?, height?, link?)>
    <!ATTLIST src CDATA #REQUIRED>
    <!ATTLIST link CDATA #IMPLIED>
    <!ATTLIST width PCDATA #IMPLIED>
    <!ATTLIST height PCDATA #IMPLIED>
    <!ELEMENT text (#CDATA)>

A sample XML doc looks like:

    <?xml version="1.0" ?>
    <!DOCTYPE article SYSTEM "http://www.itg.uiuc.edu/publications/news/news.dtd">
    <article>
      <title>
        Applied Physics Letters Features ITG Image on Cover
      </title>
      <image link="http://scitation.aip.org/dbt/dbt.jsp?KEY=APPLAB&Volume=90&Issue=21"
             src="/images/apl_cover-130.jpg" />
      <text>
        The cover for the <a href="http://scitation.aip.org/dbt/dbt.jsp?KEY=APPLAB&Volume=90&Issue=21">May 21, 2007 edition of Applied Physics Letters</a> features an image produced in the ...
      </text>
    </article>

Now, I have spent time searching this group and a couple of others related to the scripting language and the XML parser I am using. I *know* what my problem is... what I don't know is why I have it. My XML parser chokes on the first "&" (ampersand) in the "link" attribute of the "image" tag.
I know that being "well-formed" means the ampersands should be "quoted", but I thought that the "CDATA" bits in the DTD meant that *ALL* characters are accepted in this context. Is my DTD wrong for the XML I have? Is my parser/validator not picking up on the DTD? I know that I can pre-process the incoming XML file and change the ampersands to the HTML entity version, but that feels wasteful if CDATA is doing what I thought it should do. Other than a clue :), what am I missing?
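
For anyone wanting to reproduce the symptom, here is a minimal sketch. It assumes Python's standard xml.etree.ElementTree as a stand-in parser (the post does not name the parser or scripting language actually used), and the helper name is_well_formed is hypothetical. It checks the "image" element from the sample on its own, once with the raw ampersands and once with them written as &amp;. As far as the XML 1.0 rules go, a CDATA attribute type only declares the value to be a plain string; it does not lift the requirement to escape "&" in attribute values, so the raw form is expected to fail even when the DTD is read.

```python
import xml.etree.ElementTree as ET

# The <image> element from the sample document, ampersands left raw.
RAW = (
    '<image link="http://scitation.aip.org/dbt/dbt.jsp?'
    'KEY=APPLAB&Volume=90&Issue=21" src="/images/apl_cover-130.jpg" />'
)

# Same element with each "&" written as the predefined entity "&amp;".
# A plain replace is safe here only because RAW contains no entities yet.
ESCAPED = RAW.replace("&", "&amp;")


def is_well_formed(xml_text):
    """Return True if the text parses as XML, False on a parse error."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


if __name__ == "__main__":
    print("raw ampersands parse:    ", is_well_formed(RAW))
    print("escaped ampersands parse:", is_well_formed(ESCAPED))
    # The parser hands back the attribute with plain "&" characters again.
    print(ET.fromstring(ESCAPED).get("link"))
```

In this sketch the escaping lives only in the file on disk; code reading the "link" attribute after parsing sees the original URL, so nothing downstream needs to undo it.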
