Altova Mailing List Archives


Re: Can I un-CDATA my CDATA section and elaborate a transformation for the contained data?

From: Peter Flynn <peter.nosp@-.--------.-->
Date: 3/5/2006 8:58:00 PM
troppfigo@e... wrote:
> I have this example of xml
> 
> <?xml version="1.0"?>
> <xml>
>    <![CDATA[
>       <metadata>
>          <title>Embedded Markup</title>
>          <body>Someone told to me...</body>
>       </metadata>
>    ]]>
> </xml>

This is usually very poor design. The content of a CDATA section is
just text: by putting the CDATA markup round it you are explicitly
telling the XML parser that it must no longer be regarded as markup,
so as far as the software is concerned, <metadata> and all the rest
of the content is just a bunch of characters with no special meaning.

See http://xml.silmaril.ie/authors/cdata
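To see this from the parser's side, here is a minimal sketch, assuming
Python's standard xml.etree.ElementTree module (the names SOURCE and
root are illustrative, not part of the original example). It parses the
document quoted above and shows that the root element has no child
elements, only text that happens to contain angle brackets:

```python
import xml.etree.ElementTree as ET

# The document from the question, unchanged.
SOURCE = """<?xml version="1.0"?>
<xml>
   <![CDATA[
      <metadata>
         <title>Embedded Markup</title>
         <body>Someone told to me...</body>
      </metadata>
   ]]>
</xml>"""

root = ET.fromstring(SOURCE)

# The parser reports no child elements under <xml> ...
print(len(root))                 # number of child elements
print(root.find("metadata"))     # lookup of a <metadata> child

# ... because everything inside the CDATA section is character data.
print("<metadata>" in root.text)
```

The element count is zero and the lookup finds nothing, while the
string "<metadata>" is present in the root's text content.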

> I want to extract the contained data from <body> tag using an xslt
> transformation.
> I want to obtain this
> 
> <html>
> Someone told to me...
> </html>
> 
> 
> it is possible to make this operation?
> Can you post some example code?

You must remove the CDATA markup first. Then your XML software will be
able to treat the content as markup and access the elements properly.
(Also tell whoever generated the file that wrapping the content in
CDATA makes it impossible to process as XML.)

As it currently stands, you'd need to process the file twice. This
first piece of XSLT will remove the CDATA markup (provided you use a
processor that supports disable-output-escaping -- support for it is
not obligatory, so only some software will do it properly):

<?xml version="1.0" encoding="iso-8859-1"?>
<xsl:stylesheet
   xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
   version="1.0">

   <xsl:output method="xml"/>

   <xsl:template match="xml">
     <xml>
       <xsl:value-of disable-output-escaping="yes" select="."/>
     </xml>
   </xsl:template>

</xsl:stylesheet>

This produces:

<?xml version="1.0"?>
<xml>
       <metadata>
          <title>Embedded Markup</title>
          <body>Someone told to me...</body>
       </metadata>
</xml>

Now it's real markup, so you can process it with another stylesheet, e.g.:

<?xml version="1.0" encoding="iso-8859-1"?>
<xsl:stylesheet
   xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
   version="1.0">

   <xsl:output method="html"/>

   <xsl:template match="xml">
     <html>
       <head>
         <title>Test</title>
       </head>
       <body>
         <xsl:apply-templates select="metadata/body"/>
       </body>
     </html>
   </xsl:template>

   <xsl:template match="body">
     <p>
       <xsl:apply-templates/>
     </p>
   </xsl:template>

</xsl:stylesheet>

to produce what you appear to mean.
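
If your XSLT processor does not support disable-output-escaping, the
same two-pass idea can be sketched outside XSLT. The following is only
an illustration, assuming Python's standard xml.etree.ElementTree; the
function name cdata_body_to_html is hypothetical. It parses the outer
document, re-parses the CDATA text as markup, and builds an HTML tree
around the <body> text:

```python
import xml.etree.ElementTree as ET

SOURCE = """<?xml version="1.0"?>
<xml>
   <![CDATA[
      <metadata>
         <title>Embedded Markup</title>
         <body>Someone told to me...</body>
      </metadata>
   ]]>
</xml>"""


def cdata_body_to_html(source: str) -> str:
    """Two passes: outer parse, then re-parse the CDATA text as markup."""
    # Pass 1: the CDATA content arrives as plain text on the root element.
    outer = ET.fromstring(source)
    # Pass 2: parse that text again, this time as real markup.
    inner = ET.fromstring(outer.text.strip())
    body = inner.find("body")
    if body is None:
        raise ValueError("no <body> element inside the embedded markup")

    # Build the result as a tree so the text is escaped correctly.
    html = ET.Element("html")
    head = ET.SubElement(html, "head")
    ET.SubElement(head, "title").text = "Test"
    para = ET.SubElement(ET.SubElement(html, "body"), "p")
    para.text = body.text
    return ET.tostring(html, encoding="unicode", method="html")


html_page = cdata_body_to_html(SOURCE)
print(html_page)
```

This only works if the text inside the CDATA section is itself
well-formed XML with a single root element, as it is in this example.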

///Peter
-- 
XML FAQ: http://xml.silmaril.ie/

