Altova Mailing List Archives: comp.text.xml

Re: XSL for removing words less than 4 letters in a sitemap
To: NULL
Date: 4/3/2008 2:56:00 AM

On 2 Apr, 13:35, Martin Honnen <mahotr...@yahoo.de> wrote:
> Olagato wrote:
> > I need to transform this:
> >
> > <urlset xmlns="http://www.google.com/schemas/sitemap/0.84">
> >   <url>
> >     <loc>http://localhost/index.php/index./Paths-for-the-extreme-player</loc>
> >   </url>
> >   <url>
> >     <loc>http://localhost/index.php/index.php/Games/The-edge-of-the-wall</loc>
> >   </url>
> > </urlset>
> >
> > into this:
> >
> > <urlset xmlns="http://www.google.com/schemas/sitemap/0.84">
> >   <url>
> >     <loc>http://localhost/index.php/index./Books/Paths-for-the-extreme-player</loc>
> >     <news:news>
> >       <news:keywords>Books, Paths, extreme, player</news:keywords>
> >     </news:news>
> >   </url>
> >   <url>
> >     <loc>http://localhost/index.php/index.php/Games/The-edge-of-the-wall</loc>
> >     <news:news>
> >       <news:keywords>Games, edge, wall</news:keywords>
> >     </news:news>
> >   </url>
> > </urlset>
> >
> > I mean, I need a template for creating a <news:keywords> tag which
> > contains all the words from the <loc> tag that have more than 3
> > letters.
>
> Do you want to use XSLT 2.0 or 1.0?
> What about words like 'localhost' or 'index', how do you decide that
> those are not taken?
>
> Here is an XSLT 2.0 stylesheet that should show you an approach using
> the tokenize method:
>
> <xsl:stylesheet
>   xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
>   xmlns:news="http://example.com/2008/news"
>   xmlns:sm="http://www.google.com/schemas/sitemap/0.84"
>   exclude-result-prefixes="sm"
>   version="2.0">
>
>   <xsl:output method="xml" indent="yes"/>
>
>   <xsl:strip-space elements="*"/>
>
>   <xsl:template match="@* | node()">
>     <xsl:copy>
>       <xsl:apply-templates select="@* | node()"/>
>     </xsl:copy>
>   </xsl:template>
>
>   <xsl:template match="sm:url">
>     <xsl:copy>
>       <xsl:apply-templates select="@* | node()"/>
>       <news:news>
>         <news:keywords>
>           <xsl:value-of
>             select="for $s in tokenize(sm:loc, '/')[position() > 5]
>                     return tokenize($s, '[\-/]')[string-length(.) > 3]"
>             separator=", "/>
>         </news:keywords>
>       </news:news>
>     </xsl:copy>
>   </xsl:template>
>
> </xsl:stylesheet>
>
> Result with Saxon 9 when run against your posted input sample (with a
> 'root' element added and a namespace chosen for the 'news' prefix) is
>
> <root>
>   <urlset xmlns="http://www.google.com/schemas/sitemap/0.84">
>     <url>
>       <loc>http://localhost/index.php/index./Paths-for-the-extreme-player</loc>
>       <news:news xmlns:news="http://example.com/2008/news">
>         <news:keywords>Paths, extreme, player</news:keywords>
>       </news:news>
>     </url>
>     <url>
>       <loc>http://localhost/index.php/index.php/Games/The-edge-of-the-wall</loc>
>       <news:news xmlns:news="http://example.com/2008/news">
>         <news:keywords>Games, edge, wall</news:keywords>
>       </news:news>
>     </url>
>   </urlset>
> </root>
>
> --
>
> Martin Honnen
> http://JavaScript.FAQTs.com/

Thank you for your help, Martin.

> Do you want to use XSLT 2.0 or 1.0?

I'm using XSLT 1.0.

> What about words like 'localhost' or 'index', how do you decide that those are not taken?

It's not a problem now.
Maybe an expression like the following:

  translate(
    translate(
      substring-after(sm:loc, 'http://localhost/index.php/index.php/'),
      '-', ','),
    '/', ',')

I'm trying your XSL from PHP without success:

  <?php
  header('Content-Type: application/xhtml+xml; charset=utf-8');
  $xml = new DOMDocument;
  $xml->load('original_news.xml');
  $xsl = new DOMDocument('1.0','UTF-8');
  $xsl->load('news_to_google_markup.xsl');
  try{
    $proc = new XSLTProcessor();
    $proc->importStylesheet($xsl);
    $newXml = $proc->transformToXML($xml);
    echo $newXml;
  }catch(Exception $pEx){
    return $pEx->getMessage();
  }
  ?>

1- original_news.xml is:

  <?xml version="1.0" encoding="UTF-8"?>
  <?xml-transform type="text/xsl" href="news_to_google_markup.xsl"?>
  <urlset xmlns:sm="http://www.google.com/schemas/sitemap/0.84">
    <url>
      <loc>http://localhost/index.php/index/Paths-for-the-extreme-player</loc>
    </url>
    <url>
      <loc>http://localhost/index.php/index.php/Games/The-edge-of-the-wall</loc>
    </url>
  </urlset>

2- and your XSL, which I've renamed to "news_to_google_markup.xsl", is:

  <xsl:stylesheet
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:news="http://example.com/2008/news"
    xmlns:sm="http://www.google.com/schemas/sitemap/0.84"
    exclude-result-prefixes="sm"
    version="2.0">

    <xsl:output method="xml" indent="yes"/>

    <xsl:strip-space elements="*"/>

    <xsl:template match="@* | node()">
      <xsl:copy>
        <xsl:apply-templates select="@* | node()"/>
      </xsl:copy>
    </xsl:template>

    <xsl:template match="sm:url">
      <xsl:copy>
        <xsl:apply-templates select="@* | node()"/>
        <news:news>
          <news:keywords>
            <xsl:value-of
              select="for $s in tokenize(sm:loc, '/')[position() > 5]
                      return tokenize($s, '[\-/]')[string-length(.) > 3]"
              separator=", "/>
          </news:keywords>
        </news:news>
      </xsl:copy>
    </xsl:template>

  </xsl:stylesheet>

3- Error reported in PHP is:

  <b>Warning</b>: XSLTProcessor::importStylesheet() [<a href='function.XSLTProcessor-importStylesheet'>function.XSLTProcessor-importStylesheet</a>]: Invalid expression in <b>C:\Webs\...\htdocs\sitemap\index.php</b> on line <b>18</b><br />

4- line 18 is:

  $proc->importStylesheet($xsl);

Maybe there is an invalid XSL version or namespace in the header, but I don't know how to resolve this.

Any ideas would be appreciated.
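
A note on the warning above: PHP's XSLTProcessor is built on libxslt, which implements XSLT 1.0 only, so the XPath 2.0 "for" expression, tokenize() and the "separator" attribute are the likely cause of "Invalid expression". What follows is a minimal XSLT 1.0 sketch of the same idea, not a tested drop-in. The named template "keywords" and its parameters "text" and "first" are made up for illustration, and the news namespace URI is the placeholder from the stylesheet above. It also assumes the sitemap namespace is declared as the default namespace (xmlns="...") as in the first sample; with xmlns:sm="..." on <urlset> the elements are in no namespace and match="sm:url" would not fire.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:news="http://example.com/2008/news"
    xmlns:sm="http://www.google.com/schemas/sitemap/0.84"
    exclude-result-prefixes="sm">

  <xsl:output method="xml" indent="yes"/>
  <xsl:strip-space elements="*"/>

  <!-- Identity template: copy everything unchanged by default. -->
  <xsl:template match="@* | node()">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>

  <xsl:template match="sm:url">
    <!-- Skip the first five '/'-separated pieces of the URL,
         mirroring [position() > 5] in the XSLT 2.0 version. -->
    <xsl:variable name="path" select="
        substring-after(substring-after(substring-after(
        substring-after(substring-after(sm:loc, '/'), '/'), '/'), '/'), '/')"/>
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
      <news:news>
        <news:keywords>
          <xsl:call-template name="keywords">
            <!-- Treat '/' like '-' so there is a single separator. -->
            <xsl:with-param name="text" select="translate($path, '/', '-')"/>
          </xsl:call-template>
        </news:keywords>
      </news:news>
    </xsl:copy>
  </xsl:template>

  <!-- Hypothetical helper: walks a '-'-separated string recursively and
       writes every word longer than 3 characters, joined by ', '. -->
  <xsl:template name="keywords">
    <xsl:param name="text"/>
    <!-- true() until the first keyword has been written -->
    <xsl:param name="first" select="true()"/>
    <!-- First word of $text; concat() guarantees a '-' to split on. -->
    <xsl:variable name="word" select="substring-before(concat($text, '-'), '-')"/>
    <xsl:variable name="keep" select="string-length($word) &gt; 3"/>
    <xsl:if test="$keep">
      <xsl:if test="not($first)">, </xsl:if>
      <xsl:value-of select="$word"/>
    </xsl:if>
    <xsl:if test="contains($text, '-')">
      <xsl:call-template name="keywords">
        <xsl:with-param name="text" select="substring-after($text, '-')"/>
        <xsl:with-param name="first" select="$first and not($keep)"/>
      </xsl:call-template>
    </xsl:if>
  </xsl:template>

</xsl:stylesheet>
```

With the two sample URLs this should produce "Paths, extreme, player" and "Games, edge, wall". Recursion over substring-before/substring-after is the usual XSLT 1.0 stand-in for tokenize(); the fixed count of five substring-after calls is tied to the URL layout in the samples and would need adjusting for other paths.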
