Altova Mailing List Archives: microsoft.public.xsl

Re: XSLT how do I get info from external html web page
To: NULL
Date: 6/6/2005 5:18:00 PM

If your "plain text file" contained XML, you could use the document() function as described here:

http://www-128.ibm.com/developerworks/xml/library/x-tipcombxslt/
http://www.xml.com/pub/a/2002/03/06/xslt.html

But since your plain text is not XML (it has no closing tags for CIK and NAME, for example), you probably won't be able to extract the CIK node content using XSLT. It might work better if you were able to preprocess your text file into actual XML.

You also *cannot* go to the web page and 'extract the title' using XML:

http://www.sec.gov/cgi-bin/browse-edgar?action=getcompany&CIK=0000925645

It's not XML, it's HTML4, and it can't be read by an XML parser. So you need some mechanism (such as running HTMLTidy on your web server) to convert the HTML to real XML, or you need some other scripting language such as ASP or PHP to extract the title element from the HTML. PHP can execute fopen(url) on that web page; you can then use a regular expression to extract the title.

Cheers - Neil

On Mon, 6 Jun 2005 08:07:17 -0700, "Frank" <Frank@d...> wrote:

>I am stepping through an xslt stylesheet and transforming an xml file into
>html output. At some point in the xslt file I need to go to a plain text
>file which is in the same location as the xslt file (C:\data\) and retrieve
>the CIK. The text file looks like this:
>
><SUBJECT-COMPANY>
> <CIK> 0000925645
> <NAME> Big Corporation
></SUBJECT-COMPANY>
>
>and then I need to go to the webpage below and grab the Title value and plug
>that into my html output.
>
>http://www.sec.gov/cgi-bin/browse-edgar?action=getcompany&CIK=0000925645
>
>Any ideas?
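The two workarounds suggested in the reply (preprocess the text file into real XML, and pull the title out of the HTML with a regular expression) could be sketched roughly as below. This is only an illustration, written in Python rather than the PHP or ASP mentioned in the thread. The function names close_tags, extract_title and fetch_title are made up for the example, the sample text is copied from the quoted message, and the sketch assumes every unclosed tag sits on its own line with a value that contains no "<" or "&". Whether the SEC page still answers this URL in the same way is not something the thread confirms, so fetch_title is defined but not called.

```python
import re
import urllib.request
import xml.etree.ElementTree as ET

# Sample input copied from the quoted message; the real file would live in C:\data\.
SAMPLE_TEXT = """<SUBJECT-COMPANY>
 <CIK> 0000925645
 <NAME> Big Corporation
</SUBJECT-COMPANY>
"""

# URL prefix taken from the thread; the CIK value is appended to it.
EDGAR_URL = "http://www.sec.gov/cgi-bin/browse-edgar?action=getcompany&CIK="


def close_tags(text):
    """Rewrite lines like '<CIK> 123' as '<CIK>123</CIK>' (hypothetical helper).

    Assumes one unclosed tag per line and values without '<' or '&'.
    Lines holding only an opening or closing tag are left untouched.
    """
    pattern = r"^([ \t]*)<([A-Za-z][\w-]*)>[ \t]*(\S[^<\r\n]*?)[ \t]*$"
    return re.sub(pattern, r"\1<\2>\3</\2>", text, flags=re.MULTILINE)


def extract_title(html):
    """Return the text of the first <title> element, or None if there is none."""
    match = re.search(r"<title[^>]*>(.*?)</title>", html, re.IGNORECASE | re.DOTALL)
    if match is None:
        return None
    # Collapse line breaks and repeated spaces inside the title.
    return " ".join(match.group(1).split())


def fetch_title(cik):
    """Download the company page for a CIK and return its title (needs network access)."""
    with urllib.request.urlopen(EDGAR_URL + cik) as response:
        html = response.read().decode("utf-8", errors="replace")
    return extract_title(html)


if __name__ == "__main__":
    # Once the tags are closed, an ordinary XML parser can read the file.
    root = ET.fromstring(close_tags(SAMPLE_TEXT))
    print(root.findtext("CIK"))
    print(root.findtext("NAME"))
```

After close_tags has been applied and the result saved, the file is well-formed XML, so the document() function mentioned in the reply becomes an option for the CIK lookup. The title still has to be fetched outside XSLT, or the HTML converted to XML first, as the reply explains.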
