Altova Mailing List Archives > microsoft.public.xsl

Re: XSLT how do I get info from external html web page
To: NULL
Date: 6/6/2005 11:46:00 AM

Thanks! That explains why I couldn't get the document() function to work.
Can I use VBScript while stepping through the XSLT file? And, if so, could
VBScript do the job?

"Neil Smith [MVP Digital Media]" wrote:

> If your "plain text file" contained XML you could use the document()
> function as described here:
>
> http://www-128.ibm.com/developerworks/xml/library/x-tipcombxslt/
> http://www.xml.com/pub/a/2002/03/06/xslt.html
>
> But, since your plain text is not XML (it has no closing tags for CIK
> and NAME, for example), you probably won't be able to extract the CIK
> node content using XSLT.
>
> It might work better, though, if you were able to preprocess your text
> file into actual XML.
>
> You also *cannot* go to the web page and 'extract the title' using XML:
> http://www.sec.gov/cgi-bin/browse-edgar?action=getcompany&CIK=0000925645
>
> It's not XML, it's HTML4, and it can't be read by an XML parser. So you
> need to have some mechanism (such as running HTMLTidy on your web
> server) to convert the HTML to real XML, or you need some other
> scripting language such as ASP or PHP or ..... to extract the title
> element from the HTML. PHP can execute fopen(url) on that web page;
> you can then use a regular expression to extract the title.
>
> Cheers - Neil
>
>
> On Mon, 6 Jun 2005 08:07:17 -0700, "Frank"
> <Frank@d...> wrote:
>
> >I am stepping through an xslt stylesheet and transforming an xml file into
> >html output. At some point in the xslt file I need to go to a plain text
> >file which is in the same location as the xslt file (C:\data\) and retrieve
> >the CIK. The text file looks like this:
> >
> ><SUBJECT-COMPANY>
> >  <CIK> 0000925645
> >  <NAME> Big Corporation
> ></SUBJECT-COMPANY>
> >
> >and then I need to go to the webpage below and grab the Title value and plug
> >that into my html output.
> >
> >http://www.sec.gov/cgi-bin/browse-edgar?action=getcompany&CIK=0000925645
> >
> >Any ideas?
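
The two steps suggested in the reply (preprocessing the text file into real XML, and pulling the title out of the HTML with a regular expression) can be sketched roughly as follows. This is only an illustration in Python, not the VBScript or PHP discussed in the thread. The function names `close_unclosed_tags`, `read_cik` and `extract_title` are hypothetical, and the sketch assumes every "<TAG> value" line in the text file is missing its closing tag, as in the sample above.

```python
import re
import xml.etree.ElementTree as ET
from typing import Optional
from xml.sax.saxutils import escape

# The text file from the original post; <CIK> and <NAME> are never closed.
SAMPLE_TEXT = """<SUBJECT-COMPANY>
  <CIK> 0000925645
  <NAME> Big Corporation
</SUBJECT-COMPANY>"""

# A line of the form "<TAG> value". Assumption: such lines never carry
# their own closing tag, so one can always be appended.
UNCLOSED = re.compile(r"^(\s*)<([A-Za-z][\w.-]*)>\s*(\S.*?)\s*$")

# The <title> element of an HTML page, matched case-insensitively.
TITLE = re.compile(r"<title[^>]*>(.*?)</title>", re.IGNORECASE | re.DOTALL)


def close_unclosed_tags(text: str) -> str:
    """Turn '<TAG> value' lines into '<TAG>value</TAG>' so the text parses as XML."""
    out = []
    for line in text.splitlines():
        m = UNCLOSED.match(line)
        if m:
            indent, tag, value = m.groups()
            # Escape &, < and > so the value is safe inside an element.
            out.append(f"{indent}<{tag}>{escape(value)}</{tag}>")
        else:
            # Container tags such as <SUBJECT-COMPANY> pass through unchanged.
            out.append(line)
    return "\n".join(out)


def read_cik(text: str) -> Optional[str]:
    """Preprocess the text into XML, then read the CIK child element."""
    root = ET.fromstring(close_unclosed_tags(text))
    return root.findtext("CIK")


def extract_title(html: str) -> Optional[str]:
    """Return the page title with whitespace collapsed, or None if absent."""
    m = TITLE.search(html)
    return " ".join(m.group(1).split()) if m else None


if __name__ == "__main__":
    print(close_unclosed_tags(SAMPLE_TEXT))
    print(read_cik(SAMPLE_TEXT))
    print(extract_title("<html><head><title>Example page</title></head></html>"))
```

Once the text file has been rewritten this way, the output of `close_unclosed_tags` could be saved next to the stylesheet and loaded with document(). The HTML string passed to `extract_title` would have to be fetched separately by whatever script drives the transformation.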
