Altova Mailing List Archives


Re: XSLT how do I get info from external html web page

From: Frank@-----------.---------.---
To: NULL
Date: 6/6/2005 11:46:00 AM
Thanks!  That explains why I couldn't get the document() function to work.  
Can I use VBScript while stepping through the XSLT file?  And, if so, could 
VBScript do the job?

"Neil Smith [MVP Digital Media]" wrote:

> If your "plain text file" contained XML you could use the document()
> function as described here:
> 
> http://www-128.ibm.com/developerworks/xml/library/x-tipcombxslt/
> http://www.xml.com/pub/a/2002/03/06/xslt.html
> 
> But, since your plain text is not XML (it has no closing tags for CIK
> and NAME for example) you probably won't be able to extract the CIK
> node content using XSLT.
> 
> It might work better, though, if you were able to preprocess your text
> file into actual XML.
> 
> You also *cannot* go to the web page and 'extract the title' using XML:
> http://www.sec.gov/cgi-bin/browse-edgar?action=getcompany&CIK=0000925645
> 
> It's not XML, it's HTML4, and it can't be read by an XML parser. So you
> need some mechanism (such as running HTMLTidy on your web
> server) to convert the HTML to real XML, or you need some other
> scripting language such as ASP or PHP to extract the title
> element from the HTML. PHP can execute fopen(url) on that web page;
> you can then use a regular expression to extract the title.
> 
> Cheers - Neil
> 
> 
> On Mon, 6 Jun 2005 08:07:17 -0700, "Frank"
> <Frank@d...> wrote:
> 
> >I am stepping through an xslt stylesheet and transforming an xml file into 
> >html output.  At some point in the xslt file I need to go to a plain text 
> >file which is in the same location as the xslt file (C:\data\) and retrieve 
> >the CIK.  The text file looks like this:
> >
> ><SUBJECT-COMPANY> 
> >      <CIK> 0000925645
> >      <NAME> Big Corporation
> ></SUBJECT-COMPANY>
> >
> >and then I need to go to the webpage below and grab the Title value and plug 
> >that into my html output.
> >
> >http://www.sec.gov/cgi-bin/browse-edgar?action=getcompany&CIK=0000925645
> >
> >Any ideas?
> 
> 
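
For anyone finding this thread later, here is a minimal sketch of the two steps Neil describes: preprocessing the text file into real XML so the CIK can be read, and pulling the title out of HTML with a lenient HTML parser instead of an XML parser. It is written in Python rather than VBScript or PHP, which is an assumption on my part since the thread names neither. The function names (close_leaf_tags, read_cik, extract_title) and SAMPLE_HTML are made up for illustration; SAMPLE_HTML is not the real content of the SEC page. The sketch also assumes each unclosed tag sits on its own line as "<TAG> value", as in Frank's sample.

```python
import re
from html.parser import HTMLParser
from xml.etree import ElementTree as ET
from xml.sax.saxutils import escape

# The text file contents exactly as quoted in the original question.
SAMPLE_TEXT = """<SUBJECT-COMPANY>
      <CIK> 0000925645
      <NAME> Big Corporation
</SUBJECT-COMPANY>
"""

# Hypothetical stand-in for a fetched page; NOT the real SEC response.
SAMPLE_HTML = ("<HTML><HEAD><TITLE>Example Company Page</TITLE></HEAD>"
               "<BODY><P>An unclosed paragraph<BR></BODY></HTML>")

# Assumption: a leaf line is "<TAG> value" with no closing tag on the line.
LEAF_LINE = re.compile(r"^(\s*)<([A-Za-z][\w.-]*)>[ \t]*(\S.*?)\s*$")


def close_leaf_tags(text):
    """Add the missing closing tag to every '<TAG> value' line."""
    fixed = []
    for line in text.splitlines():
        match = LEAF_LINE.match(line)
        if match:
            indent, tag, value = match.groups()
            fixed.append(f"{indent}<{tag}>{escape(value)}</{tag}>")
        else:
            fixed.append(line)  # container open/close tags pass through
    return "\n".join(fixed)


def read_cik(text):
    """Parse the repaired text as XML and return the CIK value."""
    root = ET.fromstring(close_leaf_tags(text))
    return root.findtext("CIK")


class TitleParser(HTMLParser):
    """Collects the text inside <title>; tolerates non-XML HTML."""

    def __init__(self):
        super().__init__()
        self._in_title = False
        self.title_parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "title":  # HTMLParser lowercases tag names
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title_parts.append(data)


def extract_title(html):
    """Return the page title with runs of whitespace collapsed."""
    parser = TitleParser()
    parser.feed(html)
    parser.close()
    return " ".join("".join(parser.title_parts).split())


cik = read_cik(SAMPLE_TEXT)
# URL pattern taken from the question; this sketch does not fetch it.
url = f"http://www.sec.gov/cgi-bin/browse-edgar?action=getcompany&CIK={cik}"
print(cik)
print(url)
print(extract_title(SAMPLE_HTML))
```

In a real run the HTML would come from fetching the URL (for example with urllib.request.urlopen) rather than from SAMPLE_HTML, and the resulting title would then be passed into the stylesheet as a parameter or merged into the XML input before the transformation.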



