Altova Mailing List Archives


RE: [xsl] xslt, xml, and databases - howto combine?

From: Graham Seaman <graham@--------->
To:
Date: 4/9/2002 9:46:00 AM
On Tue, 9 Apr 2002, Michael Kay wrote:

> You are struggling hard to find an alternative to using an XML database.
> 
> Why not give up the struggle, and use an XML database? After all, this is
> why they were invented!

Because I'm a cheapskate (or, alternatively, just have no money to spend
on this), I'm trying to use only free software, and the only open-source
native-xml database I know of is unfinished and currently just a wrapper
around a relational db anyway. It also only has a Java interface, whereas
I'm using perl. Maybe not good reasons, but I'm stuck with them :-(

But I guess I should take your reply as letting me know I'm trying to do
something unreasonable...

Graham


> 
> Michael Kay
> Software AG
> home: Michael.H.Kay@xxxxxxxxxxxx
> work: Michael.Kay@xxxxxxxxxxxxxx
> 
> > -----Original Message-----
> > From: owner-xsl-list@xxxxxxxxxxxxxxxxxxxxxx
> > [mailto:owner-xsl-list@xxxxxxxxxxxxxxxxxxxxxx]On Behalf Of
> > Graham Seaman
> > Sent: 09 April 2002 15:58
> > To: xsl-list@xxxxxxxxxxxxxxxxxxxxxx
> > Subject: [xsl] xslt, xml, and databases - howto combine?
> >
> >
> > Hi all,
> >
> > This question may be a little off-topic; if anyone thinks it belongs
> > elsewhere, please let me know of a more suitable list!
> >
> > I have a web site based on a number of relatively small xml documents
> > (currently stored as flat files). The documents contain a mix of
> > simple elements (strings, integers etc.) and elements with longer
> > stretches of text. Each HTML page is typically generated using xsl
> > from a number of these files. Given the filenames for the xml files,
> > the xsl processing is reasonably fast (it will never be a very high
> > traffic site).
> >
> > The problem comes with identifying the filenames in the first place.
> > Currently, minimal information (id, filename, date) on each file is
> > held in a database, and a database lookup is used to select a set of
> > files; the filenames are passed to the xsl as a csv string. I only
> > have access to a standard relational db, not an xml-based one.
> >
> > To do more complicated queries to select the set of files to process
> > for a page by anything more than date, I could:
> >
> > a) Use xsl to search through each file using document() (a rough
> > sketch of what I mean follows the list below). I believe this would
> > be impossibly slow (the number of files is currently only around
> > 1,000, but it will grow).
> >
> > b) Duplicate the simpler information from the xml files in the
> > database and search on this to identify files, combined with a search
> > engine indexer to index the text in the files. I would then have
> > duplicated data to be kept coherent through each update, which makes
> > the site rather fragile.
> >
> > c) Move the whole of the flat files into the database and generate
> > the xml from them on demand. All searches can then be directly on the
> > database, and there's no duplication of data. But there's an extra
> > processing stage needed to generate the xml for the xsl to process.
> >
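
To make (a) concrete, this is the sort of thing I had in mind - just a
rough, untested sketch, with filelist.xml, the <file href="..."/> index
format, the wanted-email parameter and the item/admin/poster/email path
all invented for illustration:

<?xml version="1.0"?>
<!-- Untested sketch for option (a): open every file listed in a small
     index document with document() and report those matching a simple
     condition. filelist.xml, <file href="..."/> and the
     item/admin/poster/email path are just made up for illustration. -->
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>
  <!-- value to search for; an invented example parameter -->
  <xsl:param name="wanted-email" select="'poster@example.org'"/>

  <!-- run against filelist.xml: <files><file href="item0001.xml"/>...</files> -->
  <xsl:template match="/files">
    <matches>
      <xsl:for-each select="file">
        <!-- document() loads each file in turn; this is the step that
             gets slower as the number of files grows -->
        <xsl:variable name="doc" select="document(@href)"/>
        <xsl:if test="$doc/item/admin/poster/email = $wanted-email">
          <match href="{@href}"/>
        </xsl:if>
      </xsl:for-each>
    </matches>
  </xsl:template>
</xsl:stylesheet>
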
> > c) sounds the most robust, but both b) and c) run into a problem: I'm
> > trying to keep the background site code (perl) and the xsl relatively
> > independent of the particular xml format; for example, generating
> > forms by using xsl to process an xsd schema, so that schema changes
> > don't require the site code to be rewritten. But if I force my xml
> > into a relational database, I lose this independence.
> >
> > One way I can see to keep it is to assume a one-to-one correspondence
> > between field names and XPath expressions (e.g. fields with names
> > like item/admin/poster/email), so that I can regenerate xml from the
> > database without one-off rules which break if the schema changes. But
> > this means I can only ever have one leaf node identified by a
> > particular XPath expression, which seems very limiting.
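
To show what I mean by regenerating xml purely from path-like field
names, here is a rough, untested sketch; the flat <record>/<field
path="..."> input format is invented, and each field still comes out as
its own nested chain, so the chains would need merging afterwards:

<?xml version="1.0"?>
<!-- Untested sketch: the input is assumed to be a flat dump such as
     <record><field path="item/admin/poster/email">g@example.org</field></record>,
     with one <field> per db column whose name is an XPath-like path.
     Each field is rebuilt into nested elements from its path; the
     resulting chains would still need merging into one tree. -->
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>

  <xsl:template match="/record">
    <regenerated>
      <xsl:apply-templates select="field"/>
    </regenerated>
  </xsl:template>

  <xsl:template match="field">
    <xsl:call-template name="nest">
      <xsl:with-param name="path" select="@path"/>
      <xsl:with-param name="value" select="string(.)"/>
    </xsl:call-template>
  </xsl:template>

  <!-- recursively wrap the value in one element per path step -->
  <xsl:template name="nest">
    <xsl:param name="path"/>
    <xsl:param name="value"/>
    <xsl:choose>
      <xsl:when test="contains($path, '/')">
        <xsl:element name="{substring-before($path, '/')}">
          <xsl:call-template name="nest">
            <xsl:with-param name="path" select="substring-after($path, '/')"/>
            <xsl:with-param name="value" select="$value"/>
          </xsl:call-template>
        </xsl:element>
      </xsl:when>
      <xsl:otherwise>
        <xsl:element name="{$path}">
          <xsl:value-of select="$value"/>
        </xsl:element>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>
</xsl:stylesheet>
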
> > Another possibility might be to generate xml from the database in
> > multiple stages: first a quite flat xml document which directly
> > reflects the db field names, which is then processed by xsl via a
> > rule-set which converts the first-stage xml to xml as defined in the
> > schema. Yet another layer of processing...
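
Just to illustrate that idea, a rough, untested sketch of such a
second-stage rule-set; the flat element names (rows, row, poster_name,
poster_email, posted_date) are invented stand-ins for whatever the real
columns are called:

<?xml version="1.0"?>
<!-- Untested sketch of the second stage: the first stage is assumed to
     emit one flat <row> per record with elements named after db columns
     (poster_name, poster_email, posted_date are invented names); this
     rule-set rebuilds the nested shape the schema expects. -->
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>

  <xsl:template match="/rows">
    <items>
      <xsl:apply-templates select="row"/>
    </items>
  </xsl:template>

  <!-- one rule per record type; only this file changes if the schema does -->
  <xsl:template match="row">
    <item>
      <admin>
        <poster>
          <name><xsl:value-of select="poster_name"/></name>
          <email><xsl:value-of select="poster_email"/></email>
        </poster>
        <date><xsl:value-of select="posted_date"/></date>
      </admin>
    </item>
  </xsl:template>
</xsl:stylesheet>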
> >
> > Am I missing other ways round this? Is there some standard way to deal
> > with this problem? Or are pure xml databases the only way to avoid
> > kludges like this?
> >
> > Thanks for any advice
> > Graham
> >
> >
> >
> 
> 
> 
> 


 XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list
