Altova Mailing List Archives
RE: [xsl] xslt, xml, and databases - howto combine?
Date: 4/9/2002 9:46:00 AM
On Tue, 9 Apr 2002, Michael Kay wrote:

> You are struggling hard to find an alternative to using an XML database.
>
> Why not give up the struggle, and use an XML database? After all, this is
> why they were invented!

Because I'm a cheap-skate (or alternatively, just have no money to spend on
this), am trying to use only free software, and the only open source
native-xml database I know of is unfinished and currently just a wrapper for
a relational db anyway. And only has a Java interface, where I'm using perl.

Maybe not good reasons, but I'm stuck with them :-(

But I guess I should take your reply as letting me know I'm trying to do
something unreasonable..

Graham

>
> Michael Kay
> Software AG
> home: Michael.H.Kay@xxxxxxxxxxxx
> work: Michael.Kay@xxxxxxxxxxxxxx
>
> > -----Original Message-----
> > From: owner-xsl-list@xxxxxxxxxxxxxxxxxxxxxx
> > [mailto:owner-xsl-list@xxxxxxxxxxxxxxxxxxxxxx]On Behalf Of
> > Graham Seaman
> > Sent: 09 April 2002 15:58
> > To: xsl-list@xxxxxxxxxxxxxxxxxxxxxx
> > Subject: [xsl] xslt, xml, and databases - howto combine?
> >
> > Hi all,
> >
> > This question may be a little off-topic; if anyone thinks it belongs
> > elsewhere, please let me know of a more suitable list!
> >
> > I have a web site based on numbers of relatively small xml documents
> > (currently stored as flat files). The documents contain a mix of simple
> > elements (strings, integers etc) and elements with longer stretches of
> > text.
> > Each HTML page is typically generated using xsl from a number of these
> > files. Given the filenames for the xml files, the xsl processing is
> > reasonably fast (it will never be a very high traffic site).
> >
> > The problem comes with identifying the filenames in the first place.
> > Currently, minimal information (id, filename,date) on each file is held
> > in a database and a database lookup is used to select a set of files;
> > the filenames are passed to the xsl as a csv string. I only have access
> > to a standard relational db, not xml-based ones.
> >
> > To do more complicated queries to select the set of files to process
> > for a page by anything more than date, I could:
> >
> > a) Use xsl to search through each file using document(). I believe this
> > would be impossibly slow (number of files is currently only around
> > 1,000, but it will grow).
> >
> > b) Duplicate the simpler information from the xml files in the
> > database, and search on this to identify files, combined with a search
> > engine indexer to index the text in the files. I would then have
> > duplicated data to be kept coherent through each update; which makes
> > the site rather fragile.
> >
> > c) Move the whole of the flat files into the database and generate the
> > xml from them on demand. All searches can then be directly on the
> > database, and there's no duplication of data. But there's an extra
> > processing stage needed to generate the xml for the xsl to process.
> >
> > c) sounds the most robust, but both b) and c) run into a problem: I'm
> > trying to keep the background site code (perl) and the xsl relatively
> > independent of the particular xml format; for example, generating forms
> > by using xsl to process an xsd schema, so that schema changes don't
> > need the site code to be rewritten. But if I force my xml into a
> > relational database, I lose this independence.
> > One way I can see to keep it is to assume a one-to-one correspondence
> > between field names and XPath expressions (eg. fields with names like
> > item/admin/poster/email), so that I can regenerate xml from the
> > database without one-off rules which break if the schema changes. But
> > this means I can only ever have one leaf node identified by a
> > particular XPath expression, which seems very limiting.
> > Another possibility might be to generate xml from the database in
> > multiple stages: first a quite flat xml document which directly
> > reflects db field names, which is then processed by xsl via a rule-set
> > which converts the first stage xml to xml as defined in the schema. Yet
> > another layer of processing...
> >
> > Am I missing other ways round this? Is there some standard way to deal
> > with this problem? Or are pure xml databases the only way to avoid
> > kludges like this?
> >
> > Thanks for any advice
> > Graham

XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list
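
Editorial sketch: the idea in the quoted message of naming database fields after XPath-like paths (such as item/admin/poster/email) and regenerating the xml from them can be illustrated roughly as follows. This is only a minimal sketch in Python rather than the poster's perl, and it is not the poster's code: the function name `row_to_xml`, the sample row, and the assumption that every field name is a plain slash-separated path sharing one root element are all hypothetical.

```python
import xml.etree.ElementTree as ET


def row_to_xml(row):
    """Rebuild an XML tree from a mapping of path-like field names to values.

    Each key is a slash-separated path such as "item/admin/poster/email".
    All keys must share the same first step, which becomes the root element.
    """
    root = None
    for path, value in row.items():
        steps = path.split("/")
        if root is None:
            root = ET.Element(steps[0])
        elif root.tag != steps[0]:
            raise ValueError(f"{path!r} does not start at root {root.tag!r}")
        node = root
        for step in steps[1:]:
            child = node.find(step)  # reuse an existing child with this name
            if child is None:
                child = ET.SubElement(node, step)
            node = child
        node.text = "" if value is None else str(value)
    if root is None:
        raise ValueError("row has no fields")
    return root


# Hypothetical row, as it might come back from a SELECT on such a table.
row = {
    "item/admin/poster/email": "someone@example.org",
    "item/admin/date": "2002-04-09",
    "item/title": "Example item",
}
doc = row_to_xml(row)
print(ET.tostring(doc, encoding="unicode"))
```

Because each field name maps to exactly one column, a row can carry only one value per path. That is the same restriction the message describes as "only ever one leaf node identified by a particular XPath expression"; repeated elements would need a separate table or some extra convention.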