Altova Mailing List Archives


Re: [xml-dev] Indexing solution for native XML database

From: "Ken North" <kennorth@---------.--->
To: <xml-dev@-----.---.--->
Date: 12/1/2005 10:17:00 PM
Michael Kay wrote:
> Relational databases are a very useful tool, there are some jobs they are
> very good at (mainly the kind of jobs that people used punched cards for 50
> years ago).

They do handle numbers and characters quite well, but you're describing the
strengths of vintage-1980s SQL technology. A lot has changed.

Many of the SQL platforms have evolved to a universal database model that
supports queries over rich types such as video, images, HTML, XML and so on.
Some use a plug-in model similar to how web browsers use plug-ins for Flash or
Adobe PDFs.

Using an SQL database for rich types has implications for query optimization.
Here's an example I've probably beaten to death.

You want to write a web app or service that uses tabular data, XML, maps and
geo-spatial data:

"My GPS coordinates are x, customer profile y and presentation format is
[1024x768|320x240|104x208].
Find the nearest location where I can buy a widget for less than 500 Euros. Give
me a top 10 list with directions, map and a consumer review."

One approach is to use specialized servers -- one for a native XML database, one
for maps, etc. -- each with its own data model, indexing technology and access
methods. Writing a query optimizer for that solution requires an understanding
of queries across distributed data stores that use different data models,
indexing and access methods.

That solution requires integration middleware that processes query results
before delivering a solution to the client. It will require plenty of network
roundtrips to get statistics, data and metadata.
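
To make the round-trip cost concrete, here is a rough sketch of what such
middleware might look like. Everything in it is hypothetical: RemoteStore is an
in-memory stand-in for a specialized server, the three stores and their
contents are invented, and each method call simply counts as one network round
trip. It is not any real product's API.

```python
from dataclasses import dataclass
from math import hypot


@dataclass
class RemoteStore:
    """Hypothetical stand-in for a specialized server.

    Every method call is counted as one network round trip.
    """
    name: str
    rows: dict
    round_trips: int = 0

    def keys(self):
        self.round_trips += 1
        return list(self.rows)

    def fetch(self, key):
        self.round_trips += 1
        return self.rows[key]


def nearest_cheap_widgets(catalog, geo, reviews, here, max_price, limit=10):
    # The middleware, not a query optimizer, hard-codes the order of
    # operations: filter on price first, then look up locations, then reviews.
    hits = []
    for store_id in catalog.keys():
        price = catalog.fetch(store_id)
        if price >= max_price:
            continue
        x, y = geo.fetch(store_id)
        distance = hypot(x - here[0], y - here[1])
        hits.append((distance, store_id, price))
    hits.sort()
    return [(sid, price, round(d, 2), reviews.fetch(sid))
            for d, sid, price in hits[:limit]]


# Invented sample data: price per store, store coordinates, review text.
catalog = RemoteStore("catalog", {"a": 450, "b": 520, "c": 300})
geo = RemoteStore("geo", {"a": (3, 4), "b": (1, 1), "c": (6, 8)})
reviews = RemoteStore("reviews", {"a": "good", "b": "ok", "c": "great"})

result = nearest_cheap_widgets(catalog, geo, reviews, here=(0, 0), max_price=500)
total_round_trips = catalog.round_trips + geo.round_trips + reviews.round_trips
print(result)
print("round trips:", total_round_trips)
```

Even with three stores and three candidate locations, the sketch makes eight
calls, and the join order lives in application code rather than in an
optimizer.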

Contrast that with having all of the data managed by a single DBMS (SQL/XML).
Because the query optimizer understands the disparate types and their associated
access methods and indexing techniques, we don't have to develop an optimizer
for a query against an XML database, an image database, geo-spatial data and so
on. We can rely on the query optimizer to determine in what order to retrieve
data instead of having to build that logic into our application or middleware.
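
As a rough illustration of leaving the join order to the engine, here is a
sketch using Python's built-in sqlite3 module. SQLite is only a stand-in here:
it is not an SQL/XML or spatial DBMS, and the schema, the sample rows and the
user-defined distance() function are all assumptions made up for the example.
The point is just that the application states what it wants in one query and
the engine picks the access plan.

```python
import sqlite3
from math import hypot

conn = sqlite3.connect(":memory:")

# Hypothetical user-defined function standing in for a geo-spatial extension.
conn.create_function(
    "distance", 4, lambda x1, y1, x2, y2: hypot(x1 - x2, y1 - y2)
)

# Invented schema and sample data for the widget example.
conn.executescript("""
    CREATE TABLE store  (id TEXT PRIMARY KEY, x REAL, y REAL);
    CREATE TABLE offer  (store_id TEXT REFERENCES store(id), price REAL);
    CREATE TABLE review (store_id TEXT REFERENCES store(id), body TEXT);
    CREATE INDEX offer_price ON offer(price);
    INSERT INTO store  VALUES ('a', 3, 4), ('b', 1, 1), ('c', 6, 8);
    INSERT INTO offer  VALUES ('a', 450), ('b', 520), ('c', 300);
    INSERT INTO review VALUES ('a', 'good'), ('b', 'ok'), ('c', 'great');
""")

# One declarative query; the engine decides the join order and index use.
rows = conn.execute(
    """
    SELECT s.id, o.price, distance(s.x, s.y, :x, :y) AS dist, r.body
    FROM store s
    JOIN offer  o ON o.store_id = s.id
    JOIN review r ON r.store_id = s.id
    WHERE o.price < :max_price
    ORDER BY dist
    LIMIT 10
    """,
    {"x": 0.0, "y": 0.0, "max_price": 500},
).fetchall()

for row in rows:
    print(row)
```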

To optimize the query and prepare its access plan, the query optimizer does disk
seeks or cache reads, not RPCs over TCP/IP connections. Retrieving the data when
executing the plan is also a matter of doing disk seeks or reading cache memory,
instead of round-tripping messages across a network wire.

The performance gains you expect from using specialized data stores are negated
if your applications have to process several types of data, each with its own
specialized server or engine.

