Re: newbie: Word2003 -> XML -> SQL Server

From: Kent Tegels <ktegels@-------.--->
Date: 3/18/2006 1:12:00 PM
Hello jockster,

> Apologise for the simplistic newbie question.
> I need to get transform many (100's) of word documents into XML for
> import into SQL Server 2000 (or 2005) in an easy and efficient manner.
> Objective is as follows,
> 1. Sections/paragraphs of the individual word documents can be queried
> and then published from SQL Server.

Your best choice here, then, is to save the Word docs as XML [0] (Office 
2003 can do that, and you should be able to write a little VSTO application 
that does it over a folder full of documents as needed).

> 2. Should be able to (easily if possible) tag sections/paragraphs of
> word documents (from within Word) prior to conversion to XML.Tagging
> information should then be able to be imported and used to facilitate
> search for documents in SQL Server

Not sure what you're really aiming at here. The OfficeXML format in Office 
2003 (O11) writes paragraphs nicely, and you might be able to mark sections 
and so on using existing features in Word itself. You should then be able to 
use XQuery over that in SQL Server 2005 (SQL Server 2000 has no xml data type 
or XQuery support, so 2005 is the one you want for this).
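
If it helps to see what "writes paragraphs nicely" means before writing any XQuery, here is a rough sketch in Python that pulls the paragraph text out of a Word 2003 XML document. The sample document is made up and heavily trimmed, and paragraph_texts is just a name I picked; the w: namespace URI is the Word 2003 one. The same w:p and w:t elements are what a path such as //w:p would target in XQuery once that namespace is declared.

```python
import xml.etree.ElementTree as ET

# Namespace used by the Word 2003 XML format (WordprocessingML 2003).
W_NS = {"w": "http://schemas.microsoft.com/office/word/2003/wordml"}

# Hypothetical, heavily trimmed sample of a document saved as XML from Word 2003.
SAMPLE = """\
<w:wordDocument xmlns:w="http://schemas.microsoft.com/office/word/2003/wordml">
  <w:body>
    <w:p><w:r><w:t>First paragraph.</w:t></w:r></w:p>
    <w:p><w:r><w:t>Second </w:t></w:r><w:r><w:t>paragraph.</w:t></w:r></w:p>
  </w:body>
</w:wordDocument>"""


def paragraph_texts(xml_text):
    """Return the plain text of every w:p paragraph, in document order."""
    root = ET.fromstring(xml_text)
    texts = []
    for para in root.iterfind(".//w:p", W_NS):
        # A paragraph's text is split across runs (w:r), each holding w:t nodes.
        runs = para.iterfind(".//w:t", W_NS)
        texts.append("".join(t.text or "" for t in runs))
    return texts


if __name__ == "__main__":
    for number, text in enumerate(paragraph_texts(SAMPLE), start=1):
        print(number, text)
```

Real documents carry a lot more (properties, styles, section wrappers), which is why the sketch searches with .// rather than assuming paragraphs sit directly under w:body.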

> 3. If it was possible to create a template within Word to better
> enable the above then that would be good for when new documents are
> produced by original authors

That should be completely doable, and you can then make the document itself 
more intelligent by using VSTO as well. More on VSTO at [1].

> If people could suggest methods, tools and or books/articles
> explaining how the above can be acheived I would be very grateful

Aside from the VSTO link, there's an article or two on my blog about loading 
XML documents into SQL Server and querying them. The basic trick is to use some 

insert into <your-table> select * from openrowset(bulk 'path-to-file', single_blob) as p

to load

and then use something like:

select doc.query('...') from <your-table> where doc.exist('...') = 1

to find whatever it is you're looking for. It's pretty easy to do against the 
document properties. Content queries are helped a lot by combining Full-Text 
Search and XQuery.
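
Since you have hundreds of files, one thing to know is that openrowset(bulk ...) wants the file path as a string literal, so the usual approach is to generate one load statement per file. Below is a minimal sketch of that in Python. The table dbo.WordDocs, its xml column doc, and the function names are assumptions of mine for illustration; BulkColumn is the column that single_blob returns.

```python
from pathlib import Path


def load_statement(xml_path, table="dbo.WordDocs", column="doc"):
    """Build one T-SQL statement that bulk-loads a single XML file.

    The table and column names are hypothetical defaults; adjust to your schema.
    """
    # The path must appear as a string literal, so double any single quotes.
    literal = str(xml_path).replace("'", "''")
    return (
        f"insert into {table} ({column}) "
        f"select cast(p.BulkColumn as xml) "
        f"from openrowset(bulk '{literal}', single_blob) as p;"
    )


def load_script(folder, pattern="*.xml"):
    """Return one load statement per matching file in the folder, sorted by name."""
    files = sorted(Path(folder).glob(pattern))
    return "\n".join(load_statement(path) for path in files)


if __name__ == "__main__":
    # Prints the script for the current directory; run the output against the server.
    print(load_script("."))
```

Keep in mind that the path is opened by the SQL Server service, not by your client, so the files have to be somewhere the server can read.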


Thank you,
Kent Tegels


