comp.text.xml Archive

Re: Word 2007 XML merge & PDF conversion on Unix
To: NULL
Date: 8/4/2007 10:04:00 PM

[Jongware] wrote:
> "Praveen Mohanan" <nospam@n...> wrote in message
> news:_R3ti.1360$qa3.1073@n......
>> I can convert all the Word documents to Word 2007 xml & store on unix
>> platform.
>>
>> The Q I have is If I use xml/xslt to merge the data with the Word XML
>> document & then store it back as an xml on unix ,how do I convert into PDF?
>
> 1. You don't "convert" something to PDF. Ever. Please repeat for yourself.
> PDF is printer output, just as the paper from your printer. Have you ever
> converted something to paper?
> So, if PDF is printer output, your question becomes: "how do I print an xml on
> unix to pdf?"
>
> 2. XML is an abstract data format. If you print XML, you'll get lots and lots of
> <this>stuff</this>. "Hey, now I *know* you're wrong! My Word file can be
> converted to XML!" Not so. Your Word file, saved as XML, is not different from
> the Word .DOC file (well, it shouldn't be). It is saved in another output
> format, yes, but you can't print your .DOC file to a printer either. (No you
> can't. You need Word to read the byte codes and interpret them for you.)
>
> 3. The only way you can print your XSLT'ed file in the format you expect (a
> nice-looking text document, not line after line of <..>'s) is if you ensured
> your output XML format is still readable by Word. Then you can use Word to print
> to PDF.

The first two are right on target, but the third is not the only answer.
If the merged data+text are now in an XML format, you can use XSL[T] to
transform to PDF by one of two methods:

  XSL:FO --> FO --> PDF using any FO processor
  XSLT --> LaTeX --> PDF using LaTeX

Both work fine: LaTeX has better typography, but you have to learn it and
it's not written in Java (some process pipelines demand end-to-end Java).
Using XSL:FO you have to reinvent the wheel every time, and the only free
processor (fop) is incomplete.

Either way it's going to be tedious, because Word does not identify the
important parts of your document in a form that a computer can recognise,
only its appearance in a form that human eyes and brain can understand,
unless your authors have used specifically designed styles in a template.
If you allowed your authors to put anything anywhere, in any format they
wanted, you will now have to cope with the result, which can be painful.

///Peter
--
XML FAQ: http://xml.silmaril.ie/
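As a rough illustration of the second route (XML --> LaTeX --> PDF), here is a minimal sketch of the transformation step only. It is not XSLT: Python's standard library has no XSLT processor, so ElementTree stands in for the stylesheet. The element names (letter, title, para) and the function names are hypothetical, chosen for the example; real Word 2007 XML is far more deeply nested and would need a correspondingly larger mapping.

```python
import xml.etree.ElementTree as ET

# Characters LaTeX treats specially, mapped to safe replacements.
LATEX_ESCAPES = {
    "\\": r"\textbackslash{}",
    "&": r"\&",
    "%": r"\%",
    "$": r"\$",
    "#": r"\#",
    "_": r"\_",
    "{": r"\{",
    "}": r"\}",
    "~": r"\textasciitilde{}",
    "^": r"\textasciicircum{}",
}


def latex_escape(text):
    """Escape LaTeX special characters one character at a time,
    so that already-escaped output is never escaped twice."""
    return "".join(LATEX_ESCAPES.get(ch, ch) for ch in text)


def xml_to_latex(xml_source):
    """Turn a (hypothetical) <letter> document into LaTeX source."""
    root = ET.fromstring(xml_source)
    lines = [r"\documentclass{article}", r"\begin{document}"]
    title = root.findtext("title")
    if title:
        lines.append(r"\section*{" + latex_escape(title) + "}")
    for para in root.findall("para"):
        lines.append(latex_escape(para.text or ""))
        lines.append("")  # a blank line ends a LaTeX paragraph
    lines.append(r"\end{document}")
    return "\n".join(lines)


# Hypothetical merged data+text, standing in for the real document.
SAMPLE = """<letter>
  <title>Invoice for Smith &amp; Co</title>
  <para>Amount due: $100 (50% deposit paid).</para>
</letter>"""

if __name__ == "__main__":
    print(xml_to_latex(SAMPLE))
```

Assuming a TeX distribution is installed on the Unix box, the output could be saved to a file such as letter.tex and run through pdflatex to produce the PDF. The escaping step matters because merged data routinely contains characters like & and % that LaTeX would otherwise treat as markup.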
