To: Dov Isaacs <isaacs@xxxxxxxxx>
Subject: Re: MS Word and XML
From: "Jeremy H. Griffith" <jeremy@xxxxxxxxx>
Date: Wed, 30 Apr 2003 10:09:43 -0700
Cc: framers@xxxxxxxxxxxxxx, framers@xxxxxxxxx
In-Reply-To: <5.2.1.1.2.20030428103409.03cdfb30@mailsj.corp.adobe.com>
Organization: Omni Systems, Inc.
References: <003001c30da1$fc161ff0$5800a8c0@plutonium> <Pine.OSF.4.44.0304280818190.28716-100000@mail.internorth.com> <LISTMANAGER-25396-5885-2003.04.28-11.23.40--isaacs#adobe.com@lists.FrameUsers.com> <5.2.1.1.2.20030428103409.03cdfb30@mailsj.corp.adobe.com>
Sender: owner-framers@xxxxxxxxx

On Mon, 28 Apr 2003 11:15:12 -0700, Dov Isaacs <isaacs@Adobe.COM> wrote:

>Even RTF changed although in an upward-compatible manner.

Not entirely upward-compatible.  For example, in earlier versions
of Word, graphics sizes were expressed in twips (1440/inch).  But
starting with Word 8/97, the units were silently changed to 0.01mm
(2540/inch), MS "himetric".  There was *no* way in RTF to indicate
which unit size was being used, so Word 8 misinterprets Word 7
RTF and makes the graphics look considerably smaller.  There are more
differences; we find them regularly while improving our Word RTF
export filter, a never-ending task...  ;-)

>I would disagree that there is ANYTHING straightforward about writing
>an import filter for Microsoft Word-format documents, whether binary
>or RTF. Not only is there the problem of physically parsing these
>formats, but there is the larger problem of INTERPRETATION of the
>formatting data therein.

Amen.  The MS RTF specs are a travesty of technical documentation.
Aside from the outright errors and typos (many), they totally lack
examples of usage.  To understand what a given element *does*, you
need to examine Word RTF files and use inductive logic.  Heavily.

And some MS formats that are used throughout Windows apps, like the
Structured Storage (OLE object) format, are not documented at all.
Deliberately so, in the case of OLE, so that you are forced to use
MS-licensed libraries to read or write them.  These libraries, oddly
<g>, are available only on certain platforms... excluding, for
example, UNIX.  We had to reverse-engineer that one ourselves.  It
took weeks of work, just so that we could extract the preview WMF
from embedded OLE graphics in FrameMaker.

>If Microsoft has trouble interpreting all the versions of their
>documents, what do you we and others have?!?

A hard row to hoe...  ;-)

Nonetheless, we just might take a shot at it sometime.
<bg>  We have some interesting ideas about how a filter to import
Word into Frame should work, rather different from the design
principles used in the present native filters, and if we can get
past the shuddering that we experience after opening a current
Word RTF file in a text editor, we may embody those ideas in an
import filter.  In our lifetime.  ;-)

-- Jeremy H. Griffith, at Omni Systems Inc.
  (jeremy@omsys.com)  http://www.omsys.com/
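
To put a number on the twips-versus-himetric mismatch described in the
message, here is a minimal Python sketch.  The function and constant
names are hypothetical and are not part of any RTF library; the only
facts used are the two unit sizes quoted above (1440 twips per inch,
2540 himetric units per inch).  It shows the conversion a reader would
need, and how small a graphic appears when a twips value is read as
himetric.

```python
# Unit sizes quoted in the message; all names here are illustrative only.
TWIPS_PER_INCH = 1440      # graphics units in RTF from earlier Word versions
HIMETRIC_PER_INCH = 2540   # units used from Word 8/97 on (0.01 mm each)


def twips_to_himetric(value):
    """Convert a size in twips to the same physical size in himetric units."""
    return value * HIMETRIC_PER_INCH / TWIPS_PER_INCH


def apparent_inches_if_misread(twips_value):
    """Physical size, in inches, when a twips value is wrongly read as himetric."""
    return twips_value / HIMETRIC_PER_INCH


# A graphic 2 inches wide, as an older Word would have written it.
width_twips = 2 * TWIPS_PER_INCH

print(twips_to_himetric(width_twips))           # correct value for a himetric reader
print(apparent_inches_if_misread(width_twips))  # width if the value is left unconverted
```

Under these assumptions an unconverted value comes out at 1440/2540 of
its intended size, a little over half, which matches the "considerably
smaller" graphics described above.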