Re: Conversion of Word documents to structured frame documents



= From: wendy_ling@uk.ibm.com
= Message-ID: <80256746.0029072C.00@d06mta03.portsmouth.uk.ibm.com>
= Date: Thu, 1 Apr 1999 08:27:05 +0100
= Subject: Conversion of Word documents to structured frame documents
= 
= Does anyone have any experience or advice on converting Microsoft Word 6
= (or later) documents into a structured frame document format?
= 
= Wendy Ling
= Hursley Information Development
= Phone: 01962-815797
= E-mail: wendy_ling@uk.ibm.com

Wendy -

Absolutely. The company I work for, INFOCON, developed a solution for
exactly this sort of conversion for the U.S. Air Force. The AF authors
variously use Word 2, Word 5, Word 6, Word 97 and Word 98.

This process is based on a custom set of FDK (Frame Developer's Kit) API
client programs that apply FrameMaker-specific paragraph tags
to the Word document after it has been imported into FrameMaker (via
the standard included filters).  Tagging is done via keyword recognition
(e.g., "Chapter" or "Attachment") as well as a very strict
numbering scheme (1., 1.1., 1.1.1., etc.) mandated by the Air Force
policy directive governing publications.
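
To illustrate the tagging idea only (this is not the FDK client code), here
is a minimal Python sketch of choosing a paragraph tag from a keyword or
from the numbering depth. The tag names ("ChapterTitle", "Para1", "Body",
etc.) and the function name are hypothetical; the real names depend on the
template.

```python
import re

# Hypothetical keyword-to-tag table; real tag names come from the FM template.
KEYWORD_TAGS = {"chapter": "ChapterTitle", "attachment": "AttachmentTitle"}

# Strict numbering at the start of a paragraph: "1. ", "1.1. ", "1.1.1. ", ...
NUMBERING = re.compile(r"^(\d+(?:\.\d+)*)\.\s")


def guess_paragraph_tag(text: str) -> str:
    """Return an assumed paragraph tag for one imported Word paragraph."""
    stripped = text.strip()
    first_word = stripped.split(maxsplit=1)[0].lower() if stripped else ""
    if first_word in KEYWORD_TAGS:
        return KEYWORD_TAGS[first_word]
    match = NUMBERING.match(stripped)
    if match:
        # "1" is depth 1, "1.1" is depth 2, "1.1.1" is depth 3, and so on.
        depth = match.group(1).count(".") + 1
        return f"Para{depth}"
    return "Body"
```

Anything that matches neither a keyword nor the numbering scheme falls
through to a default tag, which is where manual cleanup would come in.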

The program then imports the FM-SGML template which includes:
 - EDD format and structure rules;
 - paragraph, character, table, and cross-reference formats (tags);
 - master page layouts and reference pages for TOC, INDEX, etc.;
 - ISO Entity definitions (both FM variables and special character
   format combinations).

The next step is to use a structure conversion table that maps from the
character and paragraph styles to SGML elements, and wraps sequences of
'simple' elements into more complex structures, finally arriving at a
fully structured document.
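
Conceptually, the conversion table does two things: it maps each tag to an
element, and it wraps runs of simple elements in a parent element. The
Python sketch below shows that idea only; it is not FrameMaker's conversion
table syntax, and the tag and element names are hypothetical.

```python
from itertools import groupby

# Hypothetical tag-to-element mapping; real names depend on the DTD and EDD.
TAG_TO_ELEMENT = {
    "ChapterTitle": "Title",
    "Para1": "Para",
    "Bullet": "Item",
    "Body": "Para",
}

# Hypothetical wrapping rule: a run of consecutive children gets one parent.
WRAP_RUNS = {"Item": "List"}


def structure(paragraphs):
    """Turn [(paragraph_tag, text), ...] into a list of (element, content)."""
    elements = [(TAG_TO_ELEMENT.get(tag, "Para"), text) for tag, text in paragraphs]
    result = []
    for name, run in groupby(elements, key=lambda element: element[0]):
        run = list(run)
        if name in WRAP_RUNS:
            # Wrap the whole run of simple elements in one parent element.
            result.append((WRAP_RUNS[name], run))
        else:
            result.extend(run)
    return result
```

For example, two consecutive "Bullet" paragraphs become one "List" element
containing two "Item" children.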

The final step is to impose master pages on the file so that:

 - wide figures (graphics) and tables are placed onto landscape pages;
 - empty pages (at the end of a file) are associated with a special
   "blank" master page (as they must display the "This Page Left
   Intentionally Blank" statement);
 - any necessary security classification appears in the header/footer
   of all pages of the file.
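
The first two rules above amount to a small decision per page. Here is a
Python sketch of that decision, assuming a simple page record and a 6.5 inch
portrait text column; the field names, master page names, and width
threshold are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Page:
    widest_object_pt: float  # widest figure or table on the page, 0 if none
    is_empty: bool           # True for a trailing page with no content


# Assumed portrait text-column width (6.5 in = 468 pt); not from the AF spec.
PORTRAIT_TEXT_WIDTH_PT = 468.0


def choose_master_page(page: Page) -> str:
    """Pick a (hypothetical) master page name for one body page."""
    if page.is_empty:
        return "Blank"       # carries the "Intentionally Blank" statement
    if page.widest_object_pt > PORTRAIT_TEXT_WIDTH_PT:
        return "Landscape"   # wide figures and tables
    return "Portrait"
```

The security classification is not part of this choice, since it has to
appear in the header/footer of every page whichever master page is used.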

Admittedly, the process is not perfect, but it does (as of the last review)
achieve about a 98 percent conversion from the flat Word document to a
fully conforming SGML file.

Cleanup is easy and quick in FM (MUCH faster than the Adept Editor which
was previously used); overall throughput of the publishing group has
increased approximately 500 percent since this solution was adopted:

 - before FM+SGML, the average was 17 documents processed per month
   (of varying lengths, from 3 to 600 pages);
 - after the FM+SGML process was introduced, the average is 95 documents
   per month (same size criteria: between 3 and 600 pages in length).
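
For anyone who wants to check the arithmetic, the two monthly averages above
give the ratio and percentage increase as follows (Python used only as a
calculator; the variable names are mine):

```python
before_per_month = 17  # documents per month before FM+SGML
after_per_month = 95   # documents per month after FM+SGML

# How many times the old monthly rate the new rate is.
ratio = after_per_month / before_per_month

# Increase relative to the old rate, as a percentage.
increase_percent = (after_per_month - before_per_month) / before_per_month * 100

print(f"{ratio:.1f}x the previous rate, a {increase_percent:.0f}% increase")
```

That works out to roughly 5.6 times the previous rate.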

Once validated, this file is then saved as an SGML instance, and also
output as a PDF document for online review and world-wide distribution.

If you wish, I can supply more information, or if this sounds like the
sort of solution you require, we can provide a quotation on customizing
these tools to match your requirements.

NOTE: this entire process is _very dependent_ on the SGML DTD and style
requirements, both of the MS Word file(s) and the resulting FM+SGML
publication, and therefore cannot simply be sold "as is" and expected
to provide acceptable performance, conversion, or conformity to anything
other than the precise DTD and styles for which it was programmed.

Also, note that if your Word documents are strictly styled, the overall
complexity of this can be reduced, as the first phase of this process
(to apply FM styles) would (probably) not be needed, and you could
just import the structured template and convert to an FM+SGML instance
with the appropriate conversion table file.  And you may not need the
master page imposition, either.

I hope this helps. Again, if you would like more information, please
contact me (via email at lsmalley@infocon.com).

- Lester
----------------------------------------------------------------------
 Lester C. Smalley                    | email:  lsmalley@infocon.com
 Manager, Computer Systems & Training | USMail: P. O. Box 310
 Information Consultants, Inc.        | Phone:  (302) 239-2942 ext-13
 Hockessin, DE 19707-0310             | FAX:    (302) 239-1712
--------------------------------------+-------------------------------
INFOCON is a Premium VAR for Adobe, Sun, and related hardware/software
 dedicated to providing integrated office solutions for productivity.
----------------------------------------------------------------------
                         http://www.infocon.com/
