
Re: Conversion of Word documents to structured frame documents




Dan Emory wrote:

> FM+SGML V5.5.6's XML import/export capabilities are a side issue. But I
> contend that FM+SGML is not an XML-aware editor, and thus cannot
> create an XML-conforming document instance that can be exported as XML in
> the manner used to export SGML. Instead, the FM+SGML V5.5.6 XML export
> capability uses the same methodology employed to export HTML from an
> unstructured doc, namely, paragraph mapping.

No, it doesn't - I've looked into it further. For structured documents, the XML
export maps the SGML elements to XML elements using the same name by default. You
can change the name using read/write rules as you do in an SGML application.
The following quote is from the XML export documentation, which is found in
Appendix H of the Developer's Guide:

"For exporting XML from structured documents (from a FrameMaker+SGML file that uses
structure), the process is the same as that used for export to SGML: a mapping from
elements in the source FrameMaker+SGML file to elements in the output XML file may
be specified in a Write Rules file. If this mapping is not specified then the export
function will use a default one-to-one mapping. See Chapter 10, “Introduction to
Translating between SGML and FrameMaker+SGML,” and Chapter 21, “Read/Write Rules
Reference,” of this manual for instructions on setting up the mappings. This
appendix covers issues specific to XML export."

Based on this, I would say that FrameMaker+SGML is very much an XML editor, though
perhaps calling it a structured editor would suffice. As a side issue, I believe
that Adobe should and will change the name sometime soon - for one thing, if you're
as poor a typist as I am, it's too damn many letters...
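
As a rough illustration of the default one-to-one mapping the quoted passage
describes, here is a minimal Python sketch. It is not how FrameMaker+SGML
implements export; the function name export_name, the rules dictionary, and the
element names are all hypothetical. The only point is that an element keeps its
own name unless a rule renames it.

```python
def export_name(fm_element, write_rules=None):
    """Return the output element name for a structured-document element.

    write_rules is a hypothetical stand-in for a write rules file:
    a dict of {source element name: output element name}.
    """
    rules = write_rules or {}
    # With no matching rule, fall back to the one-to-one default.
    return rules.get(fm_element, fm_element)


# Hypothetical rename: only "Paragraph" is given a different output name.
rules = {"Paragraph": "para"}
names = [export_name(tag, rules) for tag in ["Section", "Head", "Paragraph"]]
print(names)
```

Elements without a rule pass through under their own names, which is the
behaviour the documentation calls the default mapping.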

> SEMA's RTF-DOC DTD and rtf2rdc filter can do that.
> SEMA also has an rdc2rtf filter that converts RTF-DOC-conforming structured
> docs back to RTF. I then waxed lyrically that such a round-trip capability
> could solve a problem that's constantly coming up in postings on the two
> Framers lists, namely the unreliability of document conversions between
> FrameMaker and Word.

The unreliability of document conversions between FrameMaker and Word is not the
same as round tripping. Yes, people do complain about not being able to go in one
direction or the other, but I have rarely seen postings from people who seriously
want to round trip.

> As you can see, the DTD/EDD is quite simple, and is capable of being used to
> produce either SGML or XML document instances. All of the original RTF
> formatting information is preserved in attributes and EMPTY elements. The
> rtf2rdc filter recognizes what type of document object each RTF statement is
> describing, wraps the document object contents (if any) in the corresponding
> RTF-DOC element, and converts the RTF formatting information for that object
> to element attribute values.

As you pointed out, "The subject of this thread, originated by
wendy_ling@uk.ibm.com, was whether there was a way to convert Word docs to
structured FM+SGML docs". The fact that the documents conform to a DTD when they
come into FrameMaker+SGML doesn't mean that they're usefully structured. All
recursion information (except possibly very high level things like sections)
disappears when you convert back to RTF. In keeping with the lowest common
denominator theory, that means not adding any structure in FrameMaker+SGML, as it
will be blasted anyway. I don't consider these to be structured documents.
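
The loss of recursion can be shown with a small Python sketch. It assumes a
made-up, well-formed instance with hypothetical element names (doc, section, p)
and stands in for any conversion to a flat paragraph stream; it is not the SEMA
filter. Two differently nested instances flatten to the same sequence, so
nothing downstream can tell them apart.

```python
import xml.etree.ElementTree as ET


def flatten(xml_text):
    """Return the text of every <p> in document order, dropping all nesting."""
    root = ET.fromstring(xml_text)
    return [p.text for p in root.iter("p")]


# Hypothetical instances: the first nests a section inside a section,
# the second keeps both paragraphs in a single section.
nested = "<doc><section><p>A</p><section><p>B</p></section></section></doc>"
single = "<doc><section><p>A</p><p>B</p></section></doc>"

same = flatten(nested) == flatten(single)
print(same)
```

Because both instances produce the same flat output, the nesting cannot be
rebuilt on the way back in without extra information.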

> If a document instance were originated in FM+SGML using the RTF-DOC EDD, it
> ought to be possible to use the FDK to develop an API client that would, on
> export to SGML, insert all (or most) of the format-rule-specified formatting
> properties into the applicable attributes of each instance of each element,
> so that the formatting specified in the EDD would be preserved in the
> exported document instance. Then, using the SEMA rdc2rtf filter, the
> exported instance could be converted to RTF so it can be opened as a
> faithfully reproduced, error-free, unstructured document in Word,
> FrameMaker, or any other DTP that imports RTF.

I have written many filters to go from SGML to RTF, so my approach would be
different. I would save the SGML out of FrameMaker+SGML and write a conversion that
dealt with converting SGML conforming to a specific DTD to RTF. This typically only
takes a couple of days, unless your DTD is huge. Now I have the SGML to RTF side
covered. Can I get the RTF to SGML? No, because my SGML structure is more
complicated than RTF is capable of representing. Your approach seems to involve
dumbing your structure down to something that matches RTF - if the customer is
satisfied with such a structure, then you do indeed have a solution. I don't, however,
see this very narrow band of users as being the salvation of the product.
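
For a sense of what a DTD-specific conversion to RTF involves, here is a
minimal Python sketch, not one of the filters mentioned above. It assumes a
well-formed (XML-style) instance, ASCII text, and a hypothetical DTD with title
and p elements; the STYLES table and function names are invented for the
example. Each known element becomes one RTF paragraph, and the section nesting
is simply dropped.

```python
import xml.etree.ElementTree as ET

# Hypothetical per-element formatting: RTF control words for each element.
STYLES = {
    "title": r"\b\fs32 ",  # bold, 16 pt (\fs is in half-points)
    "p": r"\fs24 ",        # 12 pt
}


def rtf_escape(text):
    """Escape the three characters that are special in RTF (ASCII input assumed)."""
    return "".join("\\" + ch if ch in "\\{}" else ch for ch in text)


def to_rtf(xml_text):
    """Emit one RTF paragraph per styled element, in document order."""
    root = ET.fromstring(xml_text)
    paras = []
    for el in root.iter():
        if el.tag in STYLES:
            body = rtf_escape(el.text or "")
            paras.append("{" + STYLES[el.tag] + body + r"\par}")
    header = r"{\rtf1\ansi\deff0{\fonttbl{\f0 Times New Roman;}}"
    return header + "\n" + "\n".join(paras) + "\n}"


sample = (
    "<doc><title>Intro</title>"
    "<section><p>Use {braces} carefully</p></section></doc>"
)
rtf = to_rtf(sample)
print(rtf)
```

The section element leaves no trace in the output, which is why the reverse
direction cannot recover it.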

> 8. What I was proposing, in my original post to this thread, was that the
> RTF-DOC DTD, combined with the round-trip filters from SEMA, might offer a
> solution for publications groups confronted with the dilemma described in
> item 4 above. Although SGML/XML offers the ultimate solution for the
> electronic interchange of information, I was suggesting something much more
> limited than that. Namely: Replace FrameMaker with FM+SGML and the RTF-DOC
> DTD/EDD. This is not really a structured document solution. It's simply a
> solution that requires a structured document approach in order to carry out
> error-free round-trip conversions between FrameMaker and Word.

It would accomplish that, but I really don't believe that there's a large market for
it. If it really were a winner, the good people at Adobe would have spent some of
their fortunes on beefing up the RTF filters. After all, that cuts out the
uncertainty involved with introducing SEMA into the loop.


--
Regards,

Marcus Carr                      email:  mrc@allette.com.au
___________________________________________________________________
Allette Systems (Australia)      www:    http://www.allette.com.au
___________________________________________________________________
"Everything should be made as simple as possible, but not simpler."
       - Einstein


