To: Dan Emory <danemory@xxxxxxxxxxxx>
Subject: Re: Conversion of Word documents to structured frame documents
From: Marcus Carr <mrc@xxxxxxxxxxxxxx>
Date: Thu, 08 Apr 1999 19:11:26 +1000
CC: Hedley_S_Finger@xxxxxxxxxxxxxxxxx, framers@xxxxxxxxx
Organization: Allette Systems (Australia)
References: <2.2.16.19990406031241.4717d49a@pop.primenet.com>
Sender: owner-framers@xxxxxxxxx
Dan Emory wrote:

> FM+SGML V5.5.6's XML import/export capabilities are a side issue. But I
> contend that FM+SGML is not an XML-aware editor, and thus cannot
> create an XML-conforming document instance that can be exported as XML in
> the manner used to export SGML. Instead, the FM+SGML V5.5.6 XML export
> capability uses the same methodology employed to export HTML from an
> unstructured doc, namely, paragraph mapping.

No, it doesn't - I've looked into it further. For structured documents, the XML export maps the SGML elements to XML elements using the same name by default. You can change the name using read/write rules, as you do in an SGML application. Following is a quote from the documentation for the XML export, which is found in Appendix H of the Developer's Guide:

"For exporting XML from structured documents (from a FrameMaker+SGML file that uses structure), the process is the same as that used for export to SGML: a mapping from elements in the source FrameMaker+SGML file to elements in the output XML file may be specified in a Write Rules file. If this mapping is not specified then the export function will use a default one-to-one mapping. See Chapter 10, “Introduction to Translating between SGML and FrameMaker+SGML,” and Chapter 21, “Read/Write Rules Reference,” of this manual for instructions on setting up the mappings. This appendix covers issues specific to XML export."

Based on this, I would say that FrameMaker+SGML is very much an XML editor, though perhaps calling it a structured editor would suffice.

As a side issue, I believe that Adobe should and will change the name sometime soon - for one thing, if you're as poor a typist as I am, it's too damn many letters...

> SEMA's RTF-DOC DTD and rtf2rdc filter can do that.
> SEMA also has an rdc2rtf filter that converts RTF-DOC-conforming structured
> docs back to RTF.
> I then waxed lyrically that such a round-trip capability
> could solve a problem that's constantly coming up in postings on the two
> Framers lists, namely the unreliability of document conversions between
> FrameMaker and Word.

The unreliability of document conversions between FrameMaker and Word is not the same as round tripping. Yes, people do complain about not being able to go in one direction or the other, but I have rarely seen postings from people who seriously want to round trip.

> As you can see, the DTD/EDD is quite simple, and is capable of being used to
> produce either SGML or XML document instances. All of the original RTF
> formatting information is preserved in attributes and EMPTY elements. The
> rtf2rdc filter recognizes what type of document object each RTF statement is
> describing, wraps the document object contents (if any) in the corresponding
> RTF-DOC element, and converts the RTF formatting information for that object
> to element attribute values.

As you pointed out, "The subject of this thread, originated by wendy_ling@uk.ibm.com, was whether there was a way to convert Word docs to structured FM+SGML docs". The fact that the documents conform to a DTD when they come into FrameMaker+SGML doesn't mean that they're usefully structured. All recursion information (except possibly very high level things like sections) disappears when you convert back to RTF. In keeping with the lowest common denominator theory, that means not adding any structure in FrameMaker+SGML, as it will be blasted anyway. I don't consider these to be structured documents.
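The loss of nesting is easy to demonstrate with a toy model. What follows is only a minimal sketch in Python, not SEMA's filters and not real RTF: the Element class and the to_flat_paragraphs and from_flat_paragraphs functions are hypothetical names invented for illustration, and the flat list of (style, text) pairs merely stands in for a paragraph-oriented format.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Element:
    """Hypothetical stand-in for one element in a structured document."""
    name: str
    text: str = ""
    children: List["Element"] = field(default_factory=list)


def depth(el: Element) -> int:
    """Number of nesting levels at and below this element."""
    return 1 + max((depth(c) for c in el.children), default=0)


def to_flat_paragraphs(el: Element) -> List[Tuple[str, str]]:
    """Flatten the tree into (style, text) pairs in document order.

    Assumption: the target format records a style name per paragraph
    but nothing about which element contained which.
    """
    out = []
    if el.text:
        out.append((el.name, el.text))
    for child in el.children:
        out.extend(to_flat_paragraphs(child))
    return out


def from_flat_paragraphs(paras: List[Tuple[str, str]]) -> Element:
    """Rebuild a tree from the flat form; one level is all that can be recovered."""
    return Element("doc", children=[Element(style, text) for style, text in paras])


# A list item that itself contains a list: the recursive case.
original = Element("doc", children=[
    Element("list", children=[
        Element("item", "Outer item", children=[
            Element("list", children=[Element("item", "Inner item")]),
        ]),
    ]),
])

flat = to_flat_paragraphs(original)
rebuilt = from_flat_paragraphs(flat)
print(depth(original), depth(rebuilt))
```

In this sketch the original tree is five levels deep and the rebuilt one is two: the inner item comes back as a sibling of the outer item, and nothing in the flat form says it was ever nested.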
> If a document instance were originated in FM+SGML using the RTF-DOC EDD, it
> ought to be possible to use the FDK to develop an API client that would, on
> export to SGML, insert all (or most) of the format-rule-specified formatting
> properties into the applicable attributes of each instance of each element,
> so that the formatting specified in the EDD would be preserved in the
> exported document instance. Then, using the SEMA rdc2rtf filter, the
> exported instance could be converted to RTF so it can be opened as a
> faithfully reproduced, error-free, unstructured document in Word,
> FrameMaker, or any other DTP that imports RTF.

I have written many filters to go from SGML to RTF, so my approach would be different. I would save the SGML out of FrameMaker+SGML and write a conversion that dealt with converting SGML conforming to a specific DTD to RTF. This typically only takes a couple of days, unless your DTD is huge. Now I have the SGML to RTF side covered. Can I get the RTF to SGML? No, because my SGML structure is more complicated than RTF is capable of representing.

Your approach seems to involve dumbing your structure down to something that matches RTF - if the customer is satisfied with such a structure, then you do indeed have a solution. I don't, however, see this very narrow band of users as being the salvation of the product.

> 8. What I was proposing, in my original post to this thread, was that the
> RTF-DOC DTD, combined with the round-trip filters from SEMA, might offer a
> solution for publications groups confronted with the dilemma described in
> item 4 above. Although SGML/XML offers the ultimate solution for the
> electronic interchange of information, I was suggesting something much more
> limited than that. Namely: Replace FrameMaker with FM+SGML and the RTF-DOC
> DTD/EDD. This is not really a structured document solution.
> It's simply a
> solution that requires a structured document approach in order to carry out
> error-free round-trip conversions between FrameMaker and Word.

It would accomplish that, but I really don't believe that there's a large market for it. If it really was a winner, the good people at Adobe would have spent some of their fortunes on beefing up the RTF filters. After all, that cuts out the uncertainty involved with introducing SEMA into the loop.

--
Regards,

Marcus Carr                      email:  mrc@allette.com.au
___________________________________________________________________
Allette Systems (Australia)      www:    http://www.allette.com.au
___________________________________________________________________
"Everything should be made as simple as possible, but not simpler."
       - Einstein

** To unsubscribe, send a message to majordomo@omsys.com **
** with "unsubscribe framers" (no quotes) in the body. **