To: Marcus Carr <mrc@xxxxxxxxxxxxxx>
Subject: Re: Conversion of Word documents to structured frame documents
From: Dan Emory <danemory@xxxxxxxxxxxx>
Date: Thu, 8 Apr 1999 14:18:17 -0700 (MST)
Cc: Hedley_S_Finger@xxxxxxxxxxxxxxxxx, framers@xxxxxxxxx
Sender: owner-framers@xxxxxxxxx
At 07:11 PM 4/8/99 +1000, Marcus Carr wrote:
>
>-------------------Snip
>No, it doesn't - I've looked into it further. For structured documents, the XML
>export maps the SGML elements to XML elements using the same name by default. You
>can change the name using read/write rules as you do in an SGML application.
>----------------Snip
+++++++++++++++++++++++++++
Thanks, Marcus for finally clearing this up. I'm still suspicious that the
devil is in the details, however. I guess I'll just have to break down and
get a copy of FM+SGML 5.5.6 to find out for myself.
++++++++++++++++++++++++++++++++++++
>> SEMA's RTF-DOC DTD and rtf2rdc filter can do that.
>> SEMA also has an rdc2rtf filter that converts RTF-DOC-conforming structured
>> docs back to RTF. I then waxed lyrically that such a round-trip capability
>> could solve a problem that's constantly coming up in postings on the two
>> Framers lists, namely the unreliability of document conversions between
>> FrameMaker and Word.
>
>The unreliability of document conversions between FrameMaker and Word is not the
>same as round tripping. Yes, people do complain about not being able to go in one
>direction or the other, but I have rarely seen postings from people who seriously
>want to round trip.
>--------------------------------Snip
>As you pointed out, "The subject of this thread, originated by
>wendy_ling@uk.ibm.com, was whether there was a way to convert Word docs to
>structured FM+SGML docs". The fact that the documents conform to a DTD when they
>come into FrameMaker+SGML doesn't mean that they're usefully structured. All
>recursion information (except possibly very high level things like sections)
>disappears when you convert back to RTF. In keeping with the lowest common
>denominator theory, that means not adding any structure in FrameMaker+SGML, as it
>will be blasted anyway. I don't consider these to be structured documents.
>--------------------------Snip
>I have written many filters to go from SGML to RTF, so my approach would be
>different. I would save the SGML out of FrameMaker+SGML and write a conversion that
>dealt with converting SGML conforming to a specific DTD to RTF. This typically only
>takes a couple of days, unless your DTD is huge. Now I have the SGML to RTF side
>covered.
++++++++++++++++++++++++++++++++++++++++++++++++
But if you do it the way you describe above, there's no way to include the
RTF formatting information (font definitions, style sheets, ad-hoc format
overrides, etc.)
++++++++++++++++++++++++++++++++++++++++++++++++++
>Can I get the RTF to SGML? No, because my SGML structure is more
>complicated than RTF is capable of representing. Your approach seems to involve
>dumbing your structure down to something that matches RTF - if the customer is
>satisfied with such a structure, then you do indeed have a solution. I don't however
>see this very narrow band of users as being the salvation of the product.
++++++++++++++++++++++++++++++++++++++++++++++
After further analysis, it appears that SEMA's round-trip filters have the
principal purpose of archival storage of unstructured Word (and other
RTF-compatible WP products) documents in a neutral format (XML or SGML) that
preserves the formatting information so that they can be recovered years
later when the original WP is no longer available.

However, the SEMA rtf2rdc filter's preservation (in attributes) of the
original font definitions, stylesheet, and document header, combined with
the preservation (again in attributes) of any ad hoc format variations in
XML/SGML paragraph and character style element instances, is a nice touch.
The fact that each paragraph (PARA) and character style (CS) element in the
RTF-DOC DTD has attributes that identify the applicable stylesheet instance
being used offers the opportunity to use SGML- or XML-aware tools to convert
RTF-DOC document instances to more elaborate structures if:

1. The stylesheet names are indicative of the structure, AND

2. Consistent tagging was utilized during the preparation of the original
   WP document.
++++++++++++++++++++++++++++++++++++++++++++++++++++
>> 8. What I was proposing, in my original post to this thread, was that the
>> RTF-DOC DTD, combined with the round-trip filters from SEMA, might offer a
>> solution for publications groups confronted with the (round-trip) dilemma...
----------------------Snip
>It would accomplish that, but I really don't believe that there's a large market for
>it. If it really was a winner, the good people at Adobe would have spent some of
>their fortunes on beefing up the RTF filters. After all, that cuts out the
>uncertainty involved with introducing SEMA into the loop.
+++++++++++++++++++++++++++++++++++++++++++++++++++++
I think Adobe's strategy is being driven mainly by what the existing major
license holders want. Most of those companies' businesses are in the
military, aerospace, semiconductor, pharmaceutical, and telecommunications
fields, where the Word vs. Frame debate is already resolved in favor of
Frame. They aren't much concerned about round-tripping.

But Frame, although it is ideally suited for producing proposal documents,
has little penetration of that market, and one of the main reasons is the
need for Word-to-Frame round-tripping. Most proposal input comes from people
who use Word. If the proposal group uses Frame or FM+SGML, their documents
must be converted back to Word for editing/updating by the proposal
contributors. And many US government agencies still require that proposals
be submitted in Word or WordPerfect, even when submittals in PDF or HTML are
also allowed.
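As an aside on the earlier point about PARA and CS style attributes, here is
a minimal sketch of how style names could drive a conversion to a richer
structure. It is only an illustration under loud assumptions: the attribute
name "style", the root element name RTFDOC, the sample instance, and the
style-to-element mapping are all hypothetical and are not taken from SEMA's
actual RTF-DOC DTD. It also assumes both conditions above hold (indicative
style names, consistent tagging).

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from Word style names to structural element names.
STYLE_TO_ELEMENT = {
    "Heading 1": "title",
    "Body Text": "para",
    "List Bullet": "item",
}

# Hypothetical flat instance: every paragraph is a PARA carrying its style name.
SAMPLE = """<RTFDOC>
<PARA style="Heading 1">Scope</PARA>
<PARA style="Body Text">This proposal covers <CS style="Emphasis">all</CS> tasks.</PARA>
<PARA style="List Bullet">Task one</PARA>
<PARA style="Heading 1">Schedule</PARA>
<PARA style="Body Text">Twelve months.</PARA>
</RTFDOC>"""


def restructure(flat_root, style_attr="style"):
    """Build a new tree whose element names are derived from PARA style names.

    Each "Heading 1" paragraph opens a new <section>; the paragraphs that
    follow are placed inside it. Character-style (CS) markup is flattened
    to plain text to keep the sketch short.
    """
    out = ET.Element("doc")
    section = None
    for para in flat_root.iter("PARA"):
        style = para.get(style_attr)
        name = STYLE_TO_ELEMENT.get(style)
        if name == "title":
            # A heading starts a new section.
            section = ET.SubElement(out, "section")
            el = ET.SubElement(section, "title")
        else:
            parent = section if section is not None else out
            if name is None:
                # Unmapped style: keep the text, remember the original style.
                el = ET.SubElement(parent, "para", {"origstyle": style or ""})
            else:
                el = ET.SubElement(parent, name)
        el.text = "".join(para.itertext())
    return out


if __name__ == "__main__":
    doc = restructure(ET.fromstring(SAMPLE))
    print(ET.tostring(doc, encoding="unicode"))
```

If the original authors tagged inconsistently (for example, faking headings
with bold Body Text), a mapping like this has nothing reliable to key on,
which is why both conditions matter.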
The increasing US government requirement for page-limited proposals imposes
even greater demands on round-tripping, since conversions in one direction
or the other might change the page count.

I know of several instances where major Frame license holders have
considered using Frame or FM+SGML in their proposal groups, but abandoned
the idea because of the unreliability of the round-trip conversion process.
In a pressure-cooker proposal environment, round-trip conversions must be
almost completely free of errors and of any need for post-conversion
clean-up.

Even when round-tripping is not a regular occurrence, it can still be a
vital requirement. For example:

1. Source data is often created in Word, and must be converted to Frame
   without introducing errors or the need for extensive post-conversion
   clean-up.

2. Legacy documents in Word need to be converted to Frame, particularly
   when an organization first acquires Frame.

3. An enterprise's tools for converting to on-line context-sensitive help
   (e.g., HTML Help, WinHelp, RoboHelp, etc.) may require that the input
   be in Word or RTF, necessitating error-free conversions of Frame docs
   to RTF or Word.

4. Documents created in Frame may have to be repurposed using Word (e.g.,
   training materials produced by other departments that don't use Frame).

In summary, Frame is still (and probably always will be) a niche product,
which means that it is not widely distributed electronically in its native
format. Presently, the only reliable conversion available within the
FrameMaker product is to PDF, and even that conversion is often problematic.
Conversion of FM+SGML structured docs to SGML or XML often requires
extensive development. If the market for Frame products is to broaden, its
capability for round-trip conversions to/from other formats must be expanded
and improved.
The most likely source of such conversion tools is third-party software
vendors like SEMA, Omni Systems, Blueberry, Quadralay, and (in the case of
SGML/XML) OmniMark.

It now appears that Adobe plans to issue a major new release of Frame about
once every two years (provided they can find a way to avoid bug-ridden point
releases such as 5.5). Third-party software vendors of conversion tools are
on a much shorter release schedule, because the nature of their business
demands it.

Adobe would be better off if it subsidized, or in other ways supported,
those third-party vendors rather than trying to develop adequate conversion
tools within the Frame product itself. Promotional deals could be struck
that offered these third-party products at a deep discount to Frame license
holders.

====================
| Nullius in Verba |
====================
Dan Emory, Dan Emory & Associates
FrameMaker/FrameMaker+SGML Document Design & Database Publishing
Voice/Fax: 949-722-8971 E-Mail: danemory@primenet.com
10044 Adams Ave. #208, Huntington Beach, CA 92646