
Re: Office 2003 Beta (long)



The dirty little secret which no one seems to acknowledge is that format 
(and even layout) are part of content. If that were not so, then all the 
hand-wringing and difficulties involved in importing/exporting documents
between different DTPs would not exist. There would be a simple solution 
which completely eliminates all the problems with filters. Namely, everyone 
would simply export their documents as ASCII text, which is importable into 
any DTP. If that were the solution, we wouldn't see the huge volume of 
posts on the various Framers lists about the problem of reliably 
round-tripping documents between FrameMaker and Word without loss of 
formatting and layout.

If you believe meaning, understandability, and readability are invaluable 
parts of content, then you must concede that the original document format 
and layout must somehow be preserved when exchanging or delivering 
documents. One way of doing this is to use the PDF format. Obviously, 
however, that solution has severe shortcomings as a means of information 
interchange.

Format carries meaning in many ways, the most obvious being the use of 
emphasis, bolding, color, different fonts and sizes, autonumbering, etc. to 
convey important information about content to the reader. Content without 
readability is next to worthless. Format and layout are provided in DTPs to 
enhance meaning, understandability and readability. If that were not true, 
then we'd all create our documents using an ASCII text editor like Notepad, 
and there'd be no wrestling with information exchange between diverse DTPs.

Having said all that, there is no doubt the separation of raw content from 
format and layout information is vital to achieving machine (not human) 
readability. More and more we realize that human readers are not the sole
(and perhaps not even the dominant) consumers of technical content.
Machine readability can greatly enhance the retrieval, management, 
interchange and preservation of information.
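
To make the separation concrete, here is a minimal sketch in Python. The
element names (procedure, warning, step) and the STYLES table are
hypothetical, invented only for illustration; they are not taken from any
real DTD or schema. The same raw content is queried by a machine and,
separately, formatted for a human reader.

```python
import xml.etree.ElementTree as ET

# Raw content only: the markup says what each piece IS, not how it looks.
# All element names here are hypothetical.
DOC = """<procedure>
  <title>Replacing the filter</title>
  <warning>Disconnect power first.</warning>
  <step>Remove the cover.</step>
  <step>Lift out the old filter.</step>
</procedure>"""

# Formatting kept apart from the content: one illustrative rule per element.
STYLES = {
    "title": lambda text: text.upper(),
    "warning": lambda text: "*** " + text + " ***",
    "step": lambda text: "  - " + text,
}

root = ET.fromstring(DOC)

# Machine use: retrieve by meaning, with no knowledge of fonts or layout.
warnings = [w.text for w in root.iter("warning")]
steps = [s.text for s in root.findall("step")]

# Human use: apply the separate formatting rules to the same content.
rendered = "\n".join(STYLES[el.tag](el.text) for el in root)
print(rendered)
```

Swapping in a different STYLES table changes the presentation without
touching the content, but nothing in this sketch says anything about page
layout.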

Hence the emergence of SGML, HTML, XML and Unicode, all of which are 
adopted international standards based on the premise that neither humans 
nor machines should have to rely on proprietary software to create, 
process, read, exchange, use, review, retrieve, manage, or display 
information. But each of those standards (other than Unicode) provides a way
(separated from raw content) to preserve (and optionally use) format,
either in the manner intended by the originator, or in some alternative
format. None of these standards, however, provides a viable way to preserve
the original page layout, including such things as running headers/footers,
multiple columns, sideheads, article threads, landscaped pages, etc.

On the web we experience all the time the effects of this failure to 
preserve the original page layout design. Many studies have shown the stark 
drop-off in comprehension and readability of non-PDF on-line documents 
compared to well-designed page-layout-oriented paper documents. And when we 
try to improve readability and comprehension by printing such non-PDF 
on-line documents, we typically find graphics which split across pages, 
graphics and text which get truncated, graphics of poor quality when 
printed, lack of running headers/footers, and many similar problems which
detract from readability and comprehension.

My understanding is that the OASIS OpenOffice initiative is an attempt to
solve these many problems by establishing a non-proprietary (but
extensible) standard XML DTD and schema for office documents of many types,
together with a separate standardized way to preserve the intended
formatting (presumably using XSL) and layout information. Any viewing or DTP
software which conforms to the OpenOffice standard can import or export
documents in the OpenOffice form. Of course, such documents can
alternatively be delivered as raw XML, in which case at least the formatting
information, if not the layout, can be successfully processed by any
XML-aware software which can process XSL.

Now, let's look at the Word 11 "solution" offered by Microsoft in the 
Office 2003 Beta. Yes (unlike FrameMaker) it can use schema and Unicode. 
Yes, it will deliver raw XML, but it cannot (even optionally as I 
understand it) deliver an accompanying XSL which preserves the formatting 
in the original Word document. Nor can it deliver the original layout 
information. The alternative delivery method is WordXML, which preserves, in
a proprietary format, a document that is openable only in Word 11. Although
I am not certain of this, I believe the only way in which Word can structure
XML is by mapping paragraph and character tags to the corresponding elements
defined in the schema. Anyone who has attempted to convert typical
unstructured Frame documents to structured ones by mapping paragraph and
character tags to a conformant EDD/DTD using structure rules tables can
attest to the futility of this method. The likelihood that the typical Word
user will properly and consistently tag such unstructured documents is so 
low as to render meaningless the entire methodology of Word 11.
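
A rough sketch of why tag mapping is so fragile, written in Python. This is
not how Word 11 or FrameMaker actually performs the conversion; the tag
names, the TAG_TO_ELEMENT table and the map_paragraphs helper are all
hypothetical and exist only to illustrate the failure mode.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from paragraph tag names to schema element names.
TAG_TO_ELEMENT = {
    "Heading1": "title",
    "Body": "para",
    "Step": "step",
}

def map_paragraphs(paragraphs):
    """Wrap each (tag, text) paragraph in the element its tag maps to.

    Returns the XML root plus the paragraphs whose tag has no mapping.
    """
    root = ET.Element("section")
    unmapped = []
    for tag, text in paragraphs:
        element_name = TAG_TO_ELEMENT.get(tag)
        if element_name is None:
            unmapped.append((tag, text))
            continue
        ET.SubElement(root, element_name).text = text
    return root, unmapped

# A typical unstructured document: mostly tagged correctly, but not always.
paragraphs = [
    ("Heading1", "Installing the pump"),
    ("Body", "Read these steps first."),
    ("Step", "Close the valve."),
    ("Body", "2. Open the drain."),           # a step typed as body text
    ("Normal+Bold", "Warning: hot surface"),  # ad hoc override, no mapping
]

root, unmapped = map_paragraphs(paragraphs)
xml_text = ET.tostring(root, encoding="unicode")
print(xml_text)
print("unmapped:", unmapped)
```

The second step comes out as an ordinary para and the warning is dropped
from the structure altogether. The mapping itself worked as designed; the
tagging it depends on was inconsistent.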

So what do we have here? Microsoft recognizes, for all of the reasons I've
cited above, that the delivery of raw XML solves the machine readability
problem, but in no way preserves the formatting and layout needed to
review or successfully comprehend information. Since most users require
formatting information, and, certainly in the beginning, will be 
disinclined to develop XSL formatting solutions for diversely formatted 
Word documents delivered as raw XML, they must instead rely upon document 
exchange using the proprietary WordXML format of Word 11.

The Word 11 "solution" is no solution at all. It is a chimera designed to 
convince the executives (typically uninformed) who impose Word on their 
corporate users that Microsoft has a real XML solution, when in fact that 
solution is nothing but an artifice designed to preserve the dominance of 
Microsoft stinking Word.

I want an approach that converts Microsoft stinking Word into Microsoft 
sinking Word, and the hope of doing that now rests on the emerging 
OpenOffice standard.
==============================================
After saying all that, if Adobe intends to preserve FrameMaker as a viable 
alternative to Word, it must:

1. Provide a full implementation of Unicode.

2. Add a Schema capability.

3. Embrace and participate in the development of the OpenOffice standard, 
and make it possible to successfully import and export OpenOffice documents.

4. Develop a way to export EDD format rules as fully compliant XSL.

FrameMaker/FrameMaker+SGML Document Design & Database Publishing
DW Emory <danemory@globalcrossing.net>

