To: Ted Goranson <tedg@xxxxxxxx>, Free Framers <framers@xxxxxxxxx>
Subject: Re: FM+SGML Information Design
From: Dan Emory <danemory@xxxxxxxxxxxx>
Date: Sun, 6 Jun 1999 18:57:09 -0700 (MST)
Sender: owner-framers@xxxxxxxxx
At 11:27 AM 6/3/99 -0500, Ted Goranson wrote:
>Dan-
>
>I am impressed, both by your kindness in sending this to my attention, and
>in the clarity of the document itself. Some of the ideas in the paper are
>new to me, but I see some possibilities and want to follow through. If you
>wish, I will be happy to rewrite the message with a preface for posting to
>the Frame list. Or alternately, I'll write a summary.
=====================================================================
Unfortunately, my posting privileges on frameusers.com were suspended
about 9 months ago for having offended the list "owner." He appears
disinclined to lift the suspension. I will, however, post it to the Free
Framers list (you might want to consider subscribing to that one
too--information on how to subscribe is in my signature block). You have
my permission to post this response to the "other" list.
=====================================================================
>--I already have 500 pages or so in a well-formed Frame document. It is
>well-formed in the sense of everything being tagged and the tags making
>sense. When upgrading to SGML, I suppose it is not difficult to go through
>and reassign everything to new conventions, just time consuming.
=====================================================================
The method FM+SGML uses to convert unstructured docs to structured ones
is quite similar to the one FrameMaker uses to convert to HTML. The
Structure Rules Tables method is quite robust, but it requires consistent
tagging of the unstructured doc. However:

1. Any ad-hoc character formatting (e.g., making a word Bold without
   using a character format tag for that purpose) will be lost.

2. Any ad-hoc overrides to paragraph tag formats will be lost.

3. Any significant amount of mistagging will almost always produce an
   unwelcome outcome.

4. The method has no capability, at the lowest level of structure, to
   properly wrap unstructured tagged objects in elements based on the
   objects' context. However, when two or more different unstructured
   object tags correlate to the same element, and that element can occur
   in different contexts, it is often possible to use the unstructured
   tagname to "qualify" the resulting element so as to indicate its
   context for higher-level wrapping. In some cases, it may even be
   possible to assign attribute values to elements at the lowest level
   of structure.

Despite the limitations described above, I've had quite a bit of success
in using structure rules tables to accomplish conversion to structured
docs. On one large project in which I'm presently involved, we're
achieving something close to 90% structural validity on the first pass.
=====================================================================
>--The idea of information modeling the document is compelling. You seem to
>be dealing with "ordinary tech manuals" where procedures are described.
=====================================================================
Not really. The methods described in the paper should be applicable to
almost any document type. In section 5, "Extensibility of the Modular
Structure", the paper describes how encapsulation wrappers can be used to
encapsulate different information types.
=====================================================================
>But
>my content is related more deeply--in other words, I don't have a time
>sequence to fall back on.
=====================================================================
Printed books are linear. Hypertexts and databases are not. SGML was
primarily intended for the former, which probably explains why its
hypertext linking capabilities fall well short of what is needed to
implement non-linear hypertexts, or to store content in a database with
sufficient metadata to permit reliable retrieval. XML (hopefully) will
remove these limitations.

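Returning briefly to the conversion question: the idea of mapping
unstructured tags to elements, and using the tag name to "qualify" an
element for higher-level wrapping, can be sketched roughly as below. This
is only an illustration in Python, not FM+SGML's structure rules table
syntax. The tag names, element names, the "step" qualifier, and the
Procedure wrapper are all hypothetical.

```python
# Hypothetical rules: unstructured paragraph tag -> (element, qualifier).
# Two different tags ("Step" and "Body") map to the same element ("Para");
# the qualifier records the context implied by the original tag.
TAG_RULES = {
    "Heading1": ("Title", None),
    "Step": ("Para", "step"),
    "Body": ("Para", None),
}


def convert_paragraphs(paragraphs):
    """First pass: turn (tag, text) pairs into (element, qualifier, text).

    Paragraphs whose tag has no rule are returned separately, since
    mistagged content has to be fixed by hand.
    """
    converted, unmapped = [], []
    for tag, text in paragraphs:
        rule = TAG_RULES.get(tag)
        if rule is None:
            unmapped.append((tag, text))
            continue
        element, qualifier = rule
        converted.append((element, qualifier, text))
    return converted, unmapped


def wrap_by_qualifier(converted):
    """Second pass: wrap runs of "step"-qualified elements in a Procedure."""
    wrapped = []
    for element, qualifier, text in converted:
        if qualifier == "step":
            if wrapped and wrapped[-1][0] == "Procedure":
                # Extend the Procedure opened by the previous step.
                wrapped[-1][1].append((element, text))
            else:
                wrapped.append(("Procedure", [(element, text)]))
        else:
            wrapped.append((element, text))
    return wrapped
```

The point of the second pass is that the qualifier, not the element name,
decides the wrapping: both "Step" and "Body" become Para, but only the
qualified ones are gathered into the higher-level wrapper.
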
To facilitate information access, a Uniform Resource Identifier (URI) and
a Resource Description Framework (RDF) description of the content can be
assigned to each chunk. XML links (or database queries) will retrieve the
chunk by specifying the chunk's unique URI, and even an anchor point
(i.e., node) within that chunk.
=====================================================================
>I'm prepared to go the extra distance and
>actually model the relationships using some entity-relationship tool. Do
>people have experience using your idea and modeling? This would make the
>notions of the wrapper and the contents closer.
=====================================================================
In XML, each information chunk could be contained in an encapsulation
wrapper, and the encapsulation wrapper would be assigned a unique URI and
an RDF description. The RDF description summarizes the wrapper's
information content, and includes a pointer to the URI. The wrapper (with
its contents), plus its RDF description, would be separately stored in a
database. This would provide (at least) two ways to retrieve the chunk:

1. A database search directed at the RDF descriptions would deliver the
   information chunks whose descriptions meet the search criteria.

2. Any hypertext link can retrieve a specific information chunk by
   specifying its unique URI. The URI serves as a pointer to the chunk's
   storage location (e.g., its location in the database).
=====================================================================
>--The result would be a collection of information that has three
>structures: XML (for whatever web capability I desire); ODMA (as the
>ultimate open standard in document accessibility); and as structured data.
>I would probably move this data into a Filemaker database (with which I
>have no current experience). So the question is have you seen small users
>link Frame and FileMakers this way, and do you think that maintaining three
>structures (XML/ODMA/E-R) is possible?
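
As an aside on the two retrieval paths listed in the preceding reply, the
arrangement might look roughly like the sketch below. It is a toy
in-memory model in Python under stated assumptions: a plain dictionary
stands in for a real RDF description, the `Chunk` and `ChunkStore` names
are invented for illustration, and any URIs passed in are placeholders.

```python
from dataclasses import dataclass, field


@dataclass
class Chunk:
    """An encapsulation wrapper stored with its metadata (hypothetical)."""
    uri: str       # unique identifier assigned to the wrapper
    content: str   # the wrapper and its contents, as XML text
    # Stand-in for the RDF description: simple property/value pairs.
    description: dict = field(default_factory=dict)


class ChunkStore:
    """Toy repository offering both retrieval paths."""

    def __init__(self):
        self._by_uri = {}

    def add(self, chunk):
        # URIs must be unique, so refuse a second chunk with the same one.
        if chunk.uri in self._by_uri:
            raise ValueError(f"duplicate URI: {chunk.uri}")
        self._by_uri[chunk.uri] = chunk

    def search(self, **criteria):
        """Path 1: return chunks whose description meets every criterion."""
        return [
            chunk
            for chunk in self._by_uri.values()
            if all(chunk.description.get(k) == v for k, v in criteria.items())
        ]

    def get(self, uri):
        """Path 2: a link names the URI and gets exactly that chunk."""
        return self._by_uri[uri]
```

A link would call `get()` with a known URI, while a reader who does not
know the URI would call `search()` with properties such as a subject or
information type. A real repository would also need to resolve an anchor
point within the chunk, which this sketch leaves out.
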
=====================================================================
I would say that the ultimate purpose of XML is to play a role in a
system that can deliver information in any form needed by a human or a
machine, and which provides the capability to access not only documents
but also meaningful information packets, which may or may not be part of
conventional "documents". In other words, XML is one component of a
system that could provide the ultimate in information access and
interchange, in which document access is only one of those capabilities
(and possibly an insignificant one). Information access would embody not
only access by humans, but also by machines (e.g., computers, music
players, process controllers).

XML has (or will soon have) many features (e.g., Unicode, RDF, XLink,
XSL) that should make it superior to any other method of achieving
seamless information retrieval and interchange. XML is intended to be the
best method of storing information packets and their metadata in a
database repository. An ideal system would be capable of delivering
information, not only in XML, but in almost any other form specified by
the human or machine requesting it.

I doubt very much whether FileMaker is robust enough to serve as the
information repository in such a system, even if the system requirements
were significantly relaxed from those described above. High-powered
database repositories with the needed capabilities currently range in
price from the middle 5 figures up to 7 figures. We might expect that the
price for an XML-aware database repository with the minimum required
capabilities will drop into the middle 4-figure range as XML begins to
catch on, and competition increases.
=====================================================================
>--It appears that everything depends on intelligent initial specification
>of the EDD/DTD. One cannot be refining it as one goes, right?
=====================================================================
In some ways XML is more adaptable to evolving structure than SGML. For
instance, well-formed XML does not have to be conformant to a DTD in
order for it to be readable by humans and machines. Nevertheless,
intelligently designed structure remains the crux of the matter, and a
modular design makes it much easier to deal with evolving structure.

Although neither XML nor the XML capabilities of the latest release of
FM+SGML are quite ready for prime time, developing an EDD/DTD for SGML
should assure that your SGML documents can, when the time comes, be
easily convertible to XML. It remains to be seen, however, whether
FM+SGML will evolve into the tool of choice for importing and exporting
XML.
=====================================================================
     ====================
     | Nullius in Verba |
     ====================
Dan Emory, Dan Emory & Associates
FrameMaker/FrameMaker+SGML Document Design & Database Publishing
Voice/Fax: 949-722-8971 E-Mail: danemory@primenet.com
10044 Adams Ave. #208, Huntington Beach, CA 92646
---Subscribe to the "Free Framers" list by sending a message to
   majordomo@omsys.com with "subscribe framers" (no quotes) in the body.

** To unsubscribe, send a message to majordomo@omsys.com **
** with "unsubscribe framers" (no quotes) in the body.   **