Re: XML when?



Lee Richardson wrote:

>There's a general belief around Adobe that well-formed but not valid XML
>is going to be common, where 20% <= 'common' <= 80% of all content in XML
>that Adobe cares about.

My pessimistic guess is that it will be close to the high end of
that range. OTOH, even without a DTD most content bounced in and
out of Frame will likely be tagged consistently. For the majority
who aren't using FM+SGML's structured editing functions, well-formed
but not valid will be more than good enough -- it becomes a matter
of transforming elements to FM tags or vice-versa.
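
A rough sketch of that element-to-tag transform, in Python for illustration.
The mapping table and the tag names (Heading1, Body, Note) are made up for
the example, not taken from any real template, and unmapped elements simply
fall back to a default tag.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from XML element names to FM paragraph tags.
ELEMENT_TO_PGF_TAG = {
    "title": "Heading1",
    "para": "Body",
    "note": "Note",
}


def elements_to_tagged_paragraphs(xml_text, mapping, default_tag="Body"):
    """Return (paragraph_tag, text) pairs in document order.

    Only needs well-formed input; no DTD is consulted. Elements with no
    text of their own (pure containers) are skipped.
    """
    root = ET.fromstring(xml_text)
    result = []
    for elem in root.iter():
        text = (elem.text or "").strip()
        if not text:
            continue
        result.append((mapping.get(elem.tag, default_tag), text))
    return result


sample = "<doc><title>XML when?</title><para>Soon.</para><tip>Maybe.</tip></doc>"
for tag, text in elements_to_tagged_paragraphs(sample, ELEMENT_TO_PGF_TAG):
    print(tag, "|", text)
```

Going the other way (FM tags back to elements) would just be the inverted
table, as long as the mapping is one-to-one.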

Those who want structured editing, round-tripping, and the whole ball
of wax, have it a little harder. But after some early success with XSLT,
I've started studying the XSL(FO) spec, and I think it should be possible
(note I didn't say "easy" :-) to derive an EDD from an XSL stylesheet
and vice versa. Ideally, a Frame+XML product could build general and
format rules by analyzing <xsl:template match="foo/bar/baz"> elements.
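
To make the idea concrete, here is a minimal Python sketch that reads the
match patterns out of a stylesheet and collects the parent/child pairs they
imply. It is nowhere near a real EDD generator: it only understands simple
name/name/name patterns, skips anything with predicates, wildcards, or "//",
and the function name is my own invention.

```python
import re
import xml.etree.ElementTree as ET

XSL_NS = "http://www.w3.org/1999/XSL/Transform"
SIMPLE_NAME = re.compile(r"^[A-Za-z_][\w.-]*$")


def implied_structure(stylesheet_text):
    """Map each element name to the set of child names implied by match patterns."""
    root = ET.fromstring(stylesheet_text)
    children = {}
    for template in root.iter("{%s}template" % XSL_NS):
        pattern = template.get("match")
        if not pattern:
            continue  # named templates carry no match pattern
        for alternative in pattern.split("|"):
            if "//" in alternative:
                continue  # descendant steps do not imply a direct parent
            steps = [s.strip() for s in alternative.split("/") if s.strip()]
            if not steps or not all(SIMPLE_NAME.match(s) for s in steps):
                continue  # predicates, wildcards, axes: out of scope here
            for parent, child in zip(steps, steps[1:]):
                children.setdefault(parent, set()).add(child)
            children.setdefault(steps[-1], set())
    return children


stylesheet = """<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="foo/bar/baz"/>
  <xsl:template match="foo/title | bar/para"/>
  <xsl:template match="para[@role='x']"/>
</xsl:stylesheet>"""

for parent, kids in sorted(implied_structure(stylesheet).items()):
    print(parent, "->", sorted(kids))
```

Note what is missing: nothing here says how many baz elements a bar may
hold, or in what order. That is exactly the information a stylesheet does
not enforce.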

I suspect, in the end, that most XML will be expected to conform to
stylesheets rather than DTDs. If I'm right, that's going to affect
how *any* XML-aware authoring tool will deal with import & export.
The stylesheet will at least imply a structure, without enforcing
order or number of elements the way a DTD does -- as you say below, a
looser world. But it's still a world with structure & rules.


>There's also an issue with XML fragments not necessarily being valid,
>depending on how they're created and placed. These freestanding chunks
>of XML can be treated as well-formed but not necessarily valid depending
>on how they're used.

Again, I'd expect that those "well-formed chunks" would be tagged
fairly consistently, at least for a single customer. The current
read/write rules mechanism, or an extension to the current conversion
tables, may be a good way to deal with fragments.
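
As a sketch of how a conversion table could cope with fragments: wrap the
chunk in a throwaway root so it parses as well-formed XML, then report any
element names the table does not cover. This is Python standing in for
whatever the real mechanism would be; the table contents and function name
are hypothetical, and a fragment carrying its own XML declaration would need
extra handling.

```python
import xml.etree.ElementTree as ET


def unmapped_tags(fragment_text, conversion_table):
    """Return sorted element names in the fragment that the table lacks.

    Raises ET.ParseError if the chunk is not well-formed.
    """
    # A fragment may have several top-level elements, so give it a
    # synthetic root before parsing.
    root = ET.fromstring("<wrapper>%s</wrapper>" % fragment_text)
    seen = set()
    for top_level in root:
        for elem in top_level.iter():
            seen.add(elem.tag)
    return sorted(seen - set(conversion_table))


table = {"para": "Body", "emph": "Emphasis"}  # hypothetical entries
chunk = "<para>One <emph>two</emph></para><tip>Three</tip>"
print(unmapped_tags(chunk, table))
```

If a customer's fragments really are tagged consistently, the list of
unmapped names should shrink to nothing after the first few imports.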


>There's a third issue that XML content may have originally been
>created with a particular DTD, but the DTD has been lost or forgotten,
>or the XML has been transformed from something to something else that
>no longer corresponds to the original DTD.

That sounds like an issue of single-shot imports vs. working with
a known quantity (e.g. a particular version of DocBook). I suppose
a current FM+SGML user, faced with importing a one-time oddball
document, would have the same issues. In the XML world, *if* I'm
right about stylesheets, DTDs won't matter so much. And what has
been transformed can be transformed again.

<aside type="interesting">In fact, I think that it may be possible
to transform an XML file to MIF -- which would allow for a form of
round-tripping for those who need it *NOW* and have the patience to
write the transform rules. There's a reasonably good standalone,
open-source XSLT processor (for OSes with command lines) called
"Sablotron" at http://www.gingerall.com -- I'm going to try using
it to build a MIF transform next time there's a break in the action
here at work.</aside>
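
For a feel of what such a transform has to emit, here is a small sketch. It
is in Python rather than XSLT, purely for illustration, so it is not the
Sablotron approach itself. The tag map is hypothetical, the MIF version
number is an assumption, and the output is only the bare Para/PgfTag/String
skeleton, with no catalogs, flows, or character formats.

```python
import xml.etree.ElementTree as ET


def mif_string(text):
    """Quote text as a MIF string, escaping the characters MIF reserves."""
    # Backslash goes first so the escapes added below are not doubled.
    for old, new in (("\\", "\\\\"), (">", "\\>"), ("'", "\\q"),
                     ("`", "\\Q"), ("\t", "\\t")):
        text = text.replace(old, new)
    return "`%s'" % text


def xml_to_mif(xml_text, tag_map, default_tag="Body", mif_version="5.50"):
    """Emit one MIF paragraph per element that has text of its own."""
    root = ET.fromstring(xml_text)
    lines = ["<MIFFile %s>" % mif_version]  # version is an assumption
    for elem in root.iter():
        text = " ".join((elem.text or "").split())  # collapse whitespace
        if not text:
            continue
        lines += [
            "<Para",
            " <PgfTag %s>" % mif_string(tag_map.get(elem.tag, default_tag)),
            " <ParaLine",
            "  <String %s>" % mif_string(text),
            " >",
            ">",
        ]
    return "\n".join(lines) + "\n"


print(xml_to_mif("<doc><title>A &gt; B</title><para>It's text.</para></doc>",
                 {"title": "Heading1"}))
```

Inline markup and text that follows a child element are ignored here; a
real set of transform rules would have to deal with both.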


>To rephrase my comment- XML is shaping up to be a looser world than SGML.

No doubt. But I believe it will have its own set of rules, just
somewhat different from SGML. People who expect to import XML
will have to do some work up front, just like SGML users have to
do. It's too much to expect Frame -- or any other authoring tool --
to just take any random chunk of XML & turn it into a properly-
formatted document. BUT, an XML file conforming to a stylesheet
*could* be formatted automatically.

	Larry



** To unsubscribe, send a message to majordomo@omsys.com **
** with "unsubscribe framers" (no quotes) in the body.   **