
Re: FrameMaker 5.5.6 and XML experience?

As I mentioned in my last post, we're getting a long way off topic here - if you're
not very keen, feel free to ignore this post.

Dan Emory wrote:

> At 06:02 PM 10/27/98 +1100, Marcus Carr wrote:
> >Search engine optimisation is a long way from my sphere, but I've never heard
> >that attribute searching was a selling point.
> *****************************************************************
> It's not a matter of search engine optimization. It's that attributes (or
> what I call metadata enrichment) can greatly increase the search possibilities.

Have you looked at the Dublin Core Metadata Initiative
(http://purl.oclc.org/metadata/dublin_core/)? It does what you describe, and is
equally applicable to HTML data.

> Suppose, for example the container element for a "chunk" has a "strings"
> type attribute that permits the author to list keywords (or phrases) which
> (s)he considers relevant to that particular chunk. It's not necessary for
> any of those keywords to physically appear in the text of the chunk in order
> to get a hit on that chunk, because the author (or a librarian) has
> intelligently analyzed it, and determined what keywords/phrases are
> applicable. That's far more powerful than blindly searching for every
> occurrence of a particular word or phrase, which usually produces far too
> many hits to be useful.

No question of it, but again this ability isn't restricted to XML - the Dublin Core
covers exactly that scenario with the element subject. To quote, subject is "The
topic of the resource. Typically, subject will be expressed as keywords or phrases
that describe the subject or content of the resource. The use of controlled
vocabularies and formal classification schemes is encouraged." You may also wish to
check out "An Introduction to the Resource Description Framework" at
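
To make the keyword scenario concrete, here is a minimal sketch in Python. The element name "chunk", the attribute name "keywords", the semicolon separator and the sample text are all assumptions made up for illustration; they come from neither the Dublin Core nor FM+SGML. It only shows that a chunk can be found by an author-assigned keyword that never appears in its text.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: element and attribute names are illustrative only.
DOC = """
<manual>
  <chunk id="c1" keywords="hydraulics; landing gear">
    <para>Check the actuator pressure before each flight.</para>
  </chunk>
  <chunk id="c2" keywords="avionics">
    <para>Reset the display unit after a power cycle.</para>
  </chunk>
</manual>
"""

def chunks_by_keyword(root, term):
    """Return ids of chunks whose keywords attribute lists the term."""
    term = term.strip().lower()
    hits = []
    for chunk in root.iter("chunk"):
        keywords = [k.strip().lower() for k in chunk.get("keywords", "").split(";")]
        if term in keywords:
            hits.append(chunk.get("id"))
    return hits

def chunks_by_text(root, term):
    """Return ids of chunks whose visible text contains the term."""
    term = term.lower()
    return [c.get("id") for c in root.iter("chunk")
            if term in "".join(c.itertext()).lower()]

root = ET.fromstring(DOC)
keyword_hits = chunks_by_keyword(root, "landing gear")  # metadata search
text_hits = chunks_by_text(root, "landing gear")        # blind text search
print(keyword_hits, text_hits)
```

The metadata search finds chunk c1 although "landing gear" never occurs in its text, while the plain text search finds nothing. The same lookup would work equally well against keywords held in HTML meta elements.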

> Suppose other attributes provide correlations of the content of a chunk with
> external documents (e.g., specifications, regulations, policies &
> procedures, requests for quotations). Once again, these external documents
> are (usually) not identified in the text of the chunk, but the chunk bears
> in some manner the "imprint" of those external documents. At the time the
> chunk was written, the author knows of those correlations, because (s)he
> referred to those external documents while writing the chunk. Attributes
> provide a way to preserve those correlations.

Attributes will only preserve the correlations if the data is used for the same
purpose and in the same sized chunk as when it was created. Otherwise, attributes
could be extremely misleading. This has been a perennial question in organisations
such as defense: if a security attribute isn't carried on the lowest referenceable
chunk, an object can be viewed out of context. For example, a list has a security
attribute of restricted, but its items don't. If you take an item in isolation, you
lose the information about its status. With XML's ideals
about fragmenting data and using those fragments with or without a DTD, the problem
is compounded, inspiring a search for more robust solutions than attributes.
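
The list/item case can be sketched in a few lines of Python. The element names, the "security" attribute and the "unclassified" default are assumptions chosen for illustration, not taken from any real DTD. The sketch resolves an item's status by walking up to the nearest ancestor that carries the attribute, then shows what happens once the item is copied out as a standalone fragment.

```python
import copy
import xml.etree.ElementTree as ET

# Hypothetical document: only the list carries the security attribute.
DOC = """
<section>
  <list security="restricted">
    <item id="i1">First item</item>
    <item id="i2">Second item</item>
  </list>
</section>
"""

def effective_security(root, target, default="unclassified"):
    """Return the nearest security value on target or one of its ancestors."""
    # ElementTree has no parent pointers, so build a child -> parent map.
    parents = {child: parent for parent in root.iter() for child in parent}
    node = target
    while node is not None:
        level = node.get("security")
        if level is not None:
            return level
        node = parents.get(node)
    return default  # assumed fallback when nothing is declared

root = ET.fromstring(DOC)
item = root.find(".//item[@id='i1']")
in_context = effective_security(root, item)

# Copy the item out as a standalone fragment: its ancestors are gone.
fragment = copy.deepcopy(item)
out_of_context = effective_security(fragment, fragment)
print(in_context, out_of_context)
```

In context the item resolves to restricted; as a fragment it falls back to the default, because the only element that declared the status was left behind.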

> Keeping all formatting out of the document is an unavoidable byproduct of
> the SGML storage paradigm. It doesn't facilitate anything except the
> increased possibility that a document will be inadvertently (or
> intentionally) formatted in a way that impairs its meaning. Formatting
> attributes provide one way for the document originator to declare: "The
> meaning of this paragraph (or string) will be best conveyed if it is
> formatted thusly."

Formatted thusly on what media? On paper? Would you apply another set of attributes
for ideal screen presentation, with subsets for different screen resolutions? Aside
from tables and maybe an emphasis element, formatting attributes are a one-way trip
to nowhere.

> The W3 working group declares that XSL is not intended to replace DSSSL and
> other printed document formatting methods (such as FM+SGML). The use of
> attributes that specify formatting embellishments ought to be included in
> the XSL methodology. XML documents will be printed as well as viewed, and it
> should be possible to produce high-quality print output from an XML document
> that fully replicates the way it was intended (by the originator) to look in
> printed form.

XSL doesn't compete with FM+SGML - in fact I hope that in the near future FM will
save out and possibly even read in an XSL stylesheet. Another XSL stylesheet can be
produced to facilitate paper publishing, so they could be produced by any conforming
XML application and you could use the data exactly as the author intended. In fact,
the author may provide two stylesheets - one for paper and one (or more) for screen.
That accomplishes what you want without polluting the data with attributes that may
or may not be appropriate for your rendition of the data.
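
As a rough illustration of keeping presentation outside the data, here is a Python sketch. It is not XSL: the two "stylesheets" are plain dictionaries, and the element names and output formats are invented for the example. The point is only that one unformatted document can be rendered two ways without a single formatting attribute in the data.

```python
import xml.etree.ElementTree as ET

# Hypothetical data with no formatting attributes at all.
DOC = "<doc><title>Pump maintenance</title><warning>Isolate power first.</warning></doc>"

# Two stand-in "stylesheets": element name -> rendering rule.
PAPER = {
    "title": lambda t: t.upper(),
    "warning": lambda t: f"*** WARNING: {t} ***",
}
SCREEN = {
    "title": lambda t: f"<h1>{t}</h1>",
    "warning": lambda t: f'<p class="warning">{t}</p>',
}

def render(root, stylesheet):
    """Render each child element with the rule the stylesheet gives its tag."""
    lines = []
    for el in root:
        rule = stylesheet.get(el.tag, lambda t: t)  # unknown tags pass through
        lines.append(rule(el.text or ""))
    return "\n".join(lines)

root = ET.fromstring(DOC)
paper_output = render(root, PAPER)
screen_output = render(root, SCREEN)
print(paper_output)
print(screen_output)
```

Adding a third rendition, say for a low-resolution screen, means adding a third mapping and leaving the document untouched.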

> Using attribute values to determine formatting works beautifully in FM+SGML,
> and there's no reason not to eventually have such a capability in XSL.

The capability exists now for applying characteristics to data via an XSL stylesheet
- it's the wisdom of doing so that I question.


Marcus Carr                      email:  mrc@allette.com.au
Allette Systems (Australia)      www:    http://www.allette.com.au
"Everything should be made as simple as possible, but not simpler."
       - Einstein
