
Re: [FrameSGML] Structured Document Design for XML or SGML




Dan Emory wrote:

> Now, SGML purists would argue that, in a structured document, these "atomic"
> object types and sub-types must be assigned names that describe their
> content. Thus, if there are 25 content types, there would have to be 25 element
> names for text paragraphs, 25 names for figure captions, 25 names for
> bulleted paragraphs, and so on (and on and on and on, reductio ad absurdum).

I consider myself an SGML purist and I would never think of doing that.

> I contend that this is not only unnecessary but also self-defeating. Elements
> at the "atomic" level should be given names that describe their
> objectness (i.e., object type/subtype), which is distinctly different from
> formatting information. For example, Bullet_Item describes a paragraph
> of sub-type bulleted item. If there is a compelling need (unlikely) to describe
> the content type at this low level, then it should be done by assigning one
> or more attributes for that purpose.

I don't get what you mean by "content type" - you don't mean providing information
about the context, do you? Surely you're not suggesting that it's common practice
to declare two elements such as PrefacePara and IntroPara that have identical
content models but are intended to be formatted differently? I doubt you'd find
that to be very common in serious DTDs, though it can happen as a transitional
measure.

> Incidentally one of the odd things about SGML purists is the way they
> cling to the idea that a single (usually cryptic) element name is sufficient
> to describe its content. Usually, content has many different facets.
> It makes more sense (to me at least) to provide attributes for this purpose.
> Not only does this approach to describing content make more sense,
> it also makes the DTD much simpler, and less vulnerable to the impact
> of evolving technologies and processes.

As well as less reusable. Why wouldn't you describe something as simply as possible
and derive information from the context that it's used in? (Obviously discussions
get bogged down as we each consider our own implementations - can you provide an
example?)
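
To show what I mean by deriving information from context, here is a minimal
sketch in Python. The element names (Book, Preface, Intro, Para) and the sample
text are made up for illustration and are not taken from any real DTD. A single
generic Para is used everywhere, and the application recovers the context from
the element's ancestors:

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: one generic Para element reused in two contexts.
SAMPLE = """
<Book>
  <Preface><Para>Welcome.</Para></Preface>
  <Intro><Para>Scope of this manual.</Para></Intro>
</Book>
"""

def paras_with_context(xml_text):
    """Return (ancestor path, text) for every Para, in document order."""
    root = ET.fromstring(xml_text)
    found = []

    def walk(elem, path):
        here = path + [elem.tag]
        if elem.tag == "Para":
            # The path of ancestors is the context; Para itself stays generic.
            found.append(("/".join(path), elem.text))
        for child in elem:
            walk(child, here)

    walk(root, [])
    return found

for context, text in paras_with_context(SAMPLE):
    print(context, "->", text)
```

Nothing in the data says "PrefacePara" or "IntroPara"; the distinction falls
out of where the Para sits.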

> STRUCTURED DOCUMENTS BEGIN TO DIFFER FROM
> UNSTRUCTURED ONES AT THE "MOLECULAR" LEVEL
> Here, groups of "atomic" elements are wrapped in containers.
> For example, a sequence of Bullet_Item elements would be wrapped
> in a BulletList container, the elements that compose a Figure
> (e.g., a Graphic element preceded or followed by a Figure_Caption
> element) would be wrapped in a Figure container, and so on.
> Although there are exceptions, most molecular-level container elements
> of the types I'm describing here are actually "super objects"
> that ought to also be given names that describe their objectness, not their
> content. If necessary at this level, attributes should be used to
> describe content.

You're losing me - are you using "content" to mean "formatting"? How do you
categorise a "super object"? Is a para that contains nothing but a thousand cross
references a super object? How about a para that contains only data characters?

> THE ADVANTAGES OF UNIVERSAL BUILDING BLOCKS
> The atomic and molecular elements described so far are the
> universal building blocks of any structured document, no matter what
> variations in content-oriented superstructure are imposed by different
> DTDs. Ideally, everyone would agree on definitions and naming conventions
> for them so that this core element set could become common to all
> future DTDs.

Maybe I just don't understand, but it seems that you're advocating the DocBook
approach of an uberDTD. Get your data into this structure and you won't have to
worry about formatting it...

> USING ATTRIBUTES TO SPECIFY FORMATTING
> Element context alone is usually not enough to define formatting. In my
> EDD/DTD designs, I use formatting attributes at all levels of structure, and
> the combination of element context and attribute values determines the
> formatting.
>
> For example, formatting attributes for the ubiquitous Para element
> might include:
> ParaStyle Attribute
>    Plain (default)
>    Bold
>    Italics
>    Underlined
>    Message (uses Courier font)
> TextSize Attribute
>    Large (2 points larger than regular).
>    Regular--the font size in the default paragraph format (default).
>    Small (2 points smaller than regular).
> Width Attribute
>    Across All Columns--text spans the sidehead and normal text columns.
>    Normal--the text appears in the normal text column (default).
> Alignment Attribute
>    Left (default)
>    Centered
>    Right
> The TblCellVertAlign Attribute - Para elements contained in a table cell
> have their vertical alignment within the cell specified, as follows:
>    Top (default)
>    Middle
>    Bottom
>
> I know this approach gives SGML purists fits, but it allows the author to
> deploy a single element named Para in virtually any context where a text
> paragraph is needed.
> This approach, at least to me, makes more sense than using processing
> instructions or other obtuse techniques to specify formatting.

Why hardcode the formatting values into the data when you can code them into the
application responsible for rendering it? What do you do when you want to use the
same data in a different medium? Of course this gives purists fits - so does the
thought of a world built on HTML. I don't object to the idea of a formatting DTD
per se, but it's only SGML by coincidence. You state at the top of your paper that
"... the subject of this paper is information design, not document design" - that
seems to contradict what you argue above.
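
As a rough illustration of keeping the formatting in the rendering application
rather than in the data, here is a small Python sketch. The style tables, the
medium names and the element names are all hypothetical; the point is only that
the same unadorned Para can be rendered differently per medium without touching
the document:

```python
import xml.etree.ElementTree as ET

# Hypothetical document: no formatting attributes anywhere.
DOC = """
<Book>
  <Preface><Para>Welcome.</Para></Preface>
  <Intro><Para>Scope of this manual.</Para></Intro>
</Book>
"""

# Hypothetical per-medium style tables, keyed by (parent tag, element tag).
# These live in the application, not in the data.
STYLES = {
    "print": {
        ("Preface", "Para"): "italic 10pt",
        ("Intro", "Para"): "plain 10pt",
    },
    "screen": {
        ("Preface", "Para"): "plain 14px",
        ("Intro", "Para"): "plain 14px",
    },
}

def render(xml_text, medium):
    """Return (style, text) pairs for one medium, in document order."""
    root = ET.fromstring(xml_text)
    table = STYLES[medium]
    out = []
    for parent in root.iter():
        for child in parent:
            style = table.get((parent.tag, child.tag))
            if style is not None:
                out.append((style, child.text))
    return out

for medium in ("print", "screen"):
    print(medium, render(DOC, medium))
```

Adding a third medium means adding a third table; the data is never edited.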


--
Regards,

Marcus Carr                      email:  mrc@allette.com.au
___________________________________________________________________
Allette Systems (Australia)      www:    http://www.allette.com.au
___________________________________________________________________
"Everything should be made as simple as possible, but not simpler."
       - Einstein


