Markup vs. Formatting (was Exporting autonumbers, suffixes, and prefixes to SGML or XML)

At 09:04 AM 11/6/00 +0100, a subscriber to Free Framers List wrote:
>Setting up the API to export autonumbers to *ML is a no-brainer, as far as
>programming is concerned.  However, I question the wisdom.  Doesn't that count
>as formatting - hence as something that does *not* belong in the markup?  The
>argument against it is not that Maker will be confused about adding numbers
>to already numbered paragraphs...  That can be controlled by formatting rules.

1. I suggested that there ought to be an option in FM+SGML to select
whether autonumbers, suffixes, and prefixes should be included on export
to SGML or XML.

2. If autonumbers, prefixes, and suffixes are placed in elements by
themselves, with the associated section titles (for instance) placed in
separate elements, then there is no conflict when the *ML instances
are round-tripped, because I can use read/write rules to drop the
content of those elements on import. Format rules in the EDD would
then restore the correct autonumbering, prefix, or suffix to the
imported document. An element containing only an autonumber would be
given a run-in paragraph format by the EDD format rules when it
is inserted before the element containing the accompanying text.
Elements containing only a prefix or suffix would be defined as
text range elements which have prefix rule(s) that specify text
string(s). These text range elements are inserted before or after
the element containing the text that has a prefix or suffix prepended
or appended to it.
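
To make the round trip in point 2 concrete, here is a minimal Python
sketch of what the import-side drop amounts to. It is not FrameMaker
read/write rule syntax: the element names SecNum and Title and the
function drop_generated_text are hypothetical, and the standard
xml.etree.ElementTree module merely stands in for the importer.

```python
import xml.etree.ElementTree as ET

# Hypothetical exported instance: the autonumber sits in its own element,
# separate from the title text it accompanies.
EXPORTED = """<Section>
  <SecNum>2.1</SecNum>
  <Title>Read/Write Rules</Title>
</Section>"""


def drop_generated_text(root, generated_tags=("SecNum",)):
    """Empty every element whose content the format rules will regenerate."""
    # Snapshot the element list first so removing children is safe.
    for elem in list(root.iter()):
        if elem.tag in generated_tags:
            elem.text = None
            for child in list(elem):
                elem.remove(child)
    return root


root = drop_generated_text(ET.fromstring(EXPORTED))
imported = ET.tostring(root, encoding="unicode")
print(imported)
```

The SecNum element survives the import but arrives empty, so the
formatter is free to supply the number again, while the Title content
passes through untouched.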

3. For example, I use a special text range element
to prepend security classification marks to paragraphs.
An attribute specifies what the security classification is, and the
EDD prefix rules for that element specify the corresponding string
(e.g. (U), (C), (S), (TS), corresponding to Unclassified, Confidential,
Secret, and Top Secret). Also, the attribute values for
this element are used to determine the highest classification level on each
page, and that attribute value appears in FM+SGML as part of the running
header on each page. It is usually vital that these security marks
be properly exported to SGML or XML, but when I export such
documents, the security marks do not appear in the text.
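
As a rough illustration of what I would expect the export to do with
these marks, the following Python sketch expands each attribute value
into its prefix string and works out the highest level on a page. The
attribute values U, C, S, and TS, the function names, and the sample
page are all assumptions made for the example; only the mark strings
themselves come from the description above.

```python
# Prefix strings as described above; the attribute values are hypothetical.
MARKS = {"U": "(U)", "C": "(C)", "S": "(S)", "TS": "(TS)"}
ORDER = ["U", "C", "S", "TS"]  # lowest to highest classification


def export_paragraph(classification, text):
    """Return the paragraph text with its security mark written out."""
    return f"{MARKS[classification]} {text}"


def highest_classification(classifications):
    """Return the highest classification among the given attribute values."""
    return max(classifications, key=ORDER.index)


# Hypothetical page: (attribute value, paragraph text) pairs.
page = [
    ("U", "General overview."),
    ("S", "Deployment details."),
    ("C", "Contact list."),
]

lines = [export_paragraph(level, text) for level, text in page]
header = MARKS[highest_classification(level for level, _ in page)]
```

With the marks written into the exported text this way, a reader of
the SGML or XML instance sees the same classification on each
paragraph that the FM+SGML reader sees.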

4. Another example. In FM+SGML cross-reference formats often contain
paragraph, figure, table, or step numbers. When I export such documents
to SGML or XML, the content (including the number) is preserved in the
cross-reference text, but the numbers in the source which those
cross-references refer to are omitted from the exported *ML instance.
Anything a cross-reference can refer to is content; thus autonumbers
are content, not formatting, and the source numbers must be preserved
on export to XML or SGML.

5. There is nothing in either the SGML or XML standards about excluding
formatting information from markup. If it can be represented within the
standard markup, it's ok. For instance, metadata, in the form of
attributes, are used all the time to specify what is undeniably
formatting information (e.g., most of the attributes in the CALS table
model, and the attributes in graphic elements). Text range elements used
within paragraphs to specify (among other things) modifications to the
default formatting of the paragraph are another example. List container
elements typically have an attribute that specifies the list type (e.g.,
bulleted or numbered), which is formatting information for the items
within the list. In many cases, element names themselves are used to
specify how the content of those elements should be formatted when the
text is viewed in an SGML browser. The other way formatting information
can be specified within the markup is by means of processing
instructions.

6. The SGML purists who argue that formatting information should be
rigorously excluded from markup cannot successfully enforce their own
rule, as shown in 5 above, and it is utter nonsense to maintain that
those purists are simply trying to enforce the original intent of the
standards. Whatever is allowed in the markup, whether it be content or
formatting metadata, is legal and parsable, and that's all that counts.

7. In the end, all SGML and XML instances require at least some minimal
formatting information in the markup to make the content useful to human
readers, and the middleware (e.g., DSSSL, FOSI, CSS, or XSL) that does
the formatting must rely on that information. And that includes explicit
formatting information, which typically resides in attributes. The nice thing
about this arrangement is that the middleware can ignore or modify some
or all of the formatting instructions, as might be the case when the
content is being delivered to a non-human user.
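
A small Python sketch of that last idea, offered only as an
illustration: a consumer that has no use for presentation can simply
discard the formatting attributes and keep everything else. The
attributes align, colsep, and rowsep belong to the CALS table model;
treating morerows as structural rather than formatting, and the name
strip_formatting, are my assumptions for the example.

```python
import xml.etree.ElementTree as ET

# CALS table attributes treated here as pure formatting (an assumption).
FORMATTING_ATTRS = {"align", "colsep", "rowsep"}

# Hypothetical table entry carrying both formatting and structure.
CALS_ENTRY = '<entry align="center" colsep="1" morerows="1">Total</entry>'


def strip_formatting(root, names=FORMATTING_ATTRS):
    """Delete the named formatting attributes from every element."""
    for elem in root.iter():
        # The intersection is a new set, so deleting while looping is safe.
        for name in names & set(elem.attrib):
            del elem.attrib[name]
    return root


entry = strip_formatting(ET.fromstring(CALS_ENTRY))
print(ET.tostring(entry, encoding="unicode"))
```

Nothing is lost by having had the attributes in the markup; the
consumer that does not want them pays one pass over the tree to be rid
of them.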

8. I would argue that anything I can write on a yellow-lined pad or type as
ASCII text in (say) notepad is content, and that certainly includes
things such as section or paragraph numbers, step numbers,
bulleted items, etc. The fact that I use EDD format rules and paragraph
formats in FM+SGML to automatically produce certain types of content
does not convert that information from content to format.

| Nullius in Verba |
Dan Emory, Dan Emory & Associates
FrameMaker/FrameMaker+SGML Document Design & Database Publishing
Voice/Fax: 949-722-8971 E-Mail: danemory@primenet.com
10044 Adams Ave. #208, Huntington Beach, CA 92646