
Re: Variations in importing SGML docs into FM+SGML



Dan wrote:

++++++++++++++++++++++++++++++++++++++++++++++++++
. . . .snip. . . .
The variations in structure and content were as follows:

1. Using Prefix rules in the EDD to produce lead-in titles for each field
   (file 1 only).

2. Using entity references converted to variable definitions to produce
   those same lead-in titles (files 2 thru 5).

3. Putting each of the 23 fields in a separately named text range
   container (files 1 & 2; see the sketch after this list).

4. Concatenating all 23 fields in a single text range container
   (files 3 & 4).

5. Reducing the number of entity references by 50% (files 4 & 5).

6. Wrapping one field in a separate text range container, and
   concatenating the remaining 22 fields in a single text range
   container (file 5).
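
To make variations 1 through 4 concrete, the SGML source might look
roughly like this (the element and entity names here are invented for
illustration; they are not from my actual files):

   <!-- Variation 2: lead-in titles as entities (declared in the
        DTD), which FM+SGML converts to variable definitions -->
   <!ENTITY name.title  "Name: ">
   <!ENTITY phone.title "Phone: ">

   <!-- Variation 3: each field in its own text range container -->
   <name>&name.title;John Smith</name>
   <phone>&phone.title;555-0100</phone>

   <!-- Variation 4: all fields concatenated in one container -->
   <fields>&name.title;John Smith &phone.title;555-0100 ...</fields>

   <!-- Variation 1 (file 1): no entities at all; the lead-in title
        comes from a Prefix rule in the EDD, so the instance is just -->
   <name>John Smith</name>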

Each of the 5 files used an extremely simple DTD/EDD (the only
differences resulting from variations 1, 3, 4, and 6). Each file, upon
import into FM+SGML, produced identical 16-page printed outputs.

As a benchmark comparison, I used a very complex 53-page structured
document (tables, graphics, very complex text structures) created in
FM+SGML, using an extremely complex EDD (190 pages, including 35 pages
of format change lists). This document contains about 2,000 elements,
many of which have numerous attributes. I then exported this document
to SGML, producing a 202K SGML file. When this file was imported into
FM+SGML (replicating the original document), the import time was only
90 seconds. If there is some overhead with loading any file, and/or if
the complexity of the document contributes to the load time, then it
would have been most apparent in the benchmark document, which is the
most complex by far, and also the smallest in SGML file size.
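
To spell out the arithmetic: 2,000 elements loaded in 90 seconds works
out to 2,000 / 90, or roughly 22.2 elements/sec, which is the benchmark
figure in the table below.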

So, adding in the benchmark file, the element loading rate
(elements/sec) is as follows:

FILE            LOAD RATE (ELEMENTS/SEC)     FILE SIZE
Benchmark               22.22                   202K
1                       28.57                   400K
2                        5.714                  510K
3                        6.666                  360K
4                        6.666                  345K
5                        5.416                  390K

Now, it becomes apparent that file size is a major determinant, and
that FM+SGML may be hitting some kind of wall around a file size of
400K. It's also apparent that SGML docs with lots of entity references
take longer to load, but file size seems to be equally important:
file 4 has 50% fewer entity references than file 3, yet the load rate
is the same.

File 1 seems to be anomalous, and the only explanation I have is that the
absence of entity references makes a big difference.
+++++++++++++++++++++

Why is the benchmark file not considered "anomalous"? It has
no entity references either.

Comparing two completely different files - the benchmark and any one
of the others - is comparing apples and oranges. You can draw no
conclusions about file size and performance from this comparison. Your
conclusion that the absence of entity references makes a big
difference in performance makes sense, since the absence of entity
references is about the only factor the benchmark file and file #1
have in common.

Janice

