Variations in importing SGML docs into FM+SGML



Dan,

Concerning the different opening times for files #2 and #3,
there is a known issue (#239927) with importing files with a
lot of entity references which are converted to variables in
version 5.1.1. This problem was resolved in version 5.5.6.

You compared the opening times for files #1 and #2, and
suggested that the difference is related to file size. I don't
believe this is a valid comparison since, as you pointed out,
the files are extremely different in content. Without examining
the documents and their respective applications, I can't
speculate about opening time performance.

Janice
 
*****************************
Wide variations in the time required to import SGML document instances
into FM+SGML have been observed:

WIN platform (266 MHz CPU)
Memory Size: 32 MB
Memory Read/Write Cache: 2048 KB
Virtual Memory: 111 MB (temporary)
FM+SGML version 5.1.1

TEST FILES:
Benchmark SGML file: 202 KB, containing about six graphic entities, plus
complex tables and text structures, using a very complex DTD/EDD. The EDD
has 190 pages, including about 35 pages of format change lists, and a file
size of 2.4 MB.

Test SGML file #1: 400 KB, containing 600 biographical records (about 6
lines each) extracted from a conventional database and tagged to produce
SGML. Each biographical record contains up to 23 concatenated data fields
(each field is contained in a descriptively named SGML element). The DTD
is quite simple. The EDD (10 pages, 176 KB) contains prefix rules that
specify the lead-in titles that precede some of the biographical data
fields (e.g., "Education:", "Address:", "Phone:", "Fax:", "E-mail:").

Test SGML file #2: 510 KB, containing the exact same 600 biographical
records as Test File #1, and using the same DTD. However, instead of using
EDD prefix rules to specify the lead-in titles, the SGML elements contain
entity references (e.g., &Educ; &Addr; &Ph; &Fx; &Eml;) to produce those
titles. The SGML document instance contains internal entity declarations
for these entities of the form:

	<!ENTITY Educ "FM variable: Educ">

For each such entity, the template used for import has a variable
definition that produces the corresponding lead-in title.
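
To gauge how many entity references an instance such as Test File #2
contains, a rough count can be made with a short script. This is only a
minimal sketch and was not part of the tests described here: the regular
expression is a simplification of SGML entity reference syntax (it assumes
every reference ends with ";"), and the sample record and its element
names are invented for illustration.

```python
import re
from collections import Counter

# Simplified pattern for a general entity reference: "&name;".
# Assumption: the closing ";" is always present, which SGML does not
# strictly require in every context.
ENTITY_REF = re.compile(r"&([A-Za-z][A-Za-z0-9.-]*);")


def count_entity_refs(sgml_text):
    """Return a Counter mapping entity name to number of references."""
    return Counter(ENTITY_REF.findall(sgml_text))


# Hypothetical records; element names are made up for this example.
sample = (
    "<educ>&Educ; MIT</educ><addr>&Addr; 1 Main St</addr>"
    "<ph>&Ph; 555-0100</ph><educ>&Educ; Yale</educ>"
)

counts = count_entity_refs(sample)
for name, n in counts.most_common():
    print(f"{name}: {n}")
print("total references:", sum(counts.values()))
```

Running the same function over the full text of the SGML file would give
the number of references that have to be converted to variables on import.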

ANALYSIS OF THE THREE FILES:
Test Files #1 and #2 are identical, with the exception that Test File #2
has the added entity references and entity declarations, which account for
the 110 KB difference in the size of the two files.

The Benchmark SGML file produces, on import to FM+SGML, a richly
structured and formatted 53-page document.

Test Files #1 and #2 both produce, on import to FM+SGML, identical
documents containing the 600 biographical records in 6.5-point type. The
structure and formatting are simple. The EDD has no format change lists,
and very simple format rules.

HERE ARE THE TIMES IT TAKES TO COMPLETE THE IMPORT-TO-FM+SGML ACTION:

Benchmark SGML file: 53 pages in 90 seconds to produce a 1.0 MB FM+SGML
file.

Test SGML File #1: 16 pages in about 7 minutes to produce a 2.3 MB FM+SGML
file.

Test SGML File #2: 16 pages in about 35 minutes to produce a 2.3 MB
FM+SGML file.

All tests were conducted several times, with nothing running but FM+SGML.

CONCLUSIONS
From the foregoing, it would appear that:

1. The complexity of the EDD seems to have little impact on import time.

2. A doubling of SGML file size from 200 KB to 400 KB increases the import
time by a factor of at least 4.6.

3. The use of prefix rules in the EDD produces a 5-fold reduction in
import time compared to the use of entity references for the exact same
purpose.
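
The ratios behind conclusions 2 and 3 can be rechecked from the reported
figures. The sketch below is only an arithmetic check: it treats the
"about 7 minutes" and "about 35 minutes" timings as exact, and the
dictionary layout and function names are chosen for illustration.

```python
# Sizes and import times as reported above; approximate timings are
# treated as exact for the purpose of this check.
tests = {
    "Benchmark": {"size_kb": 202, "seconds": 90},
    "Test #1": {"size_kb": 400, "seconds": 7 * 60},
    "Test #2": {"size_kb": 510, "seconds": 35 * 60},
}


def time_ratio(slower, faster):
    """How many times longer `slower` took to import than `faster`."""
    return tests[slower]["seconds"] / tests[faster]["seconds"]


def throughput(name):
    """SGML source kilobytes imported per second."""
    return tests[name]["size_kb"] / tests[name]["seconds"]


for name in tests:
    print(f"{name}: {throughput(name):.2f} KB/s")

print(f"Test #1 vs Benchmark: {time_ratio('Test #1', 'Benchmark'):.2f}x")
print(f"Test #2 vs Test #1:   {time_ratio('Test #2', 'Test #1'):.2f}x")
```

Expressing the results as throughput as well as ratios makes it easier to
see that the Benchmark file imports far faster per kilobyte than either
test file.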

Does anyone have an explanation for these wide variations in import times?

