
Variations in importing SGML docs into FM+SGML



Wide variations in the time required to import SGML document instances into
FM+SGML have been observed:

WIN platform (266 MHz CPU)
Memory Size: 32 MB
Memory Read/Write Cache: 2048 KB
Virtual Memory: 111 MB (temporary)
FM+SGML version 5.1.1

TEST FILES:
Benchmark SGML file: 202 KB, containing about six graphic entities, plus
complex tables and text structures, using a very complex DTD/EDD. The EDD
has 190 pages, including about 35 pages of format change lists, and a file
size of 2.4 MB.

Test SGML file #1: 400 KB, containing 600 biographical records (about 6
lines each) extracted from a conventional database and tagged to produce
SGML. Each biographical record contains up to 23 concatenated data fields
(each field is contained in a descriptively named SGML element). The DTD is
quite simple. The EDD (10 pages, 176 KB) contains prefix rules that specify
the lead-in titles that precede some of the biographical data fields (e.g.,
"Education:", "Address:", "Phone:", "Fax:", "E-mail:").

Test SGML file #2: 510 KB, containing the exact same 600 biographical
records as Test File #1, and using the same DTD. However, instead of using
EDD prefix rules to specify the lead-in titles, the SGML elements contain
entity references (e.g., &Educ; &Addr; &Ph; &Fx; &Eml;) to produce those
titles. The SGML document instance contains internal entity declarations for
these entities of the form:

	<!ENTITY Educ "FM variable: Educ">

For each such entity, the template used for import has a variable definition
that produces the corresponding lead-in title.

ANALYSIS OF THE THREE FILES:
Test Files #1 and #2 are identical, with the exception that Test File #2
has the added entity references and entity declarations, which account for
the 110 KB difference in the size of the two files.

The Benchmark SGML file produces, on import to FM+SGML, a richly structured
and formatted 53-page document.

Test Files 1 and 2 both produce, on import to FM+SGML, identical documents
containing the 600 biographical records in 6.5-point type. The structure and
formatting are simple. The EDD has no format change lists, and very simple
format rules.

HERE ARE THE TIMES IT TAKES TO COMPLETE THE IMPORT-TO-FM+SGML ACTION:

Benchmark SGML file: 53 pages in 90 seconds to produce a 1.0 MB FM+SGML file.

Test SGML File #1: 16 pages in about 7 minutes to produce a 2.3 MB FM+SGML
file.

Test SGML File #2: 16 pages in about 35 minutes to produce a 2.3 MB FM+SGML
file.

All tests were conducted several times, with nothing running but FM+SGML.
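
To put the three runs on a common footing, the reported figures can be
converted to throughput. The following is only a rough Python sketch: the
"about" times are treated as exact, and the names RUNS, throughput, and
report are made up for this illustration.

```python
# Sizes, page counts, and times as reported above.
# Assumption: "about 7 minutes" and "about 35 minutes" are taken as exact.
RUNS = {
    "Benchmark": {"sgml_kb": 202, "pages": 53, "seconds": 90},
    "Test File #1": {"sgml_kb": 400, "pages": 16, "seconds": 7 * 60},
    "Test File #2": {"sgml_kb": 510, "pages": 16, "seconds": 35 * 60},
}


def throughput(run):
    """Return (KB of SGML imported per second, seconds per output page)."""
    kb_per_second = run["sgml_kb"] / run["seconds"]
    seconds_per_page = run["seconds"] / run["pages"]
    return kb_per_second, seconds_per_page


def report():
    """Build one summary line per test run."""
    lines = []
    for name, run in RUNS.items():
        kb_per_second, seconds_per_page = throughput(run)
        lines.append(
            f"{name}: {kb_per_second:.2f} KB/s, {seconds_per_page:.1f} s/page"
        )
    return lines


if __name__ == "__main__":
    print("\n".join(report()))
```

Seen this way, the benchmark file imports at a little over 2 KB of SGML per
second, while both test files fall below 1 KB per second.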

CONCLUSIONS
From the foregoing, it would appear that:

1. The complexity of the EDD has little impact on import time.

2. A doubling of SGML file size from 200 KB to 400 KB increases the import
time by a factor of at least 4.6 (from 90 seconds to about 7 minutes).

3. The use of prefix rules in the EDD produces a 5-fold reduction in import
time compared to the use of entity references for the exact same purpose.
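
The factors quoted in conclusions 2 and 3 follow directly from the measured
times. A small Python sketch of the arithmetic, again treating the "about"
times as exact (the constant and function names are invented for this
illustration):

```python
# Import times in seconds, taken from the timings reported above.
# Assumption: the approximate minute figures are treated as exact.
BENCHMARK_SECONDS = 90
TEST1_SECONDS = 7 * 60    # prefix rules in the EDD
TEST2_SECONDS = 35 * 60   # entity references mapped to FM variables

BENCHMARK_KB = 202
TEST1_KB = 400


def ratio(larger, smaller):
    """How many times larger the first value is than the second."""
    return larger / smaller


# Conclusion 2: roughly double the SGML size, much more than double the time.
size_factor = ratio(TEST1_KB, BENCHMARK_KB)
time_factor = ratio(TEST1_SECONDS, BENCHMARK_SECONDS)

# Conclusion 3: same records, prefix rules versus entity references.
entity_penalty = ratio(TEST2_SECONDS, TEST1_SECONDS)

if __name__ == "__main__":
    print(f"size grew by a factor of {size_factor:.2f}")
    print(f"import time grew by a factor of {time_factor:.2f}")
    print(f"entity references cost a factor of {entity_penalty:.1f}")
```

The time factor comes out at about 4.7 and the entity-reference penalty at 5.
Note that the first comparison is between two different documents, so the
size effect cannot be separated from differences in content.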

Does anyone have an explanation for these wide variations in import times?
     ____________________
     | Nullius in Verba |
     ********************
Dan Emory, Dan Emory & Associates
FrameMaker/FrameMaker+SGML Document Design & Database Publishing
Voice/Fax: 949-722-8971 E-Mail: danemory@primenet.com
10044 Adams Ave. #208, Huntington Beach, CA 92646
---Subscribe to the "Free Framers" list by sending a message to
   majordomo@omsys.com with "subscribe framers" (no quotes) in the body.