To: Janice McConnell <jym@xxxxxxxxx>
Subject: Re: Variations in importing SGML docs into FM+SGML
From: Dan Emory <danemory@xxxxxxxxxxxx>
Date: Mon, 22 Mar 1999 10:13:59 -0700 (MST)
Sender: owner-framers@xxxxxxxxx
At 07:14 AM 3/22/99 -0500, Janice McConnell wrote:
>In your email below, you suggested that I was wrong about
>bug#239927 being the cause of performance differences between
>your original test files 1 and 2. I went back and re-checked
>that bug report. Actually, it was reported against version
>5.5, so you may be correct that this is not the cause.
======================================================
OK, so this was a bug in 5.5.x which was fixed in 5.5.6.
==============================================================
>Running a timed test WITH THE SAME FILE on the same machine
>with each version of FM+SGML would resolve the question of
>whether number of entities converted to variables drastically
>slowed performance in versions before 5.5.6.
===========================================================
Yes, that's true, but it's even more interesting to find out how
import times vary when you use the same FM+SGML version to import
the exact same data content while varying one or more factors in
the SGML content model and/or in the EDD format rules.

I've created a test bed for doing that, and the results of that
testing are what I reported in my original post on this subject,
as well as in my reply to your post. The import time for the
benchmark SGML file served only as a beginning reference point.

The data source is a database extract (i.e., an ASCII flat file)
containing 600 records, each having 29 character-delimited fields.
I then use UniMerge and a FrameMaker report template (unstructured)
to merge the flat file and convert it to valid SGML.

Each time I change a factor, I modify the FM report template to
produce valid SGML with the changed factor, and also modify the DTD
and EDD accordingly. In all cases, the output produced by printing
the imported SGML instance from FM+SGML is identical in all respects.
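
The "change one factor, keep the data identical" idea can be sketched in a few lines of Python. This is only an illustration, not the UniMerge/FrameMaker report template workflow described above: the "|" delimiter and the element names (record, field1..fieldN, fields) are assumptions invented for the example, and the output is not validated against any DTD.

```python
from xml.sax.saxutils import escape


def record_to_sgml(line, delimiter="|", wrap_each_field=True):
    """Convert one delimited record into an SGML-like string.

    The element names are hypothetical. wrap_each_field is the single
    factor being varied: one container per field, or one container
    holding all the fields concatenated.
    """
    fields = line.rstrip("\n").split(delimiter)
    if wrap_each_field:
        body = "".join(
            f"<field{i}>{escape(value)}</field{i}>"
            for i, value in enumerate(fields, start=1)
        )
    else:
        body = "<fields>" + escape(" ".join(fields)) + "</fields>"
    return f"<record>{body}</record>"


def count_elements(sgml):
    """Count elements by counting end tags (field data is escaped first)."""
    return sgml.count("</")


if __name__ == "__main__":
    sample = "Smith|Engineer|Boston"
    for wrap in (True, False):
        out = record_to_sgml(sample, wrap_each_field=wrap)
        print(wrap, count_elements(out), len(out), out)
```

Both variants carry the same data, but the element count and the file size differ, and those are the two quantities the test cases in this thread vary.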
=================================================================
>Unless my eyes are crossing again this morning, which they do
>sometimes (i.e. mis-labeling your test and benchmark files),
>the only test that you have run so far which compares files
>with only one difference is your original test of test file
>#1 against test file #2. In every other test that you ran,
>you changed more than one parameter between the test files.
>That's what I meant when I said that you were comparing
>apples and oranges. One cannot conclude that a particular
>factor is causing a slow-down in performance unless all other
>factors are the same.
============================================================
In my reply to your post, I reported the effects on import time of
the following additional changes from the original test SGML file #2,
which took 35 minutes to import. This version had each of the 23 bio
fields in a separately named text range container, and used entity
references to produce the prefixes to each field:

CASE 1. Concatenated all 23 of the bio fields in a single container
element, preserving all the entity references in the original file.
This reduced the element count from about 12,000 to about 2,000 and
produced a major reduction in file size.
RESULT: Reduced the import time from 35 minutes to 5 minutes.

CASE 2. Halved the number of entity references, with everything else
the same as CASE 1, producing a relatively small reduction in file
size.
RESULT: No change in import time from that observed in CASE 1.

CASE 3. Left 22 of the 23 bio fields as a concatenated string in a
single container element, with the same entity references as CASE 2
above, and wrapped the first bio field in a separate container
element, producing a 30% increase in element count (from 2,000 to
2,600), and a relatively small increase in file size.
RESULT: Increased import time to 8 minutes (a 62% increase over that
observed in CASES 1 and 2).
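
For readers who want to check the ratios, here is a small Python sketch that recomputes them from the figures quoted above. It uses only the rounded element counts and minutes given in the cases; the dictionary layout and function names are made up for the example. With the rounded 5 and 8 minute figures the increase comes out at 60%, so the 62% quoted for CASE 3 presumably reflects unrounded timings, which are not given here.

```python
# Rounded figures as quoted in the test cases (approximate element counts).
CASES = {
    "test file #2": {"elements": 12000, "minutes": 35},
    "CASE 1": {"elements": 2000, "minutes": 5},
    "CASE 2": {"elements": 2000, "minutes": 5},
    "CASE 3": {"elements": 2600, "minutes": 8},
}


def pct_change(old, new):
    """Percentage change from old to new."""
    return (new - old) * 100.0 / old


def minutes_per_1000_elements(case):
    """Import minutes per thousand elements for one case."""
    return case["minutes"] * 1000.0 / case["elements"]


if __name__ == "__main__":
    c2, c3 = CASES["CASE 2"], CASES["CASE 3"]
    print("element count change:", pct_change(c2["elements"], c3["elements"]))
    print("import time change:  ", pct_change(c2["minutes"], c3["minutes"]))
    for name, case in CASES.items():
        print(name, round(minutes_per_1000_elements(case), 2))
```

The point of the comparison is that a 30% rise in element count goes with a roughly twice as large rise in import time, while halving the entity references (CASE 1 to CASE 2) changes nothing measurable.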

The common denominators that seem to affect the import time in these
three test cases are file size and element count. Import time seems
to be proportional to file size up to a certain point (a file size
somewhere in the 375 KB to 500 KB range); above that point, the import
time appears to increase almost exponentially with increasing file
size.

 ____________________
| Nullius in Verba   |
 ********************
Dan Emory, Dan Emory & Associates
FrameMaker/FrameMaker+SGML Document Design & Database Publishing
Voice/Fax: 949-722-8971 E-Mail: danemory@primenet.com
10044 Adams Ave. #208, Huntington Beach, CA 92646
---Subscribe to the "Free Framers" list by sending a message to
   majordomo@omsys.com with "subscribe framers" (no quotes) in the body.

** To unsubscribe, send a message to majordomo@omsys.com **
** with "unsubscribe framers" (no quotes) in the body.   **