To: EMMY_ARICIOGLU@xxxxxxxxxxxxxxxxxxxxxxxxxx, FrameUsers List <Framers@xxxxxxxxxxxxxx>, Frame List <Framers@xxxxxxxxx>
Subject: Re: Conversion Utility Wanted (scanned text --> live text)
From: Jay Smith <jay@xxxxxxxxxxxx>
Date: Tue, 11 May 1999 15:46:13 -0400
Organization: Jay Smith & Associates
References: <H00018080e2833a8@MHS>
Sender: owner-framers@xxxxxxxxx
Emmy,

When you say that your scanner can only produce .bmp and .pdf files, I suspect that it is the DRIVER, not the SCANNER, that has this limitation. The first thing that I would do is check with the scanner manufacturer to see whether an updated scanner driver is available.

To get your "page images" into workable text, you need to run them through an OCR (optical character recognition) program. [Note that OCR is only going to be 95-99.9% accurate; the result will need to be proofed.] The only one with which I am personally familiar is OmniPage (I think made by Caere??), but there are several. GOOD OCR programs are not cheap.

And that leads us back to your scanner. If your scanner is so low-end that you can only get .bmp and .pdf, then that scanner is probably not going to be any more productive/fast (when you include the OCR time and cost as well) than actually TYPING the content. If you have a serious amount of this work to do, you may find that what you need is a fast scanner (minimum $1000 for a good, FAST one) and decent OCR software. Last I checked, OmniPage was a few hundred bucks.

HOWEVER, if you use your existing scanner, you can still use OCR software. What you need to do is convert your .bmp files to 300 dpi .tif (TIFF) files -- which is what I believe most OCR programs prefer to use. There are several programs that can do such conversions, with varying effectiveness. Check to see what is already in your library of image editing programs. Note that there are conversion programs specifically designed to do such batch conversions -- shop around.

AND HOWEVER AGAIN.... The problem with trying to do this scanning/OCR without a better scanner (or a better scanner driver -- probably TWAIN compliant) is this: OCR programs that acquire the pages directly from the scanner understand how to thread the various page images together into a FLOW of text. If you just work with individual .bmp files, YOU will have to connect the flows after OCR'ing them.
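To give a rough idea of what connecting the flows by hand involves, here is a minimal sketch in Python. It assumes each page has already been OCR'ed into its own plain-text file, and that the files are named so that they sort in page order; the names page_001.txt, page_002.txt, etc. and the functions join_pages and join_page_files are made up for illustration and are not part of any OCR product. It rejoins words hyphenated across a page break and guesses at paragraph breaks; the result would still need proofing.

```python
from pathlib import Path


def join_pages(pages):
    """Join per-page OCR text (a list of strings in page order) into one flow."""
    flow = ""
    for text in pages:
        text = text.strip()
        if not text:
            continue  # skip blank pages
        if not flow:
            flow = text
        elif flow.endswith("-") and text[0].islower():
            # Word split across the page break: drop the hyphen and rejoin.
            # (A genuine hyphenated compound would be joined too; proof it.)
            flow = flow[:-1] + text
        elif flow[-1] in ".!?:\"'":
            # Previous page ended a sentence: assume a paragraph break.
            flow += "\n\n" + text
        else:
            # Sentence continues on the next page.
            flow += " " + text
    return flow


def join_page_files(folder, pattern="page_*.txt"):
    """Read the (hypothetically named) per-page files in sorted order and join them."""
    paths = sorted(Path(folder).glob(pattern))
    return join_pages([p.read_text(encoding="utf-8") for p in paths])


if __name__ == "__main__":
    sample = [
        "The scanner driver controls the file for-",
        "mats that are available. Check for an update.",
        "OCR output must still be proofed.",
    ]
    print(join_pages(sample))
```

The page-break rules here are deliberately crude guesses. A page that happens to end on a full stop in mid-paragraph will get a paragraph break it should not have, which is the kind of cleanup a good OCR program driving the scanner directly would spare you.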
If this project is really worth doing via scanning/OCR, it is probably only worth doing right: with a fully functional scanner, a TWAIN-compliant scanner driver, and a good OCR program.

Jay

--
Jay Smith
e-mail: jay@jaysmith.com

The Press for History(tm), The Press for Education(tm),
The Press for [Your Industry](tm), The Press for....(tm)
On-demand printing and binding of hardbound books. Minimum run one copy.

P.O. Box 650
Snow Camp, NC 27349 USA

Phone: Int+US+336-376-9991
Toll-Free Phone in US & Canada: 1-800-447-8267
Fax: Int+US+336-376-6750

EMMY_ARICIOGLU@hp-roseville-om3.om.hp.com wrote:
>
> Howdy Everyone,
>
> Through research we can find older material produced by our company
> that is still useful to us, but the original files no longer exist. We
> would like to be able to scan the old pages, then edit the text into a
> new doc. The scanner we are using is only able to produce bmp and pdf
> files. But we can't seem to manipulate these in any way. We can import
> the images into Word and Frame, but what we want is the actual text so
> that we can update and reuse it.
>
> A utility that would convert scanned text into Word would be fine. The
> engineers could use it and the writers could import Word into Frame.
>
> Can anyone recommend software that would do the job for us?
>
> TIA,
> Emmy
> emmy_aricioglu@hp.com
>

** To unsubscribe, send a message to majordomo@omsys.com **
** with "unsubscribe framers" (no quotes) in the body. **