Re: Scanning advice needed ! (somehow long...) Message #2 Posted by Vieira, Luiz C. (Brazil) on 23 Feb 2004, 9:54 a.m., in response to message #1 by Valentin Albillo
Hi, V;
I'm glad you're going to share your treasures with us here, and I can't help telling you I'm very curious about it all. Thanks!
Here is what I can tell you from my own experience with scanning and OCR processing (not that much, but with about 90% successful results). I list some procedures of mine that I find useful; I hope they help you, too.
- try sorting the originals by type: typewritten, with and without graphics (treating graphics as either simple line art or complex gray-scale images), and handwritten, in colored or black pen;
- originals with high contrast give better results if scanned directly with the B&W default; originals with faded images should be scanned in 256-level gray scale OR in color. This is because it's easier to work with existing images that have a higher number of colors, which you can reduce later, than to rescan them many times until you find what works best;
- a series of "What... if"s is not unusual when you are first trying; for sure, after a few successful results, you'll settle on your own intuitive parameters, and that's gold;
- dpi resolution is another issue: 300 dpi is a sort of "all-weather" choice. Most regular printed material will do fine at 300 dpi. Lower resolutions are a matter of "What... if". Higher resolutions are worth it only if the originals use small type (like the pages of the 82104A Card Reader Owner's Handbook with the HP67/97 compatibility keystrokes) or have details that do not look good at the selected resolution;
- KEEP ALL OF YOUR ORIGINAL SCANS SAFE and process copies of them; if you decide that what you've done is not what you want, there will be no need to scan again. If nothing you do gives good enough images, start from scratch and scan the paper originals with a different setting (higher dpi, gray scale instead of B&W, etc.). Naming directories after the dpi and color scheme used will also help you choose the best result. Do not delete your first scans right away; keep them in a "trash" directory until the job is done and you are sure they are useless. You know that unwrinkling some thrown-away paper sometimes saves people's lives... or may condemn them ;^)
- the final e-document should use the lowest number of colors that you judge still represents the original images well. In some cases you may see a 90% reduction in size without any visible loss of quality, just by reducing the # of colors. You can reduce the # of colors with whatever software you use to process the scanned image. Although the number of colors is almost always a target for reduction, the dpi resolution should not be altered unless you actually need to. IF you want to reduce the dpi of an existing e-document (and so its size), KEEP THE ORIGINAL IMAGE and compare results. Enlarging a reduced image back to its original size will never bring the pixels back (of course, this does not apply if your image processor offers an UNDO command), but there's no way to "see" the results of reducing an image if you don't actually do it;
- about printing and viewing: most of the time, WYSIWYG (What You See Is What You Get) does not apply, because there are monitors and monitors, and printers and printers. The same e-doc may look much better on screen than in print on one system, and the opposite on another. So, do what pleases you most on your system, then ask some friends to try it out and give you feedback. That is another source of information;
- final format is another issue: images that look great may lose quality when saved with some compression techniques. Some of the formats commonly used for images (JPEG in particular) allow compression rates that may simply "destroy" what you have at the original resolution. What I can tell you is that B&W images (one bit per dot) are smaller when saved as PCX or TIFF (I never tried compression with the BMP format... to be honest, I don't even know if it's possible); gray-scale images also fit well as PCX and GIF, although in my experience GIF results in larger files for the same source images; and although many guys hate high-compression JPEG images (they tend to generate "minute particles" around edges), you may find a reasonable balance between image quality, compression rate and file size.
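To put rough numbers on the dpi and color-depth points in the list above, here is a minimal Python sketch. It is only an illustration under stated assumptions: the 8.5 x 11 inch page size, the mode names (`bw`, `gray256`, `color24`), the padding of rows to whole bytes, and the `300dpi_gray256`-style directory naming are all made up for the example and do not come from any particular scanner software. The sizes are for raw, uncompressed pixel data, so real files will differ depending on the format and compression used.

```python
from pathlib import Path

# Bits per pixel for three scan modes (assumed names and values:
# 1-bit B&W, 256-level gray scale, 24-bit color).
BITS_PER_PIXEL = {"bw": 1, "gray256": 8, "color24": 24}


def raw_size_bytes(width_in, height_in, dpi, mode):
    """Uncompressed size in bytes of one scanned page."""
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    bits_per_row = width_px * BITS_PER_PIXEL[mode]
    # Assumption: each pixel row is padded up to a whole number of bytes.
    bytes_per_row = (bits_per_row + 7) // 8
    return bytes_per_row * height_px


def scan_dir(root, dpi, mode):
    """Hypothetical naming scheme: one directory per dpi/color setting."""
    return Path(root) / f"{dpi}dpi_{mode}"


if __name__ == "__main__":
    # One 8.5 x 11 inch page at 300 dpi in each mode.
    for mode in BITS_PER_PIXEL:
        size = raw_size_bytes(8.5, 11, 300, mode)
        print(scan_dir("scans", 300, mode), size, "bytes")
```

For that assumed page at 300 dpi, the raw data comes to 1,052,700 bytes in B&W, 8,415,000 bytes in 256-level gray and 25,245,000 bytes in 24-bit color. So going from gray scale to B&W removes roughly 87% of the raw data, and going from color to B&W roughly 96%, before any compression is applied.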
About PDF generation and image storage: I think PDF is a great distribution format, and it somehow "protects" ownership. I found that the best Windows-based SW (I think) for composing images into a final PDF "booklet" is Imaging, the standard Windows image/scanner manager. I don't like using Imaging to process images or to drive scanners, but I like it for composing a set of images into pages of the same document. It shrinks and stretches images of different sizes to fit a default page size, shrunk images do not lose resolution in the final PDF (if you have the MoHPC CDs, have a look at the Portuguese version of the 82104A Owner's Handbook, HP67/97 compatibility), and the final size is fairly acceptable. I don't know the image processors used on other platforms (Mac, Linux-based PCs, etc.), but I know that generating a PDF from an original doc under Linux is a standard (default) procedure, whereas you'll need some extra, non-standard plug-ins to generate PDF in Windows.
Wow! I wrote too much. When we write too much, there is always room for errors... Anyway, I think I did not forget the main themes.
I'd like to add that this is all based on an original text I prepared (in Portuguese) for my friend José Ernesto, and I'm not sure I actually sent it at the time I wrote it... Zé, if I did not, please forgive me... <:^(
Hope this helps you, Valentin.
Cheers.
Luiz