The Museum of HP Calculators


Scanning and Transmitting files for Document Sets

The following is somewhat long and detailed but please slog through it! If you can't meet all the specifications below, please do what you can and I will post-process your scans. In the past I have post-processed about 70% of the scans I received, which made me the choke point of the whole process, so the closer you can get scans to "perfection" (as defined below) before sending, the more manuals we can get done.

Please contact me before you start to scan so you don't scan a document that someone else is already scanning.

File Formats

TIFF:  TIFF files always seem to work and provide excellent quality, but they are generally very large.  If your software provides compression options, please make sure that "non-lossy" options are selected.

JPEG:  JPEG may also be used if the quality setting is high (low compression). If your software also has chroma subsampling options, please set it for NO subsampling.  Please check a few files before choosing a final compression setting, i.e., write a few files, then read them back in, enlarge the image, and look around text and sharp image transitions for JPEG artifacts.  Note that some newer versions of Paint Shop Pro have a bug where you actually have to select maximum chroma subsampling to get no chroma subsampling. So, when in doubt, read the file back in and see what you actually wrote.

PDF: Sending files as PDFs also works.  In fact, as I have become busier lately, a well-polished/ready-to-go PDF is becoming my preferred format.  Please set the PDF version to be no later than 1.4 (Acrobat 5.x). Here too, please set compression options for reasonably low compression.  PDF creators can have the same issues with compression as we see in JPEG files.  So please make a few tests of your settings and make sure you don't see artifacts around the text. As a general guideline, most of the high quality color manuals in the museum are in the range of 150K per page to 400K per page (at 300DPI). B&W manuals are typically around 30K to 50K per page (at 300 DPI). Also please make sure there is no password/security on the file. Creating PDFs with Adobe products (Acrobat Pro) is preferred, because some PDFs created with other products have had subtle incompatibilities. If you can send it with OCR, that's great too. Usually you set it to keep the image displayed, with the OCR'd text on an invisible layer. This allows searching while keeping the original appearance exactly.
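As a rough way to compare a finished PDF against the per-page sizes quoted above, the following Python sketch computes the average size per page and reports whether it falls below, within, or above the typical range. The function names are hypothetical, "K" is assumed to mean 1024 bytes, and the ranges are only the general guidelines given above, not hard limits.

```python
# Typical sizes quoted above, in KB per page at 300 DPI.
TYPICAL_KB_PER_PAGE = {"color": (150, 400), "bw": (30, 50)}


def kb_per_page(file_size_bytes: int, page_count: int) -> float:
    """Average size per page in KB (assuming 1 KB = 1024 bytes)."""
    if page_count <= 0:
        raise ValueError("page_count must be positive")
    return file_size_bytes / 1024 / page_count


def compare_to_typical(file_size_bytes: int, page_count: int, kind: str) -> str:
    """Return 'below', 'within', or 'above' relative to the typical range."""
    low, high = TYPICAL_KB_PER_PAGE[kind]
    average = kb_per_page(file_size_bytes, page_count)
    if average < low:
        return "below"
    if average > high:
        return "above"
    return "within"


# Example: a hypothetical 100-page color manual saved as a 25,600,000-byte PDF.
print(kb_per_page(25_600_000, 100))
print(compare_to_typical(25_600_000, 100, "color"))
```

A result well below the range may mean the compression is too aggressive, and one well above it may mean gray backgrounds or more bits per pixel than the pages need. Either way, looking at the text at high zoom is still the real test.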

If the files are too large to send over the Internet,  you can always send CDs - see below.

DPI and File Formats

Manuals that Use Color

Dots Per Inch (DPI): 300 or 400
Bits Per Pixel: 24 or 8 Color

Most manuals are not "photographic" in their use of color, so if you are sending TIFFs, converting them to 8 bit can be a way to reduce their file size.

Manuals that Use B&W and Grays

Dots Per Inch (DPI): 300 or 400
Bits Per Pixel: 4 or 8 Grayscale

Pure B&W Manuals

Dots Per Inch (DPI): 300 or 400
Bits Per Pixel: 1, 4, or 8 Grayscale

Most pure black and white documents will scan fine at 1 bit B&W but they may look somewhat better if scanned at 4 bits or 8 bits per pixel. Since we are moving to larger and larger media, 4 and 8 bit scans are becoming preferred even for Pure B&W manuals.

Full Color Photographic Brochures

File Type: TIFF
Dots Per Inch (DPI): 150-400
Bits Per Pixel: 24 or 8 Color

Brochure pages with photographs are typically scanned at 24 bit per pixel color. These files come out quite large so try to save 24 bit for the pages that need it. Please experiment with "descreening" options in your scanning software as they can improve a brochure image while also reducing the file size.
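To see why bit depth and DPI matter so much for file size, here is a small Python sketch that estimates the uncompressed size of a scanned page. The function name and the 8.5 x 11 inch page are assumptions for illustration (many manuals are smaller), and real files will differ because of compression and file-format overhead.

```python
def uncompressed_bytes(width_in: float, height_in: float,
                       dpi: int, bits_per_pixel: int) -> int:
    """Estimate raw image size: pixels across x pixels down x bits per pixel."""
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    total_bits = width_px * height_px * bits_per_pixel
    return (total_bits + 7) // 8  # round up to whole bytes


# Hypothetical 8.5 x 11 inch page at 300 DPI in each bit depth.
for bpp in (24, 8, 4, 1):
    print(bpp, "bits per pixel:", uncompressed_bytes(8.5, 11, 300, bpp), "bytes")
```

Under these assumptions a 24 bit page is about 25 million bytes before compression, three times the 8 bit figure, which is why it pays to save 24 bit for the pages that need it.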

 

Internals and Service Manuals with schematics and/or very fine print

Dots Per Inch (DPI): 300-600
Bits Per Pixel: 1 bit B&W or more (4 bit or 8 bit gray) as needed

These documents can be challenging due to tiny text in flowcharts or schematics. Please make some test scans and scan at a resolution that makes the detailed text readable. The smallest text doesn't have to be beautiful - just readable. Please use the high resolution on the pages that really need it, and stick to 300 DPI for general text pages.

General

The above are guidelines that may change for the specific document. Please use your best judgment. If a manual has fine text that can't be read at 300 DPI, you may bump it up, but please don't send a 400 DPI scan of a brochure just to make a document number or other tiny detail clear. It's also fine to change the format within the document, for example, if only the cover is in color.

Every scanner is a bit different so some experimentation is generally helpful when you start out.

Please try not to take pictures of pages with a digital camera. The results are usually much worse than a scanner. However, if it's the only way, and it's an obscure document like a service manual, then a camera will do. If you must use a camera, try to find a way to hold the pages perfectly flat and hold the camera directly above the center of each page to avoid distortion (i.e., turning pages from rectangles into trapezoids). If using auto exposure, apply compensation to make the pages white. Auto exposure will want to make your pages gray.

Please scan one page at a time whenever possible and scan so the pages are upright in the images.

My standards are much higher for new color scans to replace B&W scans that we already have. For these replacement scans, I am looking for very nice looking pages. For manuals that we have no existing scans of, I'm willing to settle for just readable if that's all your scanner can produce.

Page Backgrounds

Try to achieve pages that are pure white. If your scanning software has settings for document and photo, make sure it is set for document. Use the scanner's contrast and brightness controls to get the background white. The contrast is more useful because it keeps the text black. If your scanning software has an automatic exposure option, it's usually best to use it once on a typical page, then adjust the contrast and brightness to make the page white and the text black - and then don't auto expose on the following pages.

Goal: Black text on white pages     Avoid: Gray pages
   

While either image looks OK on the screen, the gray background results in much larger files, looks bad when printed and wastes a lot of ink.
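For those who end up fixing backgrounds after scanning, the idea behind a contrast adjustment can be sketched in Python as a simple "levels" stretch on 8 bit grayscale values: everything at or below a chosen black point becomes pure black, everything at or above a chosen white point becomes pure white, and values in between are spread out linearly. This is only an illustration with a hypothetical function name and made-up pixel values. Scanner software and image editors may implement their contrast and brightness controls differently.

```python
def stretch_levels(pixels, black_point, white_point):
    """Map black_point..white_point linearly onto 0..255, clipping outside."""
    if not 0 <= black_point < white_point <= 255:
        raise ValueError("need 0 <= black_point < white_point <= 255")
    scale = 255 / (white_point - black_point)
    result = []
    for value in pixels:
        stretched = round((value - black_point) * scale)
        result.append(max(0, min(255, stretched)))  # clip to the 8 bit range
    return result


# Made-up samples: dark text near 30-40, grayish paper near 200-225.
samples = [30, 40, 200, 225]
print(stretch_levels(samples, black_point=40, white_point=200))
```

With these sample values the text pixels end up at 0 and the paper pixels at 255, which is the "black text on white pages" goal. Pushing the white point too low starts to wash out light graphics, so it is worth checking a few pages by eye.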

Keep Pages Flat

In many cases you will need to place small heavy objects on the back side of the pages near the edges to hold them flat to the scanner. Keeping the pages flat keeps them from turning brown or gray at the edges and avoids distortions from the page curling back from the scanner glass. It also saves ink when printing. Please also experiment with your contrast and brightness controls so that the page is white as far out to the edges as possible while keeping the text black and graphics realistically colored. There is a balancing act here since too much contrast can give the page unnatural colors. Keeping the page flat against the glass will minimize the amount of contrast that you need to add to keep the page white. With many spiral bound books, you can consider bending the wire on the end and unscrewing the binding from the pages. This makes high quality scanning much easier. Afterward, you screw the binding back into the stack of pages.

In the scan on the left, weights were placed on the page next to the spiral binding to keep it flat. The scan on the right was made without weights.

Goal: White pages edge to edge     Avoid: Dark edges from curling
   

If you have the time and the tools, please crop each page to eliminate the binding like you see on the right side of the images above.

Avoid JPEG Compression For Text

JPEG is great for photos but it creates distortion and noise for text. The following was created with a high JPEG compression setting for emphasis but many people are sensitive to far lower amounts of JPEG artifacts in text documents.

Goal: Clean Text     Avoid: JPEG File
   

Transmitting Files

Please use the incoming directory at ftp://ftp.hpmuseum.org/incoming

Please FTP scans to that directory rather than attaching them to emails. Very large emails have a tendency to get rejected. You can also use something like Dropbox if you have an account.

Many FTP clients don't understand URLs, so you connect to the server ftp.hpmuseum.org and then cd to the incoming directory. For example, here is a session with a simple text mode FTP like the ones that are included with Windows, Unix, Linux, OS X etc. We have already CD'd to the local directory with the pages before starting the ftp program:

ftp ftp.hpmuseum.org
Connected to hpmuseum.org.
220 hpmuseum.org NcFTPd Server (licensed copy) ready.
User (hpmuseum.org:(none)): anonymous
331 Guest login ok, send your complete e-mail address as password.
Password:
230-You are user #2 of 16 simultaneous users allowed.
ftp> cd incoming
250 "/incoming" is new cwd.
ftp> bin              ← important - sets binary mode
200 Type okay.
ftp> prompt           ← prevents ftp from asking you to send each file
Interactive mode Off .
ftp> mput *.tif
ftp> quit
221 Goodbye.

This directory is write-only so you will not be able to see the names of files stored there. Many GUI FTP clients always try to display the remote directory so they will typically display "Permission Denied" in the remote directory window. Just ignore this.
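If you would rather script the upload than type the session shown earlier, the same steps can be sketched with Python's standard ftplib module. This is a minimal sketch, not a tested recipe for this particular server: the function names, the local directory, and the e-mail address used as the anonymous password are placeholders. It never lists the remote directory, since the directory is write-only.

```python
from ftplib import FTP
from pathlib import Path


def upload_scans(ftp, local_dir, pattern="*.tif"):
    """Upload matching files in binary mode; return the names sent, sorted."""
    ftp.cwd("incoming")
    sent = []
    for path in sorted(Path(local_dir).glob(pattern)):
        with path.open("rb") as handle:
            # storbinary always transfers in binary mode (like "bin" above).
            ftp.storbinary(f"STOR {path.name}", handle)
        sent.append(path.name)
    return sent


def main():
    # Placeholder password: anonymous logins conventionally use your e-mail.
    with FTP("ftp.hpmuseum.org") as ftp:
        ftp.login("anonymous", "you@example.com")
        for name in upload_scans(ftp, "."):
            print("sent", name)
```

Calling main() from the directory that holds the scans would attempt the upload. Because upload_scans takes the connection as a parameter, it can be tried against a stand-in object before touching the real server.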

To prevent name collisions, please name your files using the model number, language, and page. For example, if you were uploading French hp10 manuals, please name them like: hp10_french_001.tif, hp10_french_002.tif etc. (Please include the leading zeros on the page numbers too so that the scans naturally sort in page order - this helps me a lot!) You can also place the files in a zip file if you prefer.
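The naming scheme above is easy to generate in a script. Here is a small Python sketch with a hypothetical helper name. The three-digit page width and the .tif extension follow the example names above and are only defaults.

```python
def scan_filename(model, language, page, ext="tif", width=3):
    """Build a name like hp10_french_001.tif with a zero-padded page number."""
    return f"{model}_{language}_{page:0{width}d}.{ext}"


# Zero padding makes a plain alphabetical sort match page order.
names = [scan_filename("hp10", "french", page) for page in (10, 2, 1)]
print(sorted(names))
```

Without the leading zeros, a page 10 file would sort ahead of page 2, which is exactly the problem the padding avoids. A manual with more than 999 pages would need a larger width.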

When they're uploaded, please send me an email and I'll fetch them.

In some cases it may be better to send me a CD/DVD/Flash drive.

Feel free to ask for clarifications on any of the above. Hopefully the above doesn't sound too scary or demanding. The more each person can get "right" before sending, the more time I have to get more done.
