The Museum of HP Calculators
HP Articles Forum
Mapping HP48 Text to Unicode
Posted by Chris Dreher on 16 Jan 2013, 1:15 a.m.
Here is a table for translating HP48 text characters into Unicode characters, which is what modern computers use for text. With it, HP48 developers can write software that transfers, displays, or edits HP48 characters (e.g., copying a file from an HP48 calculator to a computer) and reliably displays a corresponding character. By using this mapping table, we should be able to avoid the garbage data, bugs, and crashes that have been an issue for some PC/Mac/Linux-side HP48 software.
Further details and explanations in an easier-to-read format are available at
Mapping HP48 Text to Unicode
Most HP48 characters map directly to the Unicode character with the same value. This holds for the ranges 0x00 to 0x1E, 0x20 to 0x7E, and 0xA0 to 0xFF. For example, the HP48 character 'A' is 0x41 (65 in decimal) and in Unicode is 0041 (also 65 in decimal). However, the 34 special characters, 0x1F and 0x7F through 0x9F, should be translated using the table below:
Dec Hex Unicode Name
31 1F 2026 Ellipsis
127 7F 2592 Medium Shade
128 80 2220 Angle
129 81 0101 Latin Small Letter a with Macron
130 82 2207 Nabla
131 83 221A Square Root
132 84 222B Integral
133 85 03A3 Greek Capital Letter Sigma
134 86 25B6 Black Right-Pointing Triangle
135 87 03C0 Greek Small Letter Pi
136 88 2202 Partial Differential
137 89 2264 Less-Than or Equal To
138 8A 2265 Greater-Than or Equal To
139 8B 2260 Not Equal To
140 8C 03B1 Greek Small Letter Alpha
141 8D 2192 Rightwards Arrow
142 8E 2190 Leftwards Arrow
143 8F 2193 Downwards Arrow
144 90 2191 Upwards Arrow
145 91 03B3 Greek Small Letter Gamma
146 92 03B4 Greek Small Letter Delta
147 93 03B5 Greek Small Letter Epsilon
148 94 03B7 Greek Small Letter Eta
149 95 03B8 Greek Small Letter Theta
150 96 03BB Greek Small Letter Lamda
151 97 03C1 Greek Small Letter Rho
152 98 03C3 Greek Small Letter Sigma
153 99 03C4 Greek Small Letter Tau
154 9A 03C9 Greek Small Letter Omega
155 9B 0394 Greek Capital Letter Delta
156 9C 03A0 Greek Capital Letter Pi
157 9D 03A9 Greek Capital Letter Omega
158 9E 25A0 Black Square
159 9F 221E Infinity
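As a minimal sketch of how the table might be applied in code, the following Python translates raw HP48 bytes into a Unicode string. The names `HP48_SPECIAL` and `hp48_to_unicode` are made up for this illustration and do not come from any existing HP48 tool; the only data used is the table above plus the direct mapping for all other byte values.

```python
# Hypothetical helper: the 34 special HP48 characters from the table above.
# Keys are HP48 byte values, values are Unicode code points.
HP48_SPECIAL = {
    0x1F: 0x2026,
    0x7F: 0x2592, 0x80: 0x2220, 0x81: 0x0101, 0x82: 0x2207,
    0x83: 0x221A, 0x84: 0x222B, 0x85: 0x03A3, 0x86: 0x25B6,
    0x87: 0x03C0, 0x88: 0x2202, 0x89: 0x2264, 0x8A: 0x2265,
    0x8B: 0x2260, 0x8C: 0x03B1, 0x8D: 0x2192, 0x8E: 0x2190,
    0x8F: 0x2193, 0x90: 0x2191, 0x91: 0x03B3, 0x92: 0x03B4,
    0x93: 0x03B5, 0x94: 0x03B7, 0x95: 0x03B8, 0x96: 0x03BB,
    0x97: 0x03C1, 0x98: 0x03C3, 0x99: 0x03C4, 0x9A: 0x03C9,
    0x9B: 0x0394, 0x9C: 0x03A0, 0x9D: 0x03A9, 0x9E: 0x25A0,
    0x9F: 0x221E,
}


def hp48_to_unicode(data: bytes) -> str:
    """Translate HP48 bytes to a Unicode string.

    Bytes listed in HP48_SPECIAL use the table; every other byte
    maps to the Unicode code point with the same numeric value.
    """
    return "".join(chr(HP48_SPECIAL.get(b, b)) for b in data)


if __name__ == "__main__":
    # 0x87 is pi on the HP48; 'A' passes through unchanged.
    print(hp48_to_unicode(bytes([0x41, 0x87])))
```

A dictionary lookup with a pass-through default keeps the direct ranges and the special cases in one expression, which is one simple way to avoid forgetting a range.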
If you are using UTF-8, then each Unicode character must be encoded into the appropriate 1-, 2-, or 3-byte sequence.
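To illustrate the 1-, 2-, and 3-byte cases, here is a small Python sketch that shows the UTF-8 bytes for a few of the mapped characters. The helper name `utf8_hex` is an assumption made for this example; the encoding itself is done by Python's built-in `str.encode`.

```python
def utf8_hex(ch: str) -> str:
    """Return the UTF-8 encoding of ch as space-separated hex bytes."""
    return " ".join(f"{b:02X}" for b in ch.encode("utf-8"))


if __name__ == "__main__":
    samples = [
        ("A (HP48 0x41)", "\u0041"),            # 1 byte:  41
        ("e-acute (HP48 0xE9)", "\u00e9"),      # 2 bytes: C3 A9
        ("a-macron (HP48 0x81)", "\u0101"),     # 2 bytes: C4 81
        ("pi (HP48 0x87)", "\u03c0"),           # 2 bytes: CF 80
        ("angle (HP48 0x80)", "\u2220"),        # 3 bytes: E2 88 A0
    ]
    for label, ch in samples:
        print(f"{label}: U+{ord(ch):04X} -> {utf8_hex(ch)}")
```

Note that even the directly mapped characters in the range 0xA0 to 0xFF take two bytes in UTF-8, so HP48 bytes cannot simply be copied through unchanged.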
In some cases, the choice of what Unicode character to use was trivial. Some of the below were not.
- Character 0x80 (angle)
- Instead of using 2220 for character 0x80, others have incorrectly used 221F. 221F is the Right Angle character and is not intended for a generic angle. Also, it does not visually match the HP48.
- While 2221 is visually an even better match, this character often does not render properly on various computer platforms and software. In short, some users will just see empty boxes.
- Character 0x81 (x-bar)
- In theory, Unicode allows two characters to be visually combined if the 2nd character is a "combining character". This would allow for the display of an x with a "combining macron" character, which would be 0078 followed by 0304. However, there are two problems with this.
- The combined pair often renders poorly or not at all, which leaves the user confused.
For additional examples of how x-bar is rendered inconsistently depending on the font, see http://www.kreativekorp.com/charset/encoding.php?file=hp-48.kte&char=81.
- Using two characters to represent one HP48 character breaks the pattern of having a simple one-to-one mapping. Some HP48 developers will likely have bugs in their code when converting back from Unicode to HP48 characters.
- Instead, a-bar (0101) is used. It is a single Unicode character, so it is easy for HP48 developers to deal with, leading to fewer bugs. Also, x-bar is used in statistics as the notation for average, and the 'a' in a-bar looks like an 'a' for average.
- Character 0x82 (nabla)
- The character 2207 was chosen over other triangles since this is the Nabla character, which is used in mathematics. Details can be read at http://en.wikipedia.org/wiki/Nabla_symbol.
- Characters 0x8D through 0x90 (arrows)
- In Unicode, there are a large number of characters that represent arrows. However, 2190 through 2193 were chosen because these are just simple arrow characters and don't carry any additional implied meaning. Also, this set of arrow characters supports all four directions, whereas some of the other sets do not. Lastly, some of the alternative arrow characters are not rendered consistently on some computing platforms.
- Characters 0x85, 0x8C, 0x9B, 0x9C, 0x9D (various Greek symbols)
- These are Greek symbols that could have alternatively been represented by various mathematical or electrical Unicode characters. However there are several reasons for preferring the Greek symbols:
- We can gain insight into the original HP48 developers' intentions by looking at how they translated these characters when using ASCII transfer mode 2 or 3 over a serial link. These characters were translated into \GS, \Ga, \GD, \PI, and \GW respectively. If we assume that "G" stands for Greek, then we can assume these translations mean Greek Capital Sigma, Greek lowercase alpha, Greek Capital Delta, Capital Pi, and Greek Capital Omega (a lowercase omega looks like a 'w'). This pattern holds for all the other translated Greek letters as well, except for \pi, which is clearly lowercase pi.
- Using all Greek symbols results in a visually clean look. In contrast, when symbols from math, electronics, and Greek are mixed together, they often look sloppy because they don't line up and have different line weights and drawing styles.
- Character 0x9E (box)
- Instead of using 25A0, the Black Square, others have incorrectly used 25AC, which is the Black Rectangle. This visually does not match.
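The point about keeping a one-to-one mapping can be sketched in Python. Because every HP48 character maps to exactly one Unicode code point, the reverse table is just the inverted forward table, and a full round trip over all 256 byte values can be checked. The names below (`HP48_TO_UNICODE`, `unicode_to_hp48`, and so on) are hypothetical and exist only for this illustration; raising `ValueError` for unmapped characters is one possible policy, not a prescribed one.

```python
# Code points for HP48 bytes 0x7F..0x9F, in order, copied from the table above.
_CODES_7F_9F = [
    0x2592, 0x2220, 0x0101, 0x2207, 0x221A, 0x222B, 0x03A3, 0x25B6,
    0x03C0, 0x2202, 0x2264, 0x2265, 0x2260, 0x03B1, 0x2192, 0x2190,
    0x2193, 0x2191, 0x03B3, 0x03B4, 0x03B5, 0x03B7, 0x03B8, 0x03BB,
    0x03C1, 0x03C3, 0x03C4, 0x03C9, 0x0394, 0x03A0, 0x03A9, 0x25A0,
    0x221E,
]

HP48_TO_UNICODE = {0x1F: 0x2026}
HP48_TO_UNICODE.update({0x7F + i: cp for i, cp in enumerate(_CODES_7F_9F)})

# One-to-one mapping: inverting the table loses no entries.
UNICODE_TO_HP48 = {cp: b for b, cp in HP48_TO_UNICODE.items()}
assert len(UNICODE_TO_HP48) == len(HP48_TO_UNICODE) == 34


def hp48_to_unicode(data: bytes) -> str:
    """Forward direction: special bytes use the table, others pass through."""
    return "".join(chr(HP48_TO_UNICODE.get(b, b)) for b in data)


def unicode_to_hp48(text: str) -> bytes:
    """Reverse direction; raises ValueError for characters with no HP48 byte."""
    out = bytearray()
    for ch in text:
        cp = ord(ch)
        if cp in UNICODE_TO_HP48:
            out.append(UNICODE_TO_HP48[cp])
        elif cp < 0x100 and cp not in HP48_TO_UNICODE:
            # Directly mapped ranges: 0x00-0x1E, 0x20-0x7E, 0xA0-0xFF.
            out.append(cp)
        else:
            raise ValueError(f"U+{cp:04X} has no HP48 equivalent")
    return bytes(out)


if __name__ == "__main__":
    all_bytes = bytes(range(256))
    # Every HP48 byte survives the trip to Unicode and back.
    print(unicode_to_hp48(hp48_to_unicode(all_bytes)) == all_bytes)
```

Under this sketch, a two-character sequence such as 0078 followed by 0304 would be rejected at the combining macron, which shows why a combining sequence for x-bar would need special-case handling on the way back.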
Unicode Standard: http://unicode.org/
Unicode Character Name Index: http://www.unicode.org/charts/charindex.html
UTF-8 summary: http://en.wikipedia.org/wiki/Utf-8
Edited: 16 Jan 2013, 1:16 a.m.