New HP-12C Review

The Museum of HP Calculators

HP Forum Archive 19

New HP-12C Review
Message #1 Posted by Katie Wasserman on 27 Apr 2009, 6:47 p.m.

Thanks entirely to Charlie Oxford's efforts I now have a new HP-12C to play with.

If you like the original 12C you will love this machine. It's functionally identical to the original (we know this because it's emulating the original), but everything runs 60 (sixty) times faster: a long program (my 70 decimal digits of pi) ran in 90 seconds vs 90 minutes on the original 12C; a long amortization took 1.5 seconds vs 90 seconds; and 69 factorial gives you an answer almost before you hit the key. It's ludicrously fast! I hope that this blazing speed does not scare the set-in-their-ways Wall Street crowd away from believing the calculation results.
Build quality, key clicks and display readability are excellent. Most importantly, no missed keystrokes. The keys themselves use the new, lower density plastic compared to the original, but this is a very minor nit to pick.
The only things that are functionally different are the self tests. The manual has not been updated in this section, so it's wrong in most cases:
- [ON] + [/] runs the sequential key press test but shows 1, 2 or 3 segments at a time and does not give "error 9" if you press the keys in the wrong order. When the test ends it doesn't show the "12" on the display; it just returns you to the X register.
- [ON] + [x] runs a self test (I think) for one second and just returns to the X register.
- [ON] + [-] will clear the calculator and show "Pr Error".
- [ON] + [+] will run a continuous self test (I think), ending when you press and hold any key.
Now for the new stuff that I found from playing around:
- [ON] + [g] shows the curious display below but I haven't found anything that you can do with this yet:

- [ON] + [g] + [ENTER] starts a testing menu:

1.L - LCD test -- this turns on all the segments. If you then press [Rv] it will turn off half of them, press [Rv] again and you'll toggle the segments on to off and vice versa
2.C - Copyright. First you'll see this:

the next key press will give you this:

then next you'll see this:

3.H - Extended LCD and key test. All segments will turn on at first. Hit any key and it will turn off 1 to 3 segments that sort of map to the position of the keys (rows and columns).
There are probably other easter eggs in here too, but I have yet to find them.
So, how do you find one of these in the store? You can see the bottom edge of the calculator through the packaging, here's what it looks like:

Here's a picture of the back. See how big that battery cover is?

And here's what it looks like under the door, I need to get one of those special SDK cables from HP so I can start playing around with some alternative firmware.

(I'm tempted to pull off the feet and peek at the circuit board, but I'm going to wait for a bit longer.)
I think they did a great job on this. Proven technology and ergonomics updated with modern speed and firmware replacement/updatability. I can't wait for the modern 15C (and, dare I suggest, 16C). Way to go HP! Kudos to Eric, Cyrille, Sam, Gene and whoever else had a hand in this.

Edited: 28 Apr 2009, 1:47 a.m. after one or more responses were posted

Re: New HP-12C Review
Message #2 Posted by Paul Dale on 27 Apr 2009, 7:24 p.m.,
in response to message #1 by Katie Wasserman

I think I'm going to have to keep an eye open for this one.
- Pauli

Re: New HP-12C Review
Message #3 Posted by Scott Newell on 27 Apr 2009, 7:32 p.m.,
in response to message #1 by Katie Wasserman

Thanks for the pics of the battery door. It confirms what I've been looking for.

Re: New HP-12C Review
Message #4 Posted by Egan Ford on 27 Apr 2009, 8:03 p.m.,
in response to message #1 by Katie Wasserman

Thanks. This will be the 3rd post-48GX HP I purchase (35s and 50g being the other two).
Has anyone started a wiki or blog on reprogramming it?

Re: New HP-12C Review
Message #5 Posted by hpnut on 27 Apr 2009, 8:45 p.m.,
in response to message #4 by Egan Ford

I see another stick-on serial number. sigh, is it too much to ask for manufacturer-engraved SN? ;-)

Re: New HP-12C Review
Message #6 Posted by Bruce Bergman on 27 Apr 2009, 11:16 p.m.,
in response to message #4 by Egan Ford

I will create an entry on my wiki for it (the same place as the 20b repurposing project). Feel free to contribute any and all information to it.
Katie, can I abscond with your findings to post them, or would you be willing to post them on the wiki? There's some great material there that I don't want to lose...
URL: http://www.wiki4hp.com
thanks, bruce

Re: New HP-12C Review
Message #7 Posted by Katie Wasserman on 28 Apr 2009, 12:08 a.m.,
in response to message #6 by Bruce Bergman

Bruce,

Quote:
can I abscond with your findings
Abscond away! I'll keep working at it to see if I can come up with anything else.
Thanks for the wiki and all your work on the 20b,
-Katie

Re: New HP-12C Review
Message #8 Posted by hpnut on 27 Apr 2009, 8:47 p.m.,
in response to message #1 by Katie Wasserman

Katie,
any pictures of the calculator from the front? It appears the classic 12C gold trim is back.

Re: New HP-12C Review
Message #9 Posted by Katie Wasserman on 27 Apr 2009, 9:07 p.m.,
in response to message #8 by hpnut

I just added one to the top of the review.

Re: New HP-12C Review
Message #10 Posted by Seth Morabito on 27 Apr 2009, 9:04 p.m.,
in response to message #1 by Katie Wasserman

It's a pity that the serial number label is so shoddy looking -- but, I'm picking the tiniest of nits here. I'm really pleased with what you've described! I think I'll have to get one. Believe it or not, I don't own a 12c in any incarnation yet!

Re: New HP-12C Review
Message #11 Posted by tony (nz) on 27 Apr 2009, 10:41 p.m.,
in response to message #1 by Katie Wasserman

Brilliant review, thanks Katie. I have one with serial number CNA 849... and the LCD screen has the most "yellowy" tinge I have seen in a 12C. Still usable though. The "ON" key action is more direct than on the old 12C: on the new one the screen "lights up" even when I press and hold the "ON" key down. The old one only turns on when the "ON" key is released. This makes the self test procedure here a little different, more like [/] + [ON] rather than the old [ON]+[/] :-) I managed all your tests - great fun - oh, except for the last one: I couldn't escape from the menus ;-)
Cheers, Tony


Re: New HP-12C Review
Message #12 Posted by Katie Wasserman on 28 Apr 2009, 10:19 a.m.,
in response to message #11 by tony (nz)

Tony,
On that last test you need to press all the keys down at least once to blank the display. Once the whole display is blanked you can exit the test. A hardware reset (under the battery door) is the only other way to quit that I found.
Also, when a program starts with [f][CLEAR sigma] the display is not blanked when running like it is on the original 12C.
-Katie

Re: New HP-12C Review
Message #13 Posted by tony(nz) on 28 Apr 2009, 4:00 p.m.,
in response to message #12 by Katie Wasserman

Thanks Katie - I did at least test that running after clearsigma still showed "running". But I never thought to test the tests ;-)

Re: New HP-12C Review
Message #14 Posted by Walter Lam on 28 Apr 2009, 2:22 a.m.,
in response to message #1 by Katie Wasserman

Is there any change in the packaging of this new 12C?
How can one distinguish the new 12C from the old one?

Re: New HP-12C Review
Message #15 Posted by Martin Pinckney on 28 Apr 2009, 7:36 a.m.,
in response to message #14 by Walter Lam

Go back to Katie's original post. She describes how to identify the new version by looking through the packaging.

Re: New HP-12C Review
Message #16 Posted by Michael de Estrada on 28 Apr 2009, 11:16 a.m.,
in response to message #1 by Katie Wasserman

Reading through the Hewlett-Packard Digest, Volume Eight, 1981, there is an article entitled "Quality By Design," which expounds on the key reliability of the Voyager keyboard design. Metal key operators and a gold-plated circuit board ensure low-resistance contact points for precise actuation and durability, resistant to wear and corrosion. Indeed, the many original Voyagers still in use today attest to this quality. I have a 1982 HP-15C and 1987 HP-12C with keyboards that still work perfectly.
So, I have to wonder how much of this sort of quality has carried over to the newest incarnation of the HP-12C. The cheesy serial number sticker does not concern me as much as any compromises in quality that may have been made under the skin. I suppose it is unreasonable to expect the modern pricepoint calculators to have 25+ year lifespans, but I still hope that HP has set the bar high on this product with regards to quality.

Re: New HP-12C Review
Message #17 Posted by Martin Pinckney on 28 Apr 2009, 3:17 p.m.,
in response to message #16 by Michael de Estrada

Quote:
I suppose it is unreasonable to expect the modern pricepoint calculators to have 25+ year lifespans, but I still hope that HP has set the bar high on this product with regards to quality.
I would assume the quality would be on a par with the existing 12c or 12c platinum, if anyone has any experience with these.

Re: New HP-12C Review
Message #18 Posted by Don Shepherd on 28 Apr 2009, 4:12 p.m.,
in response to message #17 by Martin Pinckney

Martin, I would say the quality of the new unit is the same as that of a 12c I bought 8 years ago, and a 12cp 25th anniversary edition I bought about 3 years ago. They are all made in China, but I have never had any problems with any of those units. The first platinums had a problem with keystroke programs that were more than about 250 lines, as I recall, and I had one of those. But the new unit is blazingly fast. Like Katie said, a program I had that took maybe 2 minutes to run on the old 12c takes less than 2 seconds on this one. That ARM processor does make a huge difference!

Re: New HP-12C Review
Message #19 Posted by BruceH on 28 Apr 2009, 7:46 p.m.,
in response to message #16 by Michael de Estrada

Quote:
So, I have to wonder how much of this sort of quality has carried over to the newest incarnation of the HP-12C.
Michael, IMHO the current 12C keyboard is the best of any of the current models. In fact, I really don't understand why HP don't use this key mechanism with the 20B, for example.

Re: New HP-12C Review
Message #20 Posted by hpnut on 28 Apr 2009, 11:41 a.m.,
in response to message #1 by Katie Wasserman

I looked very hard and couldn't find an = key. Is this a pure RPN machine? If so, yahoo!!

Re: New HP-12C Review
Message #21 Posted by Don Shepherd on 28 Apr 2009, 12:27 p.m.,
in response to message #20 by hpnut

Yep, pure RPN, just like the original 12c.

New 12c loop count speed test
Message #22 Posted by Gene Wright on 28 Apr 2009, 6:23 p.m.,
in response to message #21 by Don Shepherd

Program of
01 +
02 GTO 01
with the stack filled with 0 in X and 1 in Y, Z and T
counts to well over 45,000 in 60 seconds.

Re: New 12c loop count speed test
Message #23 Posted by Katie Wasserman on 28 Apr 2009, 10:07 p.m.,
in response to message #22 by Gene Wright

Gene,
You have a hyper-speed 12C! I only get to about 30,000 in 60 seconds. The factor of 1.5 difference agrees with your statement several months ago that the new 12C runs 90 times faster, I (and Don too) found that it's "only" 60 times faster.
How do I get to hyper-speed? :)
-Katie


Re: New 12c loop count speed test
Message #24 Posted by DaveJ on 28 Apr 2009, 10:35 p.m.,
in response to message #23 by Katie Wasserman

Quote:
Gene,
You have a hyper-speed 12C! I only get to about 30,000 in 60 seconds. The factor of 1.5 difference agrees with your statement several months ago that the new 12C runs 90 times faster, I (and Don too) found that it's "only" 60 times faster.
How do I get to hyper-speed? :) -Katie
Don't get too excited about that speed, it comes at the expense of battery life efficiency.
See here for why:
http://www.alternatezone.com/eevblog/?p=32
I bet they are running this sucker at 30MHz just like the 20B. And if they are, continuous processing would drain the batteries in way under 30 hours. Anyone want to put their unit into a continuous loop and see how long it actually lasts?
Dave.

Re: New 12c loop count speed test
Message #25 Posted by db (martinez, ca.) on 29 Apr 2009, 12:24 a.m.,
in response to message #24 by DaveJ

that's sad news. or maybe it's not sad news. i don't speak ausie ;-) but you seem to be saying that if someone(s) write new operating systems for the 12c and 20b; they can drastically lower the power consumption in both by changing the clock speed. if i get this correctly; the worst that the savings can be is 10X, but since the calc will sit idle most of the time while we write and think, and if we choose a standby speed of less than your 3 meg; those batteries can last a very long time indeed. did hp do this with either of the new units as shipped?

Re: New 12c loop count speed test
Message #26 Posted by DaveJ on 29 Apr 2009, 1:51 a.m.,
in response to message #25 by db (martinez, ca.)

Quote:
that's sad news. or maybe it's not sad news. i don't speak ausie ;-) but you seem to be saying that if someone(s) write new operating systems for the 12c and 20b; they can drastically lower the power consumption in both by changing the clock speed.
You can reduce the losses in the battery resistance by running at a slower speed, yes. This will give greater battery life at the expense of calculation speed.
A big speed reduction will almost certainly have no visible speed impact on normal calculations. It's only looping program calculations where the speed matters, but even then 30MHz seems crazy. My uWatch runs at 250KHz and does C floating point calculations all but instantly. In fact it's practically instant running at 32KHz.

Quote:
if i get this correctly; the worst that the savings can be is 10X, but since the calc will sit idle most of the time while we write and think, and if we choose a standby speed of less than your 3 meg; those batteries can last a very long time indeed. did hp do this with either of the new units as shipped?
The 20B runs at 30MHz only when doing calculations, then sits idle drawing almost nothing at a slow speed. So it's doing it properly except for the fact that they chose the top speed of 30MHz, and it peaks like this for *every* calculation regardless if it needs it or not! This is very poor low power calc design IMHO.
I can hardly imagine a program running on such a calc that would warrant a 30MHz clock rate on a 32bit ARM processor.
Every time you do a calc you are gulping a quick 15mA from those poor little CR2032 batteries with their high output resistance, it makes me want to cry!
I don't know about the 12C, I'm just assuming it's the same as the 20B.
Dave.
Edited: 29 Apr 2009, 2:03 a.m.

Re: New 12c loop count speed test
Message #27 Posted by Eric Smith on 29 Apr 2009, 4:14 a.m.,
in response to message #26 by DaveJ

Quote:
My uWatch runs at 250KHz and does C floating point calculations all but instantly. In fact it's practically instant running at 32KHz.
uWatch is running native math code, and the ARM-based 12C is not. Cyrille's put in a lot of optimizations [*], but at 250 kHz it would most likely be slower than the original 12C.
I agree, though, that 30 MHz is absurd and wastes battery life.
Eric
[*] I proposed some optimizations to the BCD math, and I know Cyrille experimented with them, but I'm not sure whether he put them in the production code. Aside from that, I know he designed his emulation code from the ground up to be very efficient.

Re: New 12c loop count speed test
Message #28 Posted by Katie Wasserman on 29 Apr 2009, 4:51 a.m.,
in response to message #27 by Eric Smith

I just measured the current draw on the new 12C with good equipment. You need two power supplies for this as the batteries are in parallel with a common positive contact. Here are my findings:
power off: 4uA
power on, static display, no keystrokes: 45uA
continuous keystrokes (number entry): 1mA
long amortization function: 15mA
tight program loop: 15mA
continuous self-test ([ON]+[+]): 4mA
test modes ([ON]+[g] and [ON]+[g]+[ENTER]) , static display : 1.8mA
So it beats up on those CR2032's but only to give you the fast speed. I think this is justified however since you really do want the amortization results as fast as possible. Other functions run so fast that stress to the batteries is minimal. For most practical user programs on the 12C the same is true.
Given the higher current draw in the new modes, my guess is that the boot loader on the Atmel chip is running. You probably need to be in this mode to talk to the serial port. Perhaps that's the purpose of [ON]+[g], just to put you in that mode and show the status of the CPU.
-Katie
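
To put the measured currents in perspective, here is a rough sketch in C that converts a constant current draw into an idealized run time. The 220 mAh capacity per CR2032 is an assumed nominal figure (it is not stated anywhere in this thread), the function name is hypothetical, and the estimate ignores internal resistance and the capacity loss these cells suffer at high drain, so real run times at 15mA would likely be shorter.

```c
#include <assert.h>

/* ASSUMPTION: nominal capacity of one CR2032 cell, in mAh.
   This number is not from the thread; check the cell's datasheet. */
#define CR2032_CAPACITY_MAH 220.0

/* The measurements describe two cells wired in parallel. */
#define CELLS_IN_PARALLEL 2.0

/* Idealized run time in hours at a constant current draw (in mA).
   Ignores internal resistance and capacity derating at high current. */
double runtime_hours(double current_ma)
{
    assert(current_ma > 0.0);
    return CR2032_CAPACITY_MAH * CELLS_IN_PARALLEL / current_ma;
}
```

Under these assumptions, `runtime_hours(15.0)` comes out a little under 30 hours for a tight program loop, while `runtime_hours(0.045)` gives roughly 9,800 hours (over a year) with the display on and no keystrokes.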

Edited: 29 Apr 2009, 12:56 p.m. after one or more responses were posted

Re: New 12c loop count speed test
Message #29 Posted by DaveJ on 29 Apr 2009, 5:09 a.m.,
in response to message #28 by Katie Wasserman

Quote:
I just measured the current draw on the new 12C with good equipment. You need two power supplies for this as the batteries are in parallel with a common positive contact. Here are my findings:
power off: 4uA
power on, static display, no keystrokes: 45uA
continuous keystrokes: 1mA
long amortization function: 15mA
tight program loop: 15mA
continuous self-test ([ON]+[+]): 4mA
test modes ([ON]+[g] and [ON]+[g]+[ENTER]) , static display : 1.8mA
So it beats up on those CR2032's but only to give you the fast speed. I think this is justified however since you really do want the amortization results as fast as possible. Other functions run so fast that stress to the batteries is minimal. For most practical user programs on the 12C the same is true.
Given the higher current draw in the new modes, my guess is that the boot loader on the Atmel chip is running. You probably need to be in this mode to talk to the serial port. Perhaps that's the purpose of [ON]+[g], just to put you in that mode and show the status of the CPU.
-Katie
Thanks for the measurements. But are you SURE it doesn't actually take 15mA spikes during normal calculations?
What equipment? What method? "Good equipment" doesn't mean anything if your method has limitations (e.g. you are only reading the average with a multimeter). Sorry to be pedantic, but it's easy to get false measurements on pulse current readings like this.
Obviously the processor is working at 30MHz during program execution as expected, so your 15mA figures are spot on. My bet is it also peaks at 15mA doing a simple addition.

Quote:
Other functions run so fast that stress to the batteries is minimal.
Sorry, you can't beat Ohm's law. The battery losses remain the same as I pointed out in my video, regardless of how "quick" the pulse is.
Dave.
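
The Ohm's law argument can be sketched with a simple battery model: an ideal source in series with an internal resistance. This is an illustration only. The function names are hypothetical, and any resistance value passed in is an assumption, since no figure for the CR2032's internal resistance is given in this thread.

```c
#include <assert.h>

/* Fraction of the energy leaving the cell chemistry that is burned in the
   internal resistance: (I^2 * R) / (E * I) = I * R / E.
   Pulse duration does not appear in the result. */
double loss_fraction(double current_a, double r_internal_ohm, double emf_v)
{
    assert(emf_v > 0.0);
    return current_a * r_internal_ohm / emf_v;
}

/* Energy (joules) dissipated inside the battery during one pulse. */
double loss_joules(double current_a, double r_internal_ohm, double seconds)
{
    return current_a * current_a * r_internal_ohm * seconds;
}
```

With an assumed 10 ohm internal resistance and a 3V cell, `loss_fraction(0.015, 10.0, 3.0)` is about 5% whether the pulse lasts a millisecond or a minute. Delivering the same charge ten times more slowly (1.5mA for ten times as long) dissipates one tenth of the energy inside the battery, which is the case for a lower clock speed.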
Edited: 29 Apr 2009, 5:27 a.m.

Re: New 12c loop count speed test
Message #30 Posted by Katie Wasserman on 29 Apr 2009, 12:24 p.m.,
in response to message #29 by DaveJ

I was using a Fluke 867b, measuring the current draw from the common "+" supply line from the batteries to the calc (burden voltage drop is minimal on the 10 amp range). Yes, it does draw 15mA peak on every function. I just rechecked this using an HP 34401A and got the same readings. They also seem to agree with what Cyrille posted.


Re: New 12c loop count speed test
Message #31 Posted by cyrille de Brébisson on 29 Apr 2009, 10:07 a.m.,
in response to message #28 by Katie Wasserman

hello,

Quote:
power off: 4uA
power on, static display, no keystrokes: 45uA
continuous keystrokes: 1mA
long amortization function: 15mA
tight program loop: 15mA
continuous self-test ([ON]+[+]): 4mA
test modes ([ON]+[g] and [ON]+[g]+[ENTER]), static display: 1.8mA
the basic figures for the ARM are:
power off: 4µA
running at 2MHz (internal oscillator): ~1.5mA
running at 30MHz: 15mA
LCD on: 45µA on the 12C (no charge pump), 150µA on the 20b (charge pump)
cyrille

Re: New 12c loop count speed test
Message #32 Posted by DaveJ on 29 Apr 2009, 4:58 a.m.,
in response to message #27 by Eric Smith

Quote:
uWatch is running native math code, and the ARM-based 12C is not. Cyrille's put in a lot of optimizations [*], but at 250 kHz it would most likely be slower than the original 12C.
Yes, the 12C is a different beast because it's running an emulator which has much more overhead. The 20B on the other hand...
If you assume the new 12C works at 30MHz, and it's been measured as 60 times faster, then it's obvious it only needs to run at 500K to emulate the original 12C speed (sounds about right to me). Speed improvement is nice, so a nice round 5 or 10 times improvement would have been sufficient for marketing, giving only a few MHz operation which would be very sensible. Or better yet, if possible, make it smart - so for normal calcs keep it running at a low 500KHz, and only switch to high clock speed when it's running a program or something.
Or even better still, give the user a speed option, it's only a few lines of code. By default, make it slow for maximum battery life, and those who need super speed can select it if needed.
Dave.
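
The clock arithmetic above can be written out as a small C sketch. It assumes that emulation speed scales linearly with the CPU clock and that the new 12C really runs at 30MHz; both are assumptions made in this thread, not confirmed figures, and the function name is hypothetical.

```c
#include <assert.h>

/* ASSUMPTION: emulation speed scales linearly with CPU clock.
   Given a clock at which a known speedup over the original 12C was
   measured, estimate the clock needed for some other target speedup. */
double clock_hz_for_speedup(double measured_clock_hz,
                            double measured_speedup,
                            double target_speedup)
{
    assert(measured_speedup > 0.0 && target_speedup > 0.0);
    return measured_clock_hz / measured_speedup * target_speedup;
}
```

With 30MHz giving a 60x speedup, `clock_hz_for_speedup(30e6, 60.0, 1.0)` is 500kHz to match the original 12C, and a 5x or 10x improvement would need about 2.5MHz or 5MHz respectively.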

Re: New 12c loop count speed test
Message #33 Posted by Katie Wasserman on 29 Apr 2009, 12:30 p.m.,
in response to message #32 by DaveJ

I like the user-specified speed setting with a default of a few MHz. Most users would never read the manual to know how to change it but would experience a 10x speedup over the original 12C and very long battery life. Geeks would push it to the limit but have plenty of spare batteries around.
Still, at 30MHz when functions are run -- even with the bad battery losses -- given typical usage patterns I think that most users will experience several years of usage on one set of batteries.


Re: New 12c loop count speed test
Message #34 Posted by DaveJ on 29 Apr 2009, 5:37 p.m.,
in response to message #33 by Katie Wasserman

Quote:
Still, at 30MHz when functions are run -- even with the bad battery losses -- given typical usage patterns I think that most users will experience several years of usage on one set of batteries.
The 20B is rated at "an average of 9 months" battery life, so the 12C should be an identical spec.
Dave.

Edited: 29 Apr 2009, 6:02 p.m.

Re: New 12c loop count speed test
Message #35 Posted by cyrille de Brébisson on 30 Apr 2009, 10:26 a.m.,
in response to message #34 by DaveJ

hello

Quote:
The 20B is rated at "an average of 9 months" battery life, so the 12C should be an identical spec.
actually, no. the 20b in idle mode (screen ON, not 'working') uses 150µA versus 45 or so for the 12C. so there will be a difference in battery life.
it also takes more keys to do something on average with the 20b, so there is more overhead there...
cyrille

Re: New 12c loop count speed test
Message #36 Posted by cyrille de Brébisson on 29 Apr 2009, 10:05 a.m.,
in response to message #27 by Eric Smith

Hello,

Quote:
[*] I proposed some optimizations to the BCD math, and I know Cyrille experimented with them, but I'm not sure whether he put them in the production code. Aside from that, I know he designed his emulation code from the ground up to be very efficient.
Yep, the emulator is quite efficient. As for BCD calculations, I further optimized the code that you proposed in 32 bit ARM assembly, using the full power of 3 operations per instruction offered by the ARM (shift + operation + carry detection), allowing me to do a 64 bit BCD add in 18 instructions.
cyrille

Re: New 12c loop count speed test
Message #37 Posted by Eric Smith on 29 Apr 2009, 5:01 p.m.,
in response to message #36 by cyrille de Brébisson

Of course, for Nut emulation you don't NEED 64-bit BCD operations. Maybe you've incorporated the routines into firmware for other calculators that do need 64-bit BCD operations?

Just imagine trying to produce a product for THIS group of fanatics . . .
Message #38 Posted by Paul Brogger on 30 Apr 2009, 1:08 a.m.,
in response to message #37 by Eric Smith

My hat is off to Cyrille!

Re: New 12c loop count speed test
Message #39 Posted by cyrille de Brébisson on 30 Apr 2009, 10:33 a.m.,
in response to message #37 by Eric Smith

hello

Quote:
Of course, for Nut emulation you don't NEED 64-bit BCD operations. Maybe you've incorporated the routines into firmware for other calculators that do need 64-bit BCD operations?
But my code works for 64 bits :-) thanks to assembly coding and the power of ARM assembly, adding the last nibble only required one extra instruction...
// Used to do an addition on 2 DCB numbers with a result in DCB.
// For example, dcbAdjust(a, b), when a and b are DCB representations
// of numbers, will return the DCB representation of the number r=a+b...
// Note that the addition is done prior to the call...
// int64 dcbAddAdjust(int64 r, int64 b); // r=a+b
//
// a += 0x0666666666666666ULL;      // preadjust as if carries occur
// u64 s = a + b;                   // compute the sum
// b = a ^ b ^ s;                   // find the carries
// b = ~b & 0x1111111111111111ULL;  // compute mask for non-carries
// return s - ((b >> 4)*6);         // subtract out 6 * non-carries

dcbAddAdjust:
    ldr r12, cte66666666    // load 66666666 in r12 to perform the a+6666666666666666

/* 10hex = 16decimal, so the difference between the BCD representation of 10 and the hex value of 10 is 6. Adding 6 to each digit of one of the 2 numbers corresponds to assuming that the addition will generate a carry for each digit, and preemptively adds the 6 to each digit.
This needs to be done so that we can easily detect which digit additions really had a carry. Then the algorithm will remove the extra 6 from the digits that had no carry.
The carry into each digit appears as an extra '1' added to the lowest bit of that digit. This means that the parity of each digit of the sum is equal to (parity of the digit in a) xor (parity of the digit in b) xor (carry from the previous digit).
This means that calculating a xor b xor (a+b) and only looking at the first bit of each digit will tell us for each digit if there is a carry or not. Note that the fact that we added 6 to each digit of a does not affect the calculation of parity for a, as 6 is an even number. */
    adds r0, r0, r12        // a+=6666666666666666
    adc  r1, r1, r12
    ldr  r12, cte88888888   // preload 8888888888888888; preloading it earlier than needed will reduce 1 wait state later on...

/* Step 2 of the algorithm. We now have pre-adjusted a, and we calculate the sum a+b. However, here we pay attention to keeping the carry out of that calculation in the CPU carry flag (this is why we use the adcS instruction, as in: add r1 and r3 and the current carry AND keep the carry out in the flag register), so that we can later remove the 6 from the last digit of the result if that carry is clear or, in our case, clear the bit in the bitfield representing the digits where we need to remove 6 for digit 16 if the carry is set. */
    adds r4, r0, r2         // s= a+b
    adcs r5, r1, r3         // keep carry!!!
    eor  r2, r0, r2         // b=a^b. This calculates the combined parity of each digit of a and b.
    eor  r3, r1, r3         // note that only the first bit of each digit has any interest. the others will be removed later
    eor  r2, r2, r4         // b=a^b^s. 'removing' the combined parity of each digit of a and b from the parity of the
    eor  r3, r3, r5         // sum of a and b gives the carry bit for each digit.

/* The next 3 instructions perform 3 64 bit operations at once (a normal non-ARM CPU would need 6 instructions to do so!), so please follow! We now have a set of 64 bits, 16 of which are of interest to us (the least significant bit of each digit) as they indicate the carry. So, we need to:
1: clear all the other bits (the b&1111111111111111 in the C code)
2: if the carry bit for digit 'n' is NOT set (i.e., no carry), it means that we need to remove 6 from digit n-1. So we need to invert each carry bit
3: shift our bitfield to get the bit indicating the carry caused by digit n located in the same region of the registers as digit n (for the moment, the bit for digit n is the least significant bit of digit n+1)
Thanks to the ARM CPU instruction set, we can do this in only 3 instructions.
- As far as 1 and 2 are concerned, we can use the bic instruction (Bit Clear), which performs an "a and ~b" operation, so we can combine them.
- A shift operation normally takes 3 operations, but if we decide to shift the bitfield by only 1 bit (which would place the carry for digit n in bit 3 of digit n), then we can use the shift ability of the ARM to perform in 1 operation the 88888888 and (~b shift 1) for the lower 32 bits of b, then use 1 instruction (the sub) to handle the carry for digit 7 (which is now held as the least significant bit of the register holding the upper 32 bits of b), and finish the work by handling the upper 32 bits of b. Because the shift is done 'on the fly', we now need to and b not by 11111..., but by 1111... shifted by 1, or 8888......
Note that if the ARM had 64 bit registers, we could do the whole thing in only 1 instruction. */
    bic r2, r12, r2, lsr #1 // (~b>>1)&0x8888888888888888
    sub r2, r2, r3, asl #31
    bic r3, r12, r3, lsr #1

/* Handles the carry that we lost from the register due to the 64 bit limitation during the addition of a+b, but that was kept in the flag register of the ARM CPU and never modified since! If the carry is SET (i.e., there is a carry on the last digit), then we do NOT need to remove 6 from the last digit, and the bit corresponding to the need to remove that 6 is cleared from the register. */
    subcs r3, r3, #0x80000000

/* The last step of the algorithm is to remove 6 (or remove 2 and remove 4, as 2+4=6) from each digit where there was no carry. For each digit, we have a bit (bit 3 to be precise) in variable b that is set if 6 needs to be removed from this digit. So, we need to remove b>>1 (corresponds to carry bit*4) and b>>2 (corresponds to carry bit*2) from the sum to get our result. Note that because we know that only 1 bit in each group of 4 bits is potentially set, there is no need to handle bit movement from the higher 32 bits to the lower 32 bits of b. If the ARM had 64 bit registers, this would be a non-issue. */
    subs r4, r4, r2, lsr #2 // s-(b>>2)*3 = s-b<<2-b<<3
    sbc  r5, r5, r3, lsr #2
    subs r0, r4, r2, lsr #1
    sbc  r1, r5, r3, lsr #1
END of Function

cte66666666: DC32 0x66666666
cte88888888: DC32 0x88888888
cte11111111: DC32 0x11111111
cyrille
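
For readers who don't speak ARM assembly, the C pseudo-code in the header comment can be turned into a compilable function. This is a minimal sketch of that 15-digit variant only, not the production code: the name `bcd_add15` is made up for illustration, and the function assumes both inputs are valid packed BCD with the top nibble zero. The assembly above goes further and handles the 16th digit through the CPU carry flag.

```c
#include <assert.h>
#include <stdint.h>

/* Add two packed BCD numbers of up to 15 digits (top nibble must be 0).
   Follows the C pseudo-code from the comment block above. */
uint64_t bcd_add15(uint64_t a, uint64_t b)
{
    assert((a >> 60) == 0 && (b >> 60) == 0);

    a += 0x0666666666666666ULL;      /* pre-adjust as if every digit carries */
    uint64_t s = a + b;              /* plain binary sum */
    uint64_t carries = a ^ b ^ s;    /* bit 4k set = carry into digit k */
    uint64_t no_carry = ~carries & 0x1111111111111111ULL;

    /* Move each flag down to the digit that failed to carry,
       then take back the 6 that was added to that digit. */
    return s - (no_carry >> 4) * 6;
}
```

For example, `bcd_add15(0x19, 0x03)` returns `0x22` and `bcd_add15(0x999, 0x001)` returns `0x1000`.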

Re: New 12c loop count speed test
Message #40 Posted by Andrés C. Rodríguez (Argentina) on 30 Apr 2009, 6:21 p.m.,
in response to message #39 by cyrille de Brébisson

Cyrille: with some guessing (I apologize), I tried to put the code you shared in a format compatible with the MoHPC Forum, which at times is rather unfriendly with respect to long posts formatting. Please correct any mistake. Andrés

But my code works for 64 bits :-) thanks to assembly coding and the power of ARM assembly, adding the last nibble only required one extra instruction...
// Used to do an addition on two DCB numbers with a result in DCB
// for example, dcbAdjust(a, b), when a and b are dcb representation
// of numbers, will return the dcb representation of the number r=a+b...
// Note that the addition is done prior to the call...
//
// int64 dcbAddAdjust(int64 r, int64 b); // r=a+b
// a += 0x0666666666666666ULL;      // preadjust as if carries occur
// u64 s = a + b;                   // compute the sum
// b = a ^ b ^ s;                   // find the carries
// b = ~b & 0x1111111111111111ULL;  // compute mask for non-carries
// return s - ((b >> 4)*6);         // subtract out 6 * non-carries
dcbAddAdjust:
ldr r12, cte66666666 // load 66666666 in r12 to perform the a+6666666666666666
/* 10hex = 16decimal, so the difference between the BCD representation of 10 and the hex value of 10 is 6. Adding 6 to each digit of one of the 2 numbers correspond to assuming that the addition will generate a carry for each digit and preemptively adds the 6 to each digit. This need to be done so that we can easily detect which digit addition really had a carry. Then the algorithm will remove the extra "6" from these digits. The carry for each digit does appear as an extra '1' added to the last bit of each digit. This means that the parity of each digit is equal to parity of digit in an exclusive-or parity of digit in b exclusive-or carry from previous digit. This means that calculating a xor b xor (a+b) and only looking at the first bit of each digit will tell us for each digit if there is a carry or not. Note that the fact that we added 6 to each digit of a does not affect the calculation of parity for a as 6 is an even number */
adds r0, r0, r12 // a+=6666666666666666
adc r1, r1, r12
ldr r12, cte88888888 // preload 888888888888; preloading it earlier than
                     // needed will reduce 1 wait state later on...
/* Step 2 of the algorithm. We now have pre adjusted a, we calculate the sum of a+b. however, here we pay attention to keeping the carry out of that calculation in the CPU carry flag (this is why we use the adcs instruction as in add r1 and r3 and the current carry, AND keep the carry out in the flag register), so that we can later remove the 6 from the last digit of the result if that carry is clear or, in our case, clear the bit in the bitfield representing the digits where we need to remove 6 for digit 16 if the carry is set.*/
adds r4, r0, r2 // s= a+b
adcs r5, r1, r3 // keep carry!!!
eor r2, r0, r2 // b=a^b.
/* This calculates the combined parity of each digits of a and b. */
eor r3, r1, r3 // Note that only the first bit of each digit has any interest.
               // The others will be removed later.
eor r2, r2, r4 // b=a^b^s. 'Removing' the combined parity of each
eor r3, r3, r5 // digit of a and b from the parity of the sum of a and b
               // gives the carry bit for each digit.
/* The next 3 instructions perform 3 64-bit operations at once (a normal non-ARM CPU would need 6 instructions to do so!), so please follow! We now have a set of 64 bits, 16 of them are of interest to us (the least significant bit of each digit) as it indicates the carry. So, we need to:
1. Clear all the other bits (the b&1111111111111111 in the C code)
2. If the carry bit for digit 'n' is NOT set (i.e., no carry), it means that we need to remove 6 from digit n-1. So we need to invert each carry bit
3. We need to shift our bitfield to get the bit indicating the carry caused by digit n located in the same region of the registers as digit n (for the moment, the bit for digit n is the least significant bit of digit n+1) */
/* Thanks to the ARM CPU instruction set, we can do this in only 3 instructions.
- As far as 1 and 2 are concerned, we can use the bic instruction (Bit Clear), which performs an "a and ~b" operation, so we can combine them.
- A shift operation normally takes 3 operations, but if we decide to shift the bitfield by only 1 bit (which would place the carry for digit n in bit 3 of digit n), then we can use the shift ability of the ARM to perform in 1 operation the 88888888 and (~b shift 1) for the lower 32 bits of b, then use 1 instruction (the sub) to handle the carry for digit 7 (which is now held as the least significant bit of the register holding the upper 32 bits of b), and finish the work by handling the upper 32 bits of b.
Because the shift is done 'on the fly', we now need to and b not by 11111..., but by 1111... shifted by 1, or 8888......
Note that if the ARM had 64 bit registers, we could do the whole thing in only 1 instruction */
bic r2, r12, r2 lsr #1 // (~b>>1) & 0x888888888888888
sub r2, r2, r3 asl #31
bic r3, r12, r3 lsr #1
/* Handles carry that we lost from the register due to 64 bit limitation during the addition of a+b but that was kept in the flag register of the ARM CPU and never modified since, if the carry is SET (i.e., there is a carry on the last digit), then we do NOT need to remove 6 from the last digit, and the bit corresponding to the need to remove that 6 is cleared from the register. */
subcs r3, r3, #0x80000000
/* The last step of the algorithm is to remove 6 (or remove 2 and remove 4; as 2+4=6) from each digit where there were no carry. For each digit, we have a bit (bit 3 to be precise) in variable b that is set if 6 needs to be removed from this digit. So, we need to remove b>>1 (correspond to carry bit*4) and b>>2 (correspond to carry bit *2) from the sum to get our result. Note that because we know that only 1 bit in each group of 4 bit is potentially set, there is no need to handle bit movement from higher 32 bits to lower 32 bits of b. If the ARM had 64 bit registers, this would be a non issue.*/
subs r4, r4, r2 lsr #2 // s-(b>>2)*3 = s-b<<2-b<<3
sbc r5, r5, r3 lsr #2
subs r0, r4, r2 lsr #1
sbc r1, r5, r3 lsr #1
END of Function
cte66666666: DC32 0x66666666
cte88888888: DC32 0x88888888
cte11111111: DC32 0x11111111
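
For readers who do not read ARM assembly, the steps described in the comments above can be sketched in C. This is only an illustration, not Cyrille's code: the name bcd16_add_sketch is made up, the unsigned comparison s < a stands in for the ARM carry flag, and both inputs are assumed to hold 16 valid BCD digits.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical C rendering of the steps in the ARM comments above.
 * Assumes a and b each hold 16 valid BCD digits (every nibble 0..9). */
uint64_t bcd16_add_sketch(uint64_t a, uint64_t b, bool *carry_out)
{
    a += 0x6666666666666666ULL;   /* preadjust: pretend every digit carries */
    uint64_t s = a + b;           /* binary sum, may wrap past 64 bits */
    bool carry = s < a;           /* wraparound check replaces the CPU carry flag */

    uint64_t x = a ^ b ^ s;       /* bit 4n of x = carry into digit n */

    /* Bit 3 of digit n is set when digit n did NOT carry out.
     * After the shift, bit 63 is always set, as in the ARM version. */
    uint64_t m = ~(x >> 1) & 0x8888888888888888ULL;
    if (carry)
        m -= 0x8000000000000000ULL;   /* top digit carried: keep its 6 */

    s -= m >> 1;                  /* remove 4 from each non-carrying digit */
    s -= m >> 2;                  /* remove 2 more, 6 in total */

    *carry_out = carry;
    return s;
}
```

For example, adding 0x15 and 0x27 gives 0x42 with no carry out, and adding 1 to 0x9999999999999999 gives 0 with the carry set.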


Re: New 12c loop count speed test
Message #41 Posted by Eric Smith on 1 May 2009, 12:36 a.m.,
in response to message #40 by Andrés C. Rodríguez (Argentina)

For comparison, here's the C code I sent to Cyrille that inspired his ARM assembly code:

uint64_t bcd15d_add (uint64_t a, uint64_t b, bool carry_in)
{
  if (carry_in)
    b++;
  a += 0x0666666666666666UL;            // preadjust as if carries occur
  uint64_t s = a + b;                   // compute the sum
  b = a ^ b;                            // find the carries
  b = ~(s ^ b) & 0x1111111111111110UL;  // compute mask for non-carries
  return s - ((b >> 2) | (b >> 3));     // subtract out 6 * non-carries
}
This works for 15-digit BCD addition, but not for 16 digits, because in C code there is no simple portable way to obtain the carry out from an addition of 64-bit unsigned integers. Cyrille wrote ARM assembly code based on this, and in ARM assembly it's easy to obtain the carry out.
Note that when given non-BCD input, neither the C code nor the ARM code will match the results given by the HP Nut or Saturn processors. My latest code first does an efficient parallel test for valid BCD digits of each operand, and chooses the fast BCD addition for valid operands, or a digit-by-digit method for invalid operands. I've sent the C code for the efficient parallel BCD test to Cyrille in case he wants to do something similar in other HP calculators such as the 50g.
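
The parallel validity test itself is not shown in this thread. As a rough idea of what such a test can look like, here is one common bit trick in C; it is an assumption on my part, not necessarily the code Eric sent, and the function name is invented. It relies on the fact that a nibble is greater than 9 exactly when bit 3 is set together with bit 2 or bit 1.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true when every 4-bit digit of x is in 0..9.
 * All 16 digits are checked at once, with no loop. */
bool bcd_all_digits_valid(uint64_t x)
{
    /* Shifting left by 1 lines bit 2 up with bit 3 of the same nibble;
     * shifting left by 2 does the same for bit 1. */
    uint64_t bit2_or_bit1 = (x << 1) | (x << 2);

    /* Keep only the bit 3 positions: any survivor marks a digit above 9. */
    return (x & bit2_or_bit1 & 0x8888888888888888ULL) == 0;
}
```

A caller could use this to choose between the fast parallel addition and a slower digit-by-digit path, along the lines described above.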


Re: New 12c loop count speed test
Message #42 Posted by Scott Newell on 1 May 2009, 6:43 p.m.,
in response to message #41 by Eric Smith

Quote:
For comparison, here's the C code I sent to Cyrille that inspired his ARM assembly code:

uint64_t bcd15d_add (uint64_t a, uint64_t b, bool carry_in)
{
  if (carry_in)
    b++;
  a += 0x0666666666666666UL;            // preadjust as if carries occur
  uint64_t s = a + b;                   // compute the sum
  b = a ^ b;                            // find the carries
  b = ~(s ^ b) & 0x1111111111111110UL;  // compute mask for non-carries
  return s - ((b >> 2) | (b >> 3));     // subtract out 6 * non-carries
}

This reminds me very much of the tricks described in the book "Hacker's Delight". (Web site at http://www.hackersdelight.org/)

Re: New 12c loop count speed test
Message #43 Posted by Eric Smith on 1 May 2009, 8:11 p.m.,
in response to message #42 by Scott Newell

Excellent book! I seem to have misplaced my copy (or maybe lent it out and forgotten?), so I might have to buy another one.

Re: New 12c loop count speed test
Message #44 Posted by J-F Garnier on 1 May 2009, 3:51 a.m.,
in response to message #40 by Andrés C. Rodríguez (Argentina)

When I started to design my Saturn emulation engine (for Emu71), back in 1995, I investigated the most efficient way to implement BCD operations on x86 processors in 16-bit "real" mode (Emu71 was, and still is, a pure 16-bit "DOS" program...). I quickly recognized that I had to handle the nibbles in packed form, i.e. 2 nibbles per byte to have some degrees of parallelism. I used the native BCD support of the Intel processors. The result is a quite efficient code for a 16-bit processor, and this explains most of the speed of Emu71:

; BCD 16 nibble addition: [dest] += [src]
; di points to dest, si points to src
_addd_w:
    mov ax,[di]
    mov bx,[si]
    add al,bl
    daa
    xchg al,ah
    adc al,bh
    daa
    xchg al,ah
    mov [di],ax
    mov ax,[di+2]
    mov bx,[si+2]
    adc al,bl
    daa
    xchg al,ah
    adc al,bh
    daa
    xchg al,ah
    mov [di+2],ax
    mov ax,[di+4]
    mov bx,[si+4]
    adc al,bl
    daa
    xchg al,ah
    adc al,bh
    daa
    xchg al,ah
    mov [di+4],ax
    mov ax,[di+6]
    mov bx,[si+6]
    adc al,bl
    daa
    xchg al,ah
    adc al,bh
    daa
    xchg al,ah
    mov [di+6],ax
    update_carry
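
For comparison with the parallel 64-bit method, the byte-at-a-time idea behind the add/daa/adc chain can be sketched in C. This is not Emu71 code: the function names are invented, the decimal adjust is written out by hand and only mimics what daa does for valid BCD digits, and the operands are assumed to be stored least significant byte first, two digits per byte.

```c
#include <stdint.h>

/* Hypothetical helper: adds two packed-BCD bytes plus a carry (0 or 1).
 * Assumes both bytes contain valid BCD digits. */
static uint8_t bcd_add_byte(uint8_t a, uint8_t b, unsigned *carry)
{
    unsigned lo = (a & 0x0Fu) + (b & 0x0Fu) + *carry;
    if (lo > 9)
        lo += 6;                          /* decimal adjust, low digit */

    unsigned hi = (a >> 4) + (b >> 4) + (lo >> 4);
    if (hi > 9)
        hi += 6;                          /* decimal adjust, high digit */

    *carry = hi >> 4;                     /* carry into the next byte */
    return (uint8_t)(((hi & 0x0Fu) << 4) | (lo & 0x0Fu));
}

/* dest += src over 8 bytes (16 nibbles), least significant byte first.
 * Returns the final carry out. */
unsigned bcd16_add_bytes(uint8_t dest[8], const uint8_t src[8])
{
    unsigned carry = 0;
    for (int i = 0; i < 8; i++)
        dest[i] = bcd_add_byte(dest[i], src[i], &carry);
    return carry;
}
```

The loop makes the serial carry dependency between bytes explicit, which is the part the parallel method avoids.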
If I had to rewrite Emu71 in 32-bit mode, I would do differently, maybe using Eric/Cyrille method. Is it a public domain method, or is it an innovative code of yours?
J-F
Edited: 1 May 2009, 3:54 a.m.

Re: New 12c loop count speed test
Message #45 Posted by Egan Ford on 1 May 2009, 11:27 a.m.,
in response to message #44 by J-F Garnier

Quote:
When I started to design my Saturn emulation engine (for Emu71), back in 1995 ...
A million thank yous for Emu71.
Quote:
If I had to rewrite Emu71 in 32-bit mode, I would do differently ...
And a million more if you rewrite it AND open source it. Given current processor speeds I would hesitate to require assembly optimizations in the event someone wanted to port it to a different architecture.

Re: New 12c loop count speed test
Message #46 Posted by Marcus von Cube, Germany on 1 May 2009, 11:32 a.m.,
in response to message #45 by Egan Ford

What makes EMU71 useful for me is the hardware HP-IL support. This would need to be ported as well. This is probably the hardest part of the deal: write a Win32 or Linux device driver for the hardware (HP's or Christoph Klug's or the yet to appear PIL box.)

Re: New 12c loop count speed test
Message #47 Posted by Egan Ford on 1 May 2009, 1:14 p.m.,
in response to message #46 by Marcus von Cube, Germany

USB-based PIL box. No problem. I think JFG has a plan, a great plan.
Floyd: "What's going to happen?", Bowman: "Something wonderful." -- 2010

Edited: 1 May 2009, 1:27 p.m.

Re: New 12c loop count speed test
Message #48 Posted by PeterP on 1 May 2009, 2:15 p.m.,
in response to message #45 by Egan Ford

[Start of Dream] What about an i71 for the iPhone? Another project could be to connect the iPhone to IL now that Apple has released the SDK for the connector pin... I don't know anything about hardware and frustratingly little about the 41/71 as well, but I think it would be a super cool project to connect the iPhone to HP-IL and have the i41X (or a yet to be written i71) connected to real devices... [End of Dream]
Cheers
Peter

Re: New 12c loop count speed test
Message #49 Posted by Egan Ford on 1 May 2009, 2:49 p.m.,
in response to message #48 by PeterP

I share your dream. My plans:

1. Get a PIL Box to work with Windows, Linux, Mac.
2. Leverage the work of Khanh-Dang Nguyen Thu Lam (http://pagesperso-orange.fr/kdntl/hp41/nonpareil-patch-doc.html) and create a TCP front-end to the PIL BOX. I'll also support the virtual IL devices created by KDNT. This way IL can be added to any emulator and TCP can be used to get to the PIL-BOX for physical devices.
3. Beg i41CX+ author to add TCP/IL support.


Re: New 12c loop count speed test
Message #50 Posted by PeterP on 1 May 2009, 3:05 p.m.,
in response to message #49 by Egan Ford

I'd be happy to help with 3.) (and any beta-testing that might be helpful).
[Dream Continues...

Re: New 12c loop count speed test
Message #51 Posted by Andrés C. Rodríguez (Argentina) on 1 May 2009, 7:29 p.m.,
in response to message #48 by PeterP

The HP-41 had real connectivity via HP-IL 25 years ago, even with data-acquisition units. I would like to say (rather fanatically, excuse me) that it had better real connectivity than the iPhone has today... OK, WiFi is nice on the iPhone side...
:-))

Re: New 12c loop count speed test
Message #52 Posted by David Hayden on 1 May 2009, 4:34 p.m.,
in response to message #23 by Katie Wasserman

Quote:
You have a hyper-speed 12C! I only get to about 30,000 in 60 seconds. The factor of 1.5 difference agrees with your statement several months ago that the new 12C runs 90 times faster, I (and Don too) found that it's "only" 60 times faster.
Because of the emulator, I wonder if the speed is sensitive to how the program is aligned in memory. There's a good chance that reading something on a 4-byte boundary is faster than reading something that starts at an odd nibble address.
Dave

Re: New 12c loop count speed test
Message #53 Posted by Don Shepherd on 28 Apr 2009, 10:16 p.m.,
in response to message #22 by Gene Wright

Like Katie, mine got to 30,382. Don

Re: New HP-12C Review
Message #54 Posted by Damir on 29 Apr 2009, 1:42 p.m.,
in response to message #1 by Katie Wasserman

Thanks. How many program steps?
DamirV
Edited: 29 Apr 2009, 2:24 p.m.

Re: New HP-12C Review
Message #55 Posted by Dewdman42 on 1 May 2009, 7:32 p.m.,
in response to message #54 by Damir

So has HP upgraded the 12cp at all? How does the current 12cp compare to this new 12c?

Re: New HP-12C Review
Message #56 Posted by Don Shepherd on 1 May 2009, 11:17 p.m.,
in response to message #55 by Dewdman42

I don't think the 12cp has been upgraded. I have the 25th anniversary edition. How does it compare to the new 12c? The cp has algebraic mode, 400 lines of program space, more cash flows, and x². The new 12c has pure RPN and raw speed.
