I wish (ope?) I could be the soul of your thoughts... (V 1.2) Message #44 Posted by Vieira, Luiz C. (Brazil) on 11 May 2003, 1:28 p.m., in response to message #43 by glynn
Hi, Glynn. Thanks for the nice words!
I posted this message a few hours ago, but I have revised it for errors and posted it again, O.K.?
Hey, Glynn, let's first talk about a few important things: internal core, data bus and software precision.
Remember the ancient 8086 and its partner, the 8088? The 8086 was introduced before the 8088: the 8086 is a 16-bit data-bus processor, while the 8088 is an 8-bit data-bus processor. After introducing the 8086, Intel realized that all the commercially available 8-bit support architecture would have to change to accept the new 16-bit data bus, and that industry would take quite a while to incorporate it into product lines. We know industry takes some time (nowadays it's shorter) to absorb new technology and turn it into a consumer line. So Intel later introduced the 8088, and it sold a lot better. The 80286 took even better advantage of that.
In both cases, either the 8086 or the 8088 was able to run, say, C- and Pascal-related compilers, and both could handle a lot more than 8- or 16-bit processing. Dealing with BCD data and straight binary representation is just a matter of handling data. In both cases, BCD and straight binary, the 8088 and the 8086 already had their "hidden ace": both offer (and almost all general-purpose processors do, too) a carry bit for BCD and straight-binary math operations. The carry bit, also found in the HP-16C (the actual Das Kleine Wunder), allows "expanding borders" when dealing with the bigger "long-integer" or "double-precision" guys. Just the carry bit. And you may know that the HP-16C not only handles 64-bit numbers but also goes to 128 bits when performing [DBL÷], [DBL×] and [DBLR] (DouBLe Remainder after division). How come? It uses the Y and Z registers to hold a 128-bit input and the X and Y registers to hold a 128-bit result, if the word size is set to 64, the maximum available.
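Just to make the carry-bit idea concrete in modern terms, here is a minimal C sketch of chaining two 64-bit words into a 128-bit addition, the same trick the carry flag enables in hardware; the struct and the function names are purely illustrative, not anything taken from the 16C's microcode.

#include <stdint.h>
#include <stdio.h>

/* A hypothetical 128-bit value built from two 64-bit "registers",
   loosely mirroring the way the HP-16C pairs registers for the DBL
   operations. */
typedef struct {
    uint64_t hi;
    uint64_t lo;
} u128;

/* Add two 128-bit numbers one 64-bit word at a time, propagating the
   carry between words just as a processor's carry bit would. */
static u128 add128(u128 a, u128 b)
{
    u128 r;
    r.lo = a.lo + b.lo;
    uint64_t carry = (r.lo < a.lo) ? 1 : 0;  /* low-word overflow sets the "carry" */
    r.hi = a.hi + b.hi + carry;              /* carry chained into the high word */
    return r;
}

int main(void)
{
    u128 a = { 0x0000000000000000ULL, 0xFFFFFFFFFFFFFFFFULL };
    u128 b = { 0x0000000000000000ULL, 0x0000000000000001ULL };
    u128 s = add128(a, b);
    printf("hi = %016llx, lo = %016llx\n",
           (unsigned long long)s.hi, (unsigned long long)s.lo);
    return 0;
}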
But we know Voyagers deal with 56-bit-wide numbers. Why, then, is the HP-16C able to handle 128-bit data?
Software. And when you bind internal design to software, or even better, when you "design" the internal structure with a software goal in mind, your custom chips will show exceptional performance when running the software they were molded to run. And that's what we've been seeing in almost all HP (I mean "H" "P", actually) calculators: exceptional performance with a custom "chipset". Designing custom chips for RPN means designing chips with stack registers available, or with stack manipulation "made easy", just as we found in both the 8088 and the 8086 when talking about BCD-coded numbers and carry-bit handling. Imagine that the Voyager chips (one R2D2 for each model) already have stack manipulation "made easy" and register arithmetic "made easy" (except for the HP-16C, which does not offer register arithmetic; I suppose it would be hard to handle variable-size registers with arithmetic abilities...). Even in specific cases, like the HP-15C's matrix operations, optimizing them would be a matter of knowing the internal architecture well enough to make the best of it. And you may be sure the internal Voyager design had in mind not a general-purpose chipset, but a number-crunching processing unit with I/O restricted to the keyboard and LCD. I imagine it is a single eight-bit unit, though a heavily optimized four-bit one would also do the job all right.
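Just to picture what "stack manipulation made easy" means in software terms, here is a rough C sketch of the classic four-level RPN stack (X, Y, Z, T) with automatic lift and drop; it ignores stack-lift disable and other real Voyager subtleties, and every name in it is hypothetical.

#include <stdio.h>

/* A simplified four-level RPN stack, with Voyager-style register names. */
typedef struct {
    double x, y, z, t;
} Stack;

static void lift(Stack *s)              /* push: T is lost, everything moves up */
{
    s->t = s->z;  s->z = s->y;  s->y = s->x;
}

static void drop(Stack *s)              /* after a two-number op: T copies down */
{
    s->y = s->z;  s->z = s->t;
}

static void enter(Stack *s, double v)   /* key in a number */
{
    lift(s);
    s->x = v;
}

static void add(Stack *s)               /* X <- Y + X, stack drops */
{
    double r = s->y + s->x;
    drop(s);
    s->x = r;
}

int main(void)
{
    Stack s = { 0, 0, 0, 0 };
    enter(&s, 2.0);
    enter(&s, 3.0);
    add(&s);
    printf("X = %g\n", s.x);  /* prints 5 */
    return 0;
}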
Problems? Commercial problems! HP would never do what the company that sells the microprocessors used in the HP 9G (what's the name of it, d.. it? K...) is doing: offer their chips to be used elsewhere. The Voyager chips are so restricted, or so well focused, that using a Voyager processor to host an algebraic operating system would mean a complete redesign, perhaps a new project... precisely because they are not general-purpose, VLSI, 32-bit processors.
What caught my attention is that they have probably ported the existing HP-12C software and created a sort of morph-coding (translation) layer (does anybody know where Linus Torvalds is?) to convert the HP-12C's code to the internal ARM7 or similar. That would indeed degrade performance if the conversion needs to emulate the stack state for each single operation: single operations would be a lot faster, but loops and the like would slow down dramatically. If you use specific internal resources in programs, like solving cash-flow problems in a loop-controlled situation, your program will spend a lot more time on the Platinum than on a regular 12C. And simply increasing the clock will reduce battery life, so the HW/SW balance must be very well managed.
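Here is a back-of-the-envelope C sketch of why a per-operation translation layer hurts loops; the costs and the function names are pure assumptions, only meant to show that a fixed synchronization overhead per emulated opcode gets multiplied by every loop iteration.

#include <stdio.h>

/* Assumed, illustrative costs in microseconds per opcode. */
#define SYNC_COST_US   5    /* overhead to mirror the emulated stack state */
#define NATIVE_COST_US 1    /* cost of the operation itself */

static long native_cost(long opcodes_per_pass, long loop_iterations)
{
    return loop_iterations * opcodes_per_pass * NATIVE_COST_US;
}

static long emulated_cost(long opcodes_per_pass, long loop_iterations)
{
    /* every pass through the loop re-pays the sync overhead per opcode */
    return loop_iterations * opcodes_per_pass * (SYNC_COST_US + NATIVE_COST_US);
}

int main(void)
{
    long ops = 20;        /* a small cash-flow-style loop body */
    long iters = 1000;    /* a loop-controlled program */
    printf("native:   %ld us\n", native_cost(ops, iters));
    printf("emulated: %ld us\n", emulated_cost(ops, iters));
    return 0;
}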
There are other circumstances, but I believe these are the ones that directly affect overall performance. I'd like to invite others to join the discussion, as you wish, so we may consider the new technology more broadly. My last in-depth research went a bit beyond RISC internals, and I only briefly read about newer technology a couple of years ago. As you may notice, Glynn's knowledge is fairly up to date compared to mine.
This is my not-enough US$ 0.01 contribution.
Luiz C. Vieira - Brazil