Threaded Mode | Linear Mode

C.Ret · (This post was last modified: 09-06-2018 04:41 AM by C.Ret.)

(09-04-2018 06:53 PM)Thomas Klemm Wrote: This took me a while to figure out.
Here's how to initialize the registers, say for m = 10,000.

4 STO 1 ; 4 instead of 5
10000 STO 2
7 STO 3 ; STO 3 instead of STO 1
200 STO 4

Otherwise register 1 is incremented in line 88 to 6 and then the heap starts there.
But in line 29 it's hard-coded to use register 5.

Good catch! You are right Thomas!

I have now to modify the explanations I gave in the Silicium forum.

Yes, register 5 is "hard coded" to be the root of the min-heap, and I mistakenly indicate to initiate it to 5. I was developing several versions of this code, having great troubles to make it fit into the 99 steps of program memory. A previous version (not published) increase R1 after storing the new square and prime value into register – so I mess up starting value needed in register R1.

Sorry for that and the wrong label for register R3 (I wrote 1 in the text when the correct register number is 3 as draw in the flow-chart). The first prime has to be put in register R3.

You may have spent a few time to figure out the bug! Please accept my apologies. If we meet, please remind me to offer you a large glass of beer (or any local beverage you prefer).

You also perfectly understand the philosophy of this algorithm. I try to reduce the number of tests needed to print the whole list of prims up to 10’000. Detecting prims by trying divisibility factors need a bunch of divisions. As you propose storing prime factor in memory is a great enhancement; the HP-19/29 has enough register to store all prime factors to achieve testing odd integer candidate up to 10’000.

In the same time, filtering candidate to test may reduce the effort. That what you made, generating a sequence {+4 +2 +4 +2 +4 +2 … } drastically reduce the number of composite candidates not prime. Unfortunately, detecting prime by successive factor divisibility is most efficient for composite (multiple) numbers; as soon as a divisor is found, we jump to next candidate. For prime, we must test all the possible factors along (i.e. up to the square root); no shortcut to next candidate.

Using a more sophisticated sequence, such as the one I have used ( i.e. {+4 +2 +4 +2 +4 +6 +2 +6 …} ) don’t help much; it reduce only by a few the “composite candidates” which are easy to detect and keep all the “long-testing true prime candidates” for divisibility tests.

This observation lead me to try to find another way for detecting prims. A way which may be as simple as a test. If the HP-19/29 had have more than 10’000 registers a sieve of Eratosthene may have been a easy way. So I look around to see if other sieve or tricky algorithms may help.

The sieve of Sundaram appears to be especially adapted as far as an adapted data structure can be bring out. As you correctly explain, we can't store in registers all the multiples tables produced by Sundaram’s algorithm. We only need to store the first value of all the sequences generated for each factor. We don’t need more register than for the divisibility testing method. But we need a convenient way to keep track of progress and to always know what the minimal value is. This can be a challenge and can ruin the effort.

A data structure can be of great advantage here, since it allows keeping track of minimum value with an optimal number of tests. The min-Heap structure is maintained so that his root (register 5) always contains the minimal composite candidate generated by any of the already encounted primes. Please note that this data-structure was only invented a few years before the HP-19C was built. Implementing a min-Heap in the HP-19/29 in 1972 may have been consider as really top new technology (Tip top Hype ! What was the words at this time ?)

By luck, the extra work to maintain the min-Heap will not ruin the effort because of two happy-facts:
First, we don’t need a full implementation of the min-Heap; since the Sundaram algorithm makes that any new sequence entry (for a new discovered prime factor) start as far as the square of this factor. So we don’t need “pop up” procedure, any new sequence entry can be securely add at the very end of the heap (pointed by value in register R1).
Second, the heap is short and not fully sorted. It is a binary structure, each node have two children’s which are greater (or egual). Less than 31 values mean that this binary tree is less than 4 levels high. When the tested candidate reach the min-heap root value, this value is increased by two times the factor. Then less than 4 values in the heap must be tested in order to replace (“push down”) the update root value.
In fact, only the top of the min-heap is active, middle range is quite passive (each level divide by 2 activity due to binary structure) and the bottom is a long time in stand-by until square values are reaches.

This made the code quite competitive despite the cost of extra test and the operations needed to maintain the heap in the case of a sophisticated filtered candidate. In contrary to the divisibility test methods, sieve methods didn’t lose the advantage of a few composite candidate number.

If we test all integers or all odd integers in a row, clearly all the over-costs for min-Heap operations make this algorithm slow. With sophisticated filtered candidate, the min-Heap method gains advantage since primality is detected with only one test. The way the Sundaram sequences are built spare operating time of the min-Heap.

But a lot of time is lost since the HP-19/29 have only one indirect register. Having two indirect registers may have simplify the operations for testing and swapping nodes in the binary heap and may have consequently greatly reduce running time.

Another difficulty is that the heap didn’t start at R0: or R1:, the off-set make the computation of node's and children's indices a bit Strange: j = 2*i or j = 2*i+1 is transformed as: j = (i-2)*2 or j = (i-2)*2+1

As Albert Chan notice it, my code ignore overlapping value. This is true and done this way expressly; overlapping value (i.e. same value in the sequence of different factors) intersecting the sophisticated filtered candidate sequence are quite rare and the expenses of code and supplementary tests needed to detect them didn’t enter the program memory. Moreover, due to the fact than the candidate are unequally spaced (+2 +4 or +6), for each candidate, more than one pre-compute multiple may have to be update. That why after the “min-Heap maintenance loop” the program returns to label 5 where actual candidate is tested again versus the new min-Heap root (aka Register 5) which may be an overlapping value or not. This make no difference. In theory, avoiding overlapping value will reduce heap stack operations number. In practice, even with more elaborated code no significant gain was observed.