# HP Forums

Full Version: [VA] SRC #013 - Pi Day 2023 Special
You're currently viewing a stripped down version of our content. View the full version with proper formatting.
Pages: 1 2

Hi, all,

Just in case you hadn't noticed, today it's March, 14 aka $$\pi$$ Day, so

Happy Pi Day 2023 and Welcome to my SRC #13 - Pi Day 2023 Special

(We just enjoyed this nice Pi Day's Oreo cake for lunch, courtesy of my daughter Laura's haute cuisine abilities)

This SRC #13 is intended to commemorate once more this most ubiquitous constant, $$\pi$$. There are many other threads about Pi Day 2023 but this one is mine. After posting a number of threads over the years about $$\pi$$ Day, such as these, it would seem problematic to find new, intriguing appearances of the little critter but far from it, $$\pi$$ is well-nigh inexhaustible and to prove the point let me introduce a couple' additional appearances for your enjoyment, which will appear one after another so that you can focus on just one at a time. Let's begin with #1 ...
Note: No hard rules so no need for a parallel thread, post here whatever you want as long as it's on topic and NO CODE PANELS, but I'd appreciate it if you'd use vintage HP calcs (physical/virtual), otherwise I'll consider you to have failed the challenge whatever your results/timings.

1. Let's count ...

$$\pi$$'s value can be obtained by evaluating a plethora of transcendental functions, infinite summations and products, definite integrals, stochastic processes, etc., but if you don't remember any of them you can still get a nice approximation to the value of $$\pi$$ (exact as N goes to infinity) by following these simple steps:
1. Choose a positive integer N

2. Tally up how many integers in the range 1...N have no repeated prime factors

3. Output
For example, for N = 10 we find that the seven integers 1, 2, 3, 5, 6, 7 and 10 have no repeated prime factors, so Count = 7 and you get ~ 2.9277 as an approximation to $$\pi$$ (err ~ 6.8%).

Now write your very own program and try N = 12,345, 100,000, 567,890 and 1,000,000 to see if you get the following results, which I obtained using this little witty 4-line (217-byte) HP-71B program I wrote for the occasion (uses Math and JPC ROMs; 179 bytes without USING "image"):
1  DESTROY ALL @ INPUT T @ SETTIME 0 @ ...
2  ...
3  ...
4  DISP USING "2(3DC3DC3DC3D,2X),2(Z.8D,X),5DZ.2D";T,S,SQR(6*T/S),ABS(PI-RES),TIME

>RUN ->  ? 12345  -> ... , etc.

N      Count    $$\pi$$ approx    |Error|     go71b  Emu71/Win  Physical
@128x    @976x     HP-71B
------------------------------------------------------------------------
12,345    7,503  3.14198205  0.00038939    0.10"    0.01"       13"
100,000   60,794  3.14155933  0.00003333    0.28"    0.04"       36"
567,890  345,237  3.14158684  0.00000582    0.69"    0.09"    1' 28"
1,000,000  607,926  3.14159550  0.00000285    0.93"    0.12"    1' 59"
However, as the procedure is so simple, the difficulty here lies not so much in the programming as in the efficiency, thus the challenge consists mainly in achieving correct results in times as good or better than the ones given above, using exclusively vintage HP calcs, physical or virtual (indicate emulation's speed and timings for the virtual/physical calcs and try to avoid prematurely spoiling it all for other people.)

Well, see if you can deliver and, if feeling venturous and your calc is up to it, post also the results and timings for N = 10 million, 25 million and 33 million (which gives an approximation to $$\pi$$ correct to 8 digits.)

If I see interest, I'll post my original solution & comments in a few days and part #2 next April, 1st.

That's all. Any and all constructive and on-topic comments will be most welcome and appreciated.

V.
Edit: some errors corrected.
Hi Valentin,

Edit: I have corrected the formatting (thanks, Valentin for the explanation) and added a column for run times using the (much faster) Sys RPL version of the Möbius function written by Gerald H. This is more than twice as fast as the HP-71B times you have listed.

I attempted this because it is simpler than your usual problems and you were complaining about the lack of RPL solutions for your last challenge Using a brute force solution on a physical HP 50g, I get decent run times (13' 20'' for 33 million [5' 21" for the Sys RPL version]) as shown below. Although I am not sure if the 50g qualifies as a vintage calculator, of course.

Runtime (seconds)
Number        Count  Approximation  User RPL  Sys RPL
-------------  -----------  -------------  --------  -------
10            7  2.92770021885      0.44     0.29
12,345        7,503  3.14198204634     15.52     6.02
100,000       60,794  3.14155932716     45.08    16.93
567,890      345,237  3.14158683822    110.40    40.39
1,000,000      607,926  3.14159550063    147.82    54.99
10,000,000    6,079,291  3.14158749068    469.88   139.11
25,000,000   15,198,180  3.14159239999    692.35   277.43
33,000,000   20,061,593  3.14159276017    800.43   321.11
100,000,000   60,792,694  3.14159307180       n/a   448.08
1,000,000,000  607,927,124  3.14159259637       n/a  1453.24

I used the equation S(n) = Sum(i from 1 to sqrt(n); mu(i) * int(n / i * i)) to get the count of square free numbers less than or equal to 'n'. In the equation, mu is the Möbius function.

The code (VA is the problem solution, MOB is the Möbius function):
VA
« → n
« 0. 1. n √ IP
FOR i
i MOB n i SQ / IP * +
NEXT
DUP 6. n * SWAP / √
»
»

MOB (User RPL version)
« R→I
IF DUP 1 > THEN
FACTOR
IF DUP TYPE 9. SAME THEN
DUP →STR
IF "^" POS THEN
DROP 0
ELSE
SIZE 1. + 2. / 1 SWAP 2. MOD { NEG } IFT
END
ELSE
DROP -1
END
END
»

MOB (System RPL version)
::
CK1&Dispatch
# FF
::
{
ROTDROPSWAP
%1
EQUALcase
FPTR2 ^RNEGext
FPTR2 ^DROPZ0
}
1LAMBIND
::
FPTR2 ^ZAbs
FPTR2 ^DupQIsZero?
caseSIZEERR
FPTR2 ^DupZIsOne?
?SEMI
FPTR2 ^MZSQFF
#2/
ZINT 1
SWAP
ZERO_DO
1GETLAM
COMPEVAL
LOOP
;
ABND
;
;

Sudhir
In case you are not aware of it, our fellow member Gerald H wrote a very fast version of MOB which is in this thread.

I'm not sure whether external programs are allowed in these challenges but I do recall several HP-71 programs using LEX files, so I would think that they are fair game for other HP's as well.
.
Hi, John,

(03-16-2023 07:23 PM)John Keith Wrote: [ -> ]I'm not sure whether external programs are allowed in these challenges but I do recall several HP-71 programs using LEX files, so I would think that they are fair game for other HP's as well.

Absolutely. There are no rules other than no code panels, and using vintage HP models (such as the HP 50g) is highly encouraged (but not mandatory,) after all this is the Museum of HP Calculators ...

In short, external programs/libraries/LEX files/binaries/etc. are allowed and welcome.

Thanks for your interest and Regards.
V.
I'm not really contributing, I just want to say a public thank-you to Valentin from me and my wife. This topic was completely new to us. She (high school teacher of mathematics and programming) and her students had great fun playing with the approximation!
My initial attempt was, instead of counting the square-free numbers, to count the numbers that have a square prime factor, and to subtract this amount from N.
The count of numbers <=N having the 2^2 factor is IP(N/2^2), those having the factor 3^2 is IP(N/3^2), and so on,
so the count of square-free numbers should be S = N - IP(N/2^2) - IP(N/3^2) - IP(N/5^2) ...
In that way, I only had to explore primes up to SQR(N), so it was pretty fast.
Unfortunately, numbers that are multiple of several square primes such as (2*3)^2 are counted several times, and the result is underestimated:

20 N=12345 ! test case
30 S=N @ P=1
40 P=FPRIM(P+1) @ S=S-IP(N/P^2) @ IF P<SQR(N) THEN 40
50 DISP N;S

>RUN
12345 6793 (true result=7503)

I didn't find a way to easily manage the numbers with multiple square prime factors.
An improvement was to use the current sum S instead of N: S = N ; S=S-IP(S/2^2); S=S-IP(S/3^2) ... but it's still an approximation at most:

40 P=FPRIM(P+1) @ S=S-IP(S/P^2) @ IF P<SQR(N) THEN 40

>RUN
12345 7529 (true result=7503)

So, I resorted to the formula using the Möbius function as disclosed in the post #2 above.
Matter of fact, this formula starts with the same terms as my first attempt:
S = N - IP(N/2^2) - IP(N/3^2) - IP(N/5^2) ...
but "magically" manages the numbers with multiple square prime factors with terms such as +IP(N/(2*3)^2) !

Here is my implementation, (correct) results and timings on my Emu71/DOSBox, at about 150x speed:

10 ! SRC13
20 INPUT N
25 T=TIME
30 S=N @ FOR I=2 TO SQR(N) @ S=S+FNM(I)*IP(N/I^2) @ NEXT I
40 T=TIME-T
50 DISP N;S;T
80 !
90 DEF FNM(N) ! Moebius function
110 C=-1 @ Q=1
120 P=PRIM(N) @ IF P=0 THEN P=N
130 IF P=Q THEN FNM=0 @ END
140 IF P<N THEN C=-C @ N=N/P @ Q=P @ GOTO 120
150 FNM=C @ END DEF

>RUN
12345 7503 .23
10000 60794 .65
567890 345237 1.37
1000000 607926 1.8

However, I'm not fully satisfied by applying a formula without understanding how and why it works.

Now, I'm curious to read Valentin's solution and explanations !

J-F
(03-18-2023 10:21 AM)J-F Garnier Wrote: [ -> ]I didn't find a way to easily manage the numbers with multiple square prime factors.

I was trying the same approach using the inclusion-exclusion principle, i.e., that the true count would be given by:

count = n/2^2 + n/3^2 + n/5^2 + ...
- n/(2^2*3^2) - n/(2^2*5^2) - n(3^2*5^2) - ...
+ n/(2^2*3^2*5^2) + ...
- ...
or (rearranging)
count = n/2^2 - n/(2^2*3^2) - n(2^2*5^2) - ... + n/(2^2*3^2*5^2) + ... - n/(2^2*3^2*5^2*7^2) - ...

This involves generating all combinations of the squares of the primes but what makes it feasible is that the terms rapidly go to zero since the product of squares grows so rapidly and the search tree can be pruned whenever the product of the squares is greater than the input number. In fact, I do have an implementation that computes the count in a couple of seconds for 1e6 (if given a list of primes up to 1000) on a physical HP 50g. Unfortunately, there is bug in my backtracking process after pruning that I have not been able to resolve and the program does not generate the correct count after 1763 (1764=2^2*3^2*7^2 is the first number where the backtracking is needed).

Sudhir
.
Hi, 2old2randr, vaklaff and J-F Garnier,

(03-16-2023 10:16 AM)2old2randr Wrote: [ -> ][Sorry the tables and code have messed up formatting - I couldn't figure out how to get that right without using code blocks.

Here's how: specifying font 'Courier' and replacing spaces by the non-breaking character " ", like this:

Number      Count    Approximation Runtime (Seconds)
--------------------------------------------------------
10           7  2.92770021885    0.44
12,345       7,503  3.14198204634   15.52
100,000      60,794  3.14155932716   45.08
567,890     345,237  3.14158683822  110.40
1,000,000     607,926  3.14159550063  147.82
10,000,000   6,079,291  3.14158749068  469.88
25,000,000  15,198,180  3.14159239999  692.35
33,000,000  20,061,593  3.14159276017  800.43

2old2randr Wrote:I attempted this because it is simpler than your usual problems and you were complaining about the lack of RPL solutions for your last challenge

Certainly. Glad to see some brave RPL-user actually using his vintage RPL calc to solve my challenge. Much appreciated.

2old2randr Wrote:Using a brute force solution on a physical HP 50g, I get decent run times (max. of 13' 20'') as shown below. Although I am not sure if the 50g qualifies as a vintage calculator, of course.

As for your second statement, yes, the HP 50g fully qualifies as a vintage HP calculator, thus no problem. And BTW, all your results are correct.

vaklaff Wrote:I'm not really contributing, I just want to say a public thank-you to Valentin from me and my wife. This topic was completely new to us. She (high school teacher of mathematics and programming) and her students had great fun playing with the approximation!

Of course you're contributing ! In particular, to boost my morale, which helps me keep on creating and posting these challenges. Much appreciated, and glad to know that you and your wife (and her students !) enjoyed the topic and learned something new. Please give my best regards to your charming wife.

J-F Garnier Wrote:Here is my implementation, (correct) results and timings on my Emu71/DOSBox, at about 150x speed:[...]

Thanks for your continued interest and results, J-F. Please post your best estimates for the runtimes when using a physical HP-71B, for comparison purposes. Also, try and include the results/timings for N = 10, 25 and 33 million, if you can.

J-F Garnier Wrote:However, I'm not fully satisfied by applying a formula without understanding how and why it works. Now, I'm curious to read Valentin's solution and explanations !

However, in this post of yours in my recent SRC #012e - Then and Now: Roots thread, you did use Bornemann's formula, about which you posted, I quote:
"I didn't fully understand the underlying math, but was able to decipher the formula and translate it into 71B code".
Same here, right ?

As for reading my solution and explanations, I'll provide both but you know well that I don't usually give formal proofs, references or lengthy math lectures, so keep your expectations accordingly, Ok ?

I'll post my solution (actually two of them, optimized for different purposes) either next Sunday or Monday, around 10 PM GMT+1. Thanks to all of you for your interest and contributions.

Best regards.
V.
I could only find a recursive implementation for the inclusion/exclusion method but this turns out to be much slower than the earlier brute force solution (even given the list of primes a priori). Just for giggles, here are the run times for the first two numbers using this approach - I did not bother running for the others.

Number      Count    Approximation Runtime (Seconds)
--------------------------------------------------------
12,345       7,503  3.14198204634   140.36
100,000      60,794  3.14155932716   902.63

The code - in case someone wants to try converting it to an iterative solution which should be much faster.
« → number
« primes 2. ^ DUP SIZE 0 → squares nsquares count
« « → prefix startpos add?
« IF prefix number ≤ THEN
startpos nsquares FOR i
IF squares i GET number > THEN
nsquares 1 + 'i' STO @ exit loop
ELSE
prefix squares i GET * DUP
IF number > THEN
DROP
ELSE
DUP number SWAP / IP
IF add? THEN count + ELSE count SWAP - END 'count' STO
i 1 + add? NOT combinations EVAL
END
END
NEXT
END
»
» → combinations
« 1 1 1 combinations EVAL
number count - DUP number 6 * SWAP / √
»
»
»
»

Sudhir
(03-18-2023 11:37 PM)Valentin Albillo Wrote: [ -> ]
J-F Garnier Wrote:However, I'm not fully satisfied by applying a formula without understanding how and why it works.
However, in this post of yours in my recent SRC #012e - Then and Now: Roots thread, you did use Bornemann's formula
[..]
Same here, right ?

Understood !

Quote:Please post your best estimates for the runtimes when using a physical HP-71B, for comparison purposes. Also, try and include the results/timings for N = 10, 25 and 33 million, if you can.

Certainly.
Below are the timings for a HP-71B 1BBBB (636kHz), a HP-71B 2CDCC (650kHz) and Emu71/Win in Authentic Speed, respectively.
The speed of HP-71B can easily vary by 5 to 10% due to component tolerance and battery condition.
The 71B clock can be checked with the SYSTEM("CLOCK") command provided in the SYSTEMFN LEX or in my ULIB52 collection.
On the other hand, Emu71/Win in Authentic Calculator Speed mode gives reproducible timings (provided the system is not heavily loaded).

N    Count  Timings (71B/1B, 71B/2C, Emu71/Auth.)
12345   7503 26s/23s/24s
100000  60794 78s/70s/73s
567890 345237 194s/174s/182s
1000000 607926 260s/233s/244s

So your implementation seems to be about 2x faster than mine. I see some ways to gain maybe 25%, but not much more. Interesting.

More results on Emu71/Win only:
N    Count      PI approx.        Timings (fast/auth.)
10E6  6079291 3.14158749068 0.8s/813s
25E6 15198180 3.14159239999 1.2s/1308s
33E6 20061593 3.14159276017 1.4s/1511s

J-F
I have updated the run times in my original post using the Sys RPL version of the Möbius function written by Gerald H (thanks, John Keith). This turns out to be twice as fast as the times obtained on the physical HP-71B.

Sudhir
.
Hi, J-F and 2old2randr,

J-F Garnier Wrote:
I Wrote:[...] Same here, right ?

Understood !

Good.

J-F Garnier Wrote:Certainly.

Below are the timings for a HP-71B 1BBBB (636kHz), a HP-71B 2CDCC (650kHz) and Emu71/Win in Authentic Speed, respectively. [...] Emu71/Win in Authentic Calculator Speed mode gives reproducible timings [...] So your implementation seems to be about 2x faster than mine. I see some ways to gain maybe 25%, but not much more. Interesting.

More results on Emu71/Win only:

N    Count      PI approx.        Timings (fast/auth.)
10E6  6079291 3.14158749068 0.8s/813s
25E6 15198180 3.14159239999 1.2s/1308s
33E6 20061593 3.14159276017 1.4s/1511s

Very good, thanks. In reciprocity, here are more of my results for you, obtained with my original Solution 1 (my original Solution 2 is slower):
N         Count    $$\pi$$ approx    |Error|     go71b  Emu71/Win  Physical
@128x    @976x     HP-71B
----------------------------------------------------------------------------
10,000,000   6,079,291  3.14158749  0.00000516    3.03"    0.40"     6' 28"
25,000,000  15,198,180  3.14159240  0.00000025    4.87"    0.64"    10' 23"
33,000,000  20,061,593  3.14159276  0.00000011    5.61"    0.74"    11' 58"

1E8     60,792,694  3.14159307  0.00000042    9.93"    1.30"    21' 11"
1E9    607,927,124  3.14159260  0.00000006   32.45"    4.26"    69' 14"
It would seem that my results for Emu71/Win @ 976x are ~ 2x faster than yours but please provide the speed factor (976x in mine) for comparison purposes. Also and for the same reason, please provide your timings for N = 108 and 109, if at all possible.

2old2randr Wrote:I have updated the run times in my original post using the Sys RPL version of the Möbius function written by Gerald H (thanks, John Keith). This turns out to be twice as fast as the times obtained on the physical HP-71B.

Thanks but unless I'm mistaken (I don't know the first word about RPL, let alone Sys RPL) you didn't include the Sys RPL code in your edited post and I think you should, lest the posted code and timings aren't synchronized with one another.

Also, as I told J-F above, please provide your timings for N = 108 and 109, if at all possible.

Best regards.
V.
(03-20-2023 02:22 AM)Valentin Albillo Wrote: [ -> ]In reciprocity, here are more of my results for you, obtained with my original Solution 1 (my original Solution 2 is slower):
N         Count    $$\pi$$ approx    |Error|     go71b  Emu71/Win  Physical
@128x    @976x     HP-71B
----------------------------------------------------------------------------
10,000,000   6,079,291  3.14158749  0.00000516    3.03"    0.40"     6' 28"
25,000,000  15,198,180  3.14159240  0.00000025    4.87"    0.64"    10' 23"
33,000,000  20,061,593  3.14159276  0.00000011    5.61"    0.74"    11' 58"

1E8     60,792,694  3.14159307  0.00000042    9.93"    1.30"    21' 11"
1E9    607,927,124  3.14159260  0.00000006   32.45"    4.26"    69' 14"
It would seem that my results for Emu71/Win @ 976x are ~ 2x faster than yours but please provide the speed factor (976x in mine) for comparison purposes. Also and for the same reason, please provide your timings for N = 108 and 109, if at all possible.

Estimating the speed ratio of Emu71/Win on modern platforms is a subject by itself, but I estimate a peak ratio of ~1500x on my latest Ryzen5 laptop.

Quote:please provide your timings for N = 108 and 109, if at all possible.

Execution timings on the physical HP-71B are estimated in Emu71/Win Authentic mode.

N         Count     PI approx.  Timings (fast, auth.)
1E8      60792694  3.14159307180 2.2s 46min
1E9     607927124  3.14159259637 7.3s 2h32min
1E10   6079270942  3.14159267337  24s N/A
1E11  60792710280  3.14159265115  81s N/A
1E12 607927102274  3.14159265250 276s N/A

J-F
(03-20-2023 02:22 AM)Valentin Albillo Wrote: [ -> ]It would seem that my results for Emu71/Win @ 976x are ~ 2x faster than yours...

Since Valentin didn't post his solutions yet, here is my optimized code for speed.
It is much less readable than my first version, but still not as obscure as SysRPL :-)

10 ! SRC13 verC
20 INPUT N
50 T=TIME @ M=SQR(N) @ S=N-IP(N/4)-IP(N/9)
60 FOR I=4 TO M @ GOSUB 110 @ GOSUB 110 @ GOSUB 110 @ NEXT I
80 T=TIME-T @ DISP N;S;SQR(6*N/S);T
90 END
100 !
110 I=I+1 @ R=I @ C=-1 @ Q=1
120 P=PRIM(R) @ IF P=Q THEN RETURN ELSE IF P THEN C=-C @ R=R/P @ Q=P @ GOTO 120
130 IF R=Q THEN RETURN ELSE S=S+C*IP(N/(I*I)) @ RETURN

Execution times are now close to Valentin's results:

N      Count  Timings (HP-71B)
12345   7503 11s
100000  60794 35s
567890 345237 90s
1000000 607926 122s

More results:

N         Count     PI approx.  Timings (Emu71 fast, auth.)
1E7       6079291  3.14158749068 0.4s 6min59s
1E8      60792694  3.14159307180 1.3s 24min
1E9     607927124  3.14159259637 4.1s 81min
1E10   6079270942  3.14159267337  14s N/A
1E11  60792710280  3.14159265115  48s N/A
1E12 607927102274  3.14159265250 172s N/A

J-F

Hi, all,

About a week has elapsed since I posted this $$\pi$$ Day 2023 Special challenge, thank you very much to those few of you who contributed your valuable solutions and/or comments, namely 2old2randr, J-F Garnier, vaklaff and John Keith, really much appreciated.

On the one hand, it's a pity that it failed to attract the attention of other people usually interested in my productions but that's life. On the other hand, this time I got RPL (and even Sys RPL ! ) solutions, which were conspicuously absent from past challenges.

Now's the time for my detailed sleuthing process and resulting original solutions, plus additional comments:

My sleuthing process

First of all, if naively taking at face value the problem's statement, you'd might be tempted to just program a simple loop from 1 to N, compute the factorization for each number and tally up the ones which have repeated factors. This is as simple as it gets but for large N (say 1 million) it's grossly inefficient and here we're most interested in raw speed.

What to do ? Search in references or the Internet for a better algorithm, that's what, in particular one which is better than O(N), as any such loop will take ages for large N, even if nothing were done inside the loop. We need something which is at least O(√N), thus transforming a 1,000,000-cycle loop into a 1,000-cycle one.

To search for that, we notice that asking for numbers whose factorization has no repeated prime factors is tantamount to asking for numbers who aren't divisible by any square other than 1, which are usually called square-free integers, and a quick Google search for the term reveals many relevant links, among them one at Wolfram Research's MathWorld, where we find a formula with the required order (i.e. the upper limit of the summation goes up to √N,) equivalent to this one:

which also appears in text form as a(n) = Sum_{k = 1..floor(sqrt(n))} mu(k)*floor(n/k^2) at OEIS A013928 Number of (positive) squarefree numbers < n, and many other sites and documents. The simple proof of this formula can be obtained using the inclusion-exclusion principle.

As for the Möbius function, it's a very important number-theoretical function which is easily computed in various ways, like this simple HP-71B user-defined function FNM (which should be placed at the end of any program using it):

1  DEF FNM(N) @ IF N=1 THEN FNM=1 @ END ELSE F=1
2  D=PRIM(N) @ IF NOT D THEN FNM=(-1)^F ELSE IF MOD(N,D*D) THEN F=F+1 @ N=N/D @ GOTO 2

>FOR N=96673 TO 96686 @ FNM(N); @ NEXT N @@       ►  1 1 0 0 0 0 0 0 1 1 1 0 -1 -1
>S=0 @ FOR N=1 TO 1109 @ S=S+FNM(N) @ NEXT N @ S  ►  -15   { Mertens(1109) }

but the important thing here is that we only need √N evaluations of it instead of N, which makes all the difference in the world.

My original solutions

Yes, solutions, plural, as I created two of them, optimized for different cases. My first solution is optimized for speed, regardless of memory use, and uses the S(n) formula above, but instead of computing each individual value for the Möbius function inside the loop, it does compute all √N values en masse before starting the summation loop, which can be done without factoring, multiplications or divisions by using a trivial sieve procedure similar to the well-known ancient Sieve of Eratosthenes algorithm used to find all prime numbers up to a limit.

The sieve procedure used here runs fast, ultimately winning over for large N, and all needed values of Möbius(n) are left stored in an array to be later used inside the summation loop.

First solution: speed

This little 4-line, 217-byte HP-71B program implements the procedure, first sieving and storing in INTEGER array M all √N Möbius values needed, then computing the summation and outputting results and timings: (uses Math and JPC ROMs)

1  DESTROY ALL @ INPUT T @ SETTIME 0 @ N=SQR(T) @ INTEGER M(N) @ MAT M=CON @ P=2 @ WHILE P<=N
2  S=P*P @ FOR K=S TO N STEP S @ M(K)=0 @ NEXT K @ FOR K=P TO N STEP P @ M(K)=-M(K) @ NEXT K
3  P=FPRIM(P+1) @ END WHILE @ S=T @ FOR K=2 TO N @ S=S+M(K)*(T DIV (K*K)) @ NEXT K
4  DISP USING "2(3DC3DC3DC3D,2X),2(Z.8D,X),5DZ.2D";T,S,SQR(6*T/S),ABS(PI-RES),TIME
N         Count    $$\pi$$ approx    |Error|     go71b  Emu71/Win  Physical
@128x    @976x     HP-71B
----------------------------------------------------------------------------
12,345       7,503  3.14198205  0.00038939    0.10"    0.01"        13"
100,000      60,794  3.14155933  0.00003333    0.28"    0.04"        36"
567,890     345,237  3.14158684  0.00000582    0.69"    0.09"     1' 28"
1,000,000     607,926  3.14159550  0.00000285    0.93"    0.12"     1' 59"

10,000,000   6,079,291  3.14158749  0.00000516    3.03"    0.40"     6' 28"
25,000,000  15,198,180  3.14159240  0.00000025    4.87"    0.64"    10' 23"
33,000,000  20,061,593  3.14159276  0.00000011    5.61"    0.74"    11' 58"

1E8     60,792,694  3.14159307  0.00000042    9.93"    1.30"    21' 11"
1E9    607,927,124  3.14159260  0.00000006   32.45"    4.26"    69' 14"
The caveat is, storing all Möbius values in INTEGER array M does require large amounts of memory if N is huge, e.g.:
- For N = 1E8, the program uses ~ 30,055 bytes of RAM (M has 10,000 elements, 30,000 bytes)
- For N = 1E9, the program uses ~ 94,927 bytes of RAM (M has 31,623 elements, 94,869 bytes)

Larger values of N are problematic, e.g. N = 1E10 would require in excess of 300 Kb of RAM, at the very limit of what the HP-71B can manage, which leads us to my second solution, where I'll turn from computing all Möbius values en masse to computing them individually, thus avoiding the large memory requirements, albeit at the cost of speed.

Second solution: memory

This even smaller 3-line, 178-byte HP-71B program uses in-lined Möbius function code (as oposed to an user-defined function) for speed, and needs only ~ 63 bytes of RAM to run no matter the size of N, all the way to a trillion (1012) - 1, i.e. 999,999,999,999, which is the largest integer the HP-71B can represent: (uses JPC ROM)

1 DESTROY ALL @ INPUT T @ SETTIME 0 @ U=-1 @ S=T @ FOR K=2 TO SQR(T) @ N=K @ F=1
2 D=PRIM(N) @ IF NOT D THEN S=S+U^F*IP(T/(K*K)) ELSE IF MOD(N,D*D) THEN F=F+1 @ N=N/D @ GOTO 2
3 NEXT K @ DISP USING "2(3DC3DC3DC3D,2X),2(Z.8D,X),5DZ.2D";T,S,SQR(6*T/S),ABS(PI-RES),TIME
N         Count    $$\pi$$ approx    |Error|     go71b  Emu71/Win  Physical
@128x    @976x     HP-71B
----------------------------------------------------------------------------
12,345       7,503  3.14198205  0.00038939    0.09"    0.01"        12"
100,000      60,794  3.14155933  0.00003333    0.26"    0.03"        33"
567,890     345,237  3.14158684  0.00000582    0.67"    0.09"     1' 26"
1,000,000     607,926  3.14159550  0.00000285    0.91"    0.12"     1' 56"

10,000,000   6,079,291  3.14158749  0.00000516    3.15"    0.41"     6' 43"
25,000,000  15,198,180  3.14159240  0.00000025    5.15"    0.68"    10' 59"
33,000,000  20,061,593  3.14159276  0.00000011    5.98"    0.78"    12' 45"

1E8       60,792,694  3.14159307  0.00000042   10.81"    1.42"    23'  4"
1E9      607,927,124  3.14159260  0.00000006   36.93"    4.84"    78' 47"

1E10    6,079,270,942  3.14159267  0.00000002   2'  8"   16.82"     4h 33'
1E11   60,792,710,280  3.14159265  0.00000000*  7' 37"   59.95"    16h 15'
1E12-1 607,927,102,274  3.14159265  0.00000000* 28'  9"   3' 41"    60h  2'
* the 12-digit values are 3.14159265115, 0.00000000244 and 3.14159265250, 0.00000000109, respectively.

Thus, although this second solution can achieve the maximum possible integer range 1 .. 1012 - 1 with no memory problems and even seems to be a little faster for N up to 1 million, this is only because the HP-71B hardware and BASIC language are optimized for fast mathematical operations performed entirely in assembler without user loops or branching, which are pretty slow and thus slow down the sieving procedure.

This is why a matrix inversion (MAT INV) or a Fast Fourier Transform (FOUR) are performed at speeds competing with much faster CPUs, while an empty FOR-NEXT construct does only ~100 loops/sec. on a physical HP-71B. Even so, the second solution will get increasingly slower than the first one for large N.

● In the very distant past (2004), I posted the pair of threads:
where I pitted the HP-71B vs. the HP49G+, executing various quite hard computations, and the resulting times showed that the HP49G+ was usually about 4.6x to 6.4x faster than the HP-71B, so any timings must take this into account in order to ascertain the relative performance of the various solutions running on different hardware.

● You may be wondering why I selected N = 33,000,000 as a test case. It's an innocuous number in and of itself but I found it while reading this truly excellent book eons ago:
The Mathematical Experience by Phillip J. Davis and Reuben Hersh
(464 pages, ISBN-10: ‎ 9780395929681)
where they say:
"A more refined piece of natural scientific research into prime Numbers was reported in a paper by I. J. Good and R. F. Churchhouse in 1967 [...]"
and they elaborate in page 367, I quote:
"To check their probabilistic reasoning, Good and Churchhouse did some numerical work. [...] In a separate calculation, they found that the total number of zeros of μ(n) [VA: the Möbius function] for n between 0 and 33,000,000 is 12,938,407.

The "expected number" is 33.000.000*(1 -6/$$\pi$$2). which works out to 12,938.405.6. They call this "an astonishingly close fit, better than we deserved." A nonrigorous argument has predicted a mathematical result to 8 place accuracy. In physics or chemistry. experimental agreement with theory to 8 place accuracy would be regarded as a very strong confirmation of the theory. Here, also, it is impossible to believe that such agreement is accidental. The principle by which the calculation was made must be right.
"
I was sure it took them considerable time to get the tally for 33,000,000 using some 1967-era computer and wanted to know how long it would take a 1984-era handheld HP-71B to get the result, which is less than 12 min. for a physical machine and less than a second for a modern emulation (Emu71/Win).

Both timings would be reduced at least an order of magnitude if rewritten in assembler and run in the same hardware (perhaps J-F Garnier would obligue and create a MAT M=MOB matrix statement which would fill up array M with the results of the sieving procedure implemented in BASIC in my first solution ...   )

By the way, the aforementioned 1967 paper by I. J. Good and R. F. Churchhouse is:
The Riemann Hypothesis and Pseudo-random Features of the Möbius Sequence
where among other very interesting things, they say:
"Thus the Möbius sequence contains arbitrarily long runs of zeros, but these long runs presumably occur extremely rarely."
I showed above such a run of zeros from N = 96675 to 96680, six consecutive zeros in all. It would make for an interesting challenge to locate longer runs.

● These are the exact counts S(10N) for selected N:
N  S(10N)
-----------------------------------------
0  1
1  7
2  61
3  608
4  6083
5  60794
6  607926
7  6079291
8  60792694
9  607927124
10  6079270942
11  60792710280
12  607927102274

13  6079271018294
14  60792710185947
16  6079271018540405
18  607927101854022750
24  607927101854026628773299
30  607927101854026628663278087296
36  607927101854026628663276779463775476
6/$$\pi$$2 = 0.607927101854026628663276779258365833...
So you see, you only need to tally up numbers up to 1036 with no repeated prime factors and you'll get an approximation to $$\pi$$ correct to 27 decimal digits !

That's all for now, I hope you enjoyed it. Regrettably, due to a number of factors (pun intended,) I'm now taking a leave of absence from further challenges for a while. Cya.

V.
Edit: rephrased one sentence.
Nice challenge Valentin, based on an interesting finding! I've posted some reflections over here.

Hi, all,

Casually looking at my code for solution 1 I realized that I did overlook a small but obvious optimization, namely not trying to update the sum S if the Möbius value M(K) is zero.

The slightly modified code is now 225 bytes instead of 217, and runs about 5,4% faster:

1  DESTROY ALL @ INPUT T @ SETTIME 0 @ N=SQR(T) @ INTEGER M(N) @ MAT M=CON @ P=2 @ WHILE P<=N
2  S=P*P @ FOR K=S TO N STEP S @ M(K)=0 @ NEXT K @ FOR K=P TO N STEP P @ M(K)=-M(K) @ NEXT K
3  P=FPRIM(P+1) @ END WHILE @ S=T @ FOR K=2 TO N @ IF M(K) THEN S=S+M(K)*(T DIV (K*K))
4  NEXT K @ DISP USING "2(3DC3DC3DC3D,2X),2(Z.8D,X),5DZ.2D";T,S,SQR(6*T/S),ABS(PI-RES),TIME
N         Count    $$\pi$$ approx    |Error|     go71b  Emu71/Win  Physical
@128x    @976x     HP-71B
----------------------------------------------------------------------------
12,345       7,503  3.14198205  0.00038939    0.09"    0.01"        12"
100,000      60,794  3.14155933  0.00003333    0.27"    0.04"        35"
567,890     345,237  3.14158684  0.00000582    0.65"    0.09"     1' 23"
1,000,000     607,926  3.14159550  0.00000285    0.87"    0.11"     1' 51"

10,000,000   6,079,291  3.14158749  0.00000516    2.86"    0.38"     6'  6"
25,000,000  15,198,180  3.14159240  0.00000025    4.59"    0.60"     9' 48"
33,000,000  20,061,593  3.14159276  0.00000011    5.29"    0.69"    11' 17"

1E8     60,792,694  3.14159307  0.00000042    9.37"    1.23"    19' 59"
1E9    607,927,124  3.14159260  0.00000006   30.70"    4.03"    65' 30"
V.
BTW, the short paper The Riemann Hypothesis and Pseudorandom Features of the Möbius Sequence by Good and Churchhouse can be found here. (It’s an example of a large scale pure mathematics computation, in 1967, running in the background on the Atlas at Chilton. We’re not told the runtime.)
So this time we got two solutions from Valentin, plus as often interesting comments and references.
The first solution is nice and quite unexpected, but my preference goes to the second solution that has no limitation due to memory.

My last solution posted just a few hours before Valentin's final solutions probably didn't catch much attention.
Yet I was quite happy with the tricks I used to speed up my code by almost a factor of 2.

One trick was to skip the unnecessary evaluation of the Möbius function for all multiples of 4, i.e. one number over 4 or a 25% gain.
But this was not enough to beat Valentin's timings, and even more with his last 5% optimization of his fastest version.

So here is another attempt, pushing the optimisation further towards speed with just a slightly larger code, but no huge memory requirements.

5 ! SRC13 verE
10 INPUT N
15 T=TIME @ M=(SQR(N)+3) DIV 4*4 @ S=0
20 FOR I=M TO 5 STEP -1
25 I=I-1 @ R=I @ C=-1
30 P=PRIM(R) @ IF NOT P THEN S=S+C*(N DIV (I*I)) ELSE IF MOD(R,P*P) THEN C=-C @ R=R/P @ GOTO 30
35 I=I-1 @ R=I/2 @ C=1
40 P=PRIM(R) @ IF NOT P THEN S=S+C*(N DIV (I*I)) ELSE IF MOD(R,P*P) THEN C=-C @ R=R/P @ GOTO 40
45 I=I-1 @ R=I @ C=-1
50 P=PRIM(R) @ IF NOT P THEN S=S+C*(N DIV (I*I)) ELSE IF MOD(R,P*P) THEN C=-C @ R=R/P @ GOTO 50
55 NEXT I
60 S=S-(N DIV 9)-(N DIV 4)+N
65 T=TIME-T @ DISP N;S;SQR(6*N/S);T

Execution times are now closer ... no, better (slightly) than Valentin's results :-)

N      Count  Timings (HP-71B)
12345   7503 7.4s
100000  60794 24s
567890 345237 64s
1000000 607926 87s

More results on Emu71/Win, in fast and Authentic modes:

N         Count     PI approx.  Timings (Emu71 fast, auth.)
1E7       6079291  3.14158749068 0.3s 5min11s
1E8      60792694  3.14159307180 0.9s 18min8s
1E9     607927124  3.14159259637 3.1s 63min1s
1E10   6079270942  3.14159267337  11s N/A
1E11  60792710280  3.14159265115  39s N/A
1E12 607927102274  3.14159265250 144s N/A

Do we have to stop at 1E12?
No! The computing loop uses numbers from 2 to SQR(N) which is no problem.
Just the count must be carried carefully, starting from smallest terms, to the highest that can exceed 1E12.
Also some operations must be written carefully, for instance replacing IP(N/X) with N DIV X.
This explains the changes in the program above.

Of course, the final count will have only 12 significant digits, but if is correct within a few ULPs as expected, then we can get an approximation of pi with the same accuracy.

Results on Emu71/Win full speed only:

N         Count         PI approx.  Timings (Emu71 fast)
1E13 6.07927101830E12 3.14159265365 9min27s
1E14 6.07927101860E13 3.14159265357 39min
1E15 6.07927101854E14 3.14159265359 166min

J-F
Bravo!

I'm thinking, on a platform which lacks the very handy PRIM predicate, one might proceed first with a sieve and then lookup into that for the PRIM calls. It will potentially take a lot of storage, and packing the bits will slow down access.

It might be that saving the primes in this way is very similar to Valentin's program which records the Möbius values, although in that case I think we still need an implementation of FPRIM.
Pages: 1 2
Reference URL's
• HP Forums: https://www.hpmuseum.org/forum/index.php
• :