50g Mini-Challenge: Number of positive divisors of x!
09-29-2017, 11:49 PM
Post: #1
 Joe Horn Senior Member Posts: 1,629 Joined: Dec 2013
50g Mini-Challenge: Number of positive divisors of x!
The HP 50g program << ! DIVIS SIZE >> returns the number of positive divisors of the input factorial. However, this brute-force program becomes impossibly slow for medium-sized inputs, and runs out of memory for any input that's interestingly large. Just look how this slows down:

<< 12 ! DIVIS SIZE >> returns 792 in 6.1 seconds
<< 13 ! DIVIS SIZE >> returns 1584 in 17.5 seconds
<< 14 ! DIVIS SIZE >> returns 2592 in 39.3 seconds
<< 15 ! DIVIS SIZE >> returns 4032 in 86.2 seconds!

(Unrelated note: Prime's CAS returns 4032 for size(idivis(15!)) in 0.7 seconds)

The winner of this challenge will be the 49G/49g+/50g RPL program that returns the exact number of positive divisors of X! the fastest. Testing will focus on large inputs.

Although half the fun of this challenge will consist in thinking up different ways of doing it (obviously avoiding the factorial and DIVIS functions), some algorithmic hints can be found here if you totally get stuck: https://oeis.org/A027423

Many thanks to Gerald H for his many OEIS-related postings and programs which inspired this mini-challenge.

<0|ɸ|0>
-Joe-
09-30-2017, 01:48 AM (This post was last modified: 09-30-2017 01:49 AM by Thomas Okken.)
Post: #2
 Thomas Okken Senior Member Posts: 993 Joined: Feb 2014
RE: 50g Mini-Challenge: Number of positive divisors of x!
I realize that this doesn’t qualify since it’s a 42S program, but I couldn’t resist:

Code:
00 { 67-Byte Prgm } 01▸LBL "PF" 02 STO 00 03 1 04 STO 01 05 STO 04 06 2 07 GTO 02 08▸LBL 00 09 2 10 STO+ 01 11 1 12 STO 02 13▸LBL 01 14 2 15 STO+ 02 16 RCL 02 17 X↑2 18 RCL 01 19 X<Y? 20 GTO 02 21 LASTX 22 MOD 23 X=0? 24 GTO 00 25 GTO 01 26▸LBL 02 27 STO 03 28 RCL 00 29 X<Y? 30 GTO 04 31 1 32 X<>Y 33▸LBL 03 34 RCL÷ 03 35 IP 36 STO+ ST Y 37 X≠0? 38 GTO 03 39 R↓ 40 STO× 04 41 GTO 00 42▸LBL 04 43 RCL 04 44 END

Constant memory use; run time something like O(n*sqrt(n)), I think, but that analysis is harder than writing the program. :-)
09-30-2017, 04:14 AM
Post: #3
 Joe Horn Senior Member Posts: 1,629 Joined: Dec 2013
RE: 50g Mini-Challenge: Number of positive divisors of x!
(09-30-2017 01:48 AM)Thomas Okken Wrote:  I realize that this doesn’t qualify since it’s a 42S program, but I couldn’t resist...

Wow, that's not only very fast, but Free42 (decimal version) even gets the full correct answers up to an input of 289 (of course, the SHOW key must be pressed to see the full answer). Good job!

<0|ɸ|0>
-Joe-
09-30-2017, 10:39 AM (This post was last modified: 09-30-2017 10:39 AM by Gerald H.)
Post: #4
 Gerald H Senior Member Posts: 1,442 Joined: May 2014
RE: 50g Mini-Challenge: Number of positive divisors of x!
Very nice Thomas.

On real 33s input 66 correctly processed in < 4s.
09-30-2017, 02:01 PM (This post was last modified: 10-01-2017 12:19 AM by Thomas Okken.)
Post: #5
 Thomas Okken Senior Member Posts: 993 Joined: Feb 2014
RE: 50g Mini-Challenge: Number of positive divisors of x!
HP-67: 52 → 8087040000 in 69 s.

(I really should get that card reader restored. It’s been 40 years since the last time I used a calculator that would forget everything when you turned it off!)

UPDATE:

HP-25: 52 → 8087040000 in 54 s. :-)

Code:
01     23 00  STO 0 02        01  1 03     23 01  STO 1 04     23 04  STO 4 05        02  2 06     13 24  GTO 24 07        02  2 08  23 51 01  STO + 1 09        01  1 10     23 02  STO 2 11        02  2 12  23 51 02  STO + 2 13     24 02  RCL 2 14     15 02  x^2 15     24 01  RCL 1 16     14 41  x<y 17     13 24  GTO 24 18     14 73  LASTx 19        71  ÷ 20     15 01  FRAC 21     15 71  x=0 22     13 07  GTO 07 23     13 11  GTO 11 24     23 03  STO 3 25     24 00  RCL 0 26     14 41  x<y 27     13 40  GTO 40 28        01  1 29     23 05  STO 5 30        22  R↓ 31     24 03  RCL 3 32        71  ÷ 33     14 01  INT 34  23 51 05  STO + 5 35     15 61  x≠0 36     13 31  GTO 31 37     24 05  RCL 5 38  23 61 04  STO × 4 39     13 07  GTO 07 40     24 04  RCL 4 41     13 00  GTO 00
09-30-2017, 02:26 PM
Post: #6
 Gerald H Senior Member Posts: 1,442 Joined: May 2014
RE: 50g Mini-Challenge: Number of positive divisors of x!
I have a 50g SysRPL programme that does 100 in 1.4s & bestfit gives the input time relationship as

t=0.0020*N^1.5191

I would love to see a 50g version of Thomas' programme but I can't get my head around the 42S version.

Code:
::   CK1&Dispatch   # FF   ::     PTR 2EF44     {       ROTDROPSWAP       %1+       FPTR2 ^R>Z       FPTR2 ^QMul     }     1LAMBIND     ::       FPTR2 ^ZAbs       FPTR2 ^DupQIsZero?       caseSIZEERR       FPTR2 ^DupZIsOne?       ?SEMI       FPTR2 ^MZSQFF       #2/       ZINT 1       SWAP       ZERO_DO       1GETLAM       COMPEVAL       LOOP     ;     ABND   ; ;
09-30-2017, 03:33 PM (This post was last modified: 09-30-2017 03:44 PM by Thomas Okken.)
Post: #7
 Thomas Okken Senior Member Posts: 993 Joined: Feb 2014
RE: 50g Mini-Challenge: Number of positive divisors of x!
My algorithm is based on the prime factorization of n!. When a number has prime factorization Π(i=1,j,p[i]^k[i]), i.e. it is the product of j distinct primes and each prime p[i] occurs k[i] times in the factorization, then the number of divisors is Π(i=1,j,k[i]+1).

To find this prime factorization of n!, realize that it can only contain primes that are less than or equal to n, so the search space is pleasantly small. And how often does a prime p occur in n!? Answer: floor(n/p) of the factors 1, 2, 3, ... , n are divisible by p; floor(n/p^2) are divisible by p^2, etc. Keep dividing n by p and rounding toward zero, until you reach zero; the sum of all the intermediate results equals the exponent k[i] of the prime p[i], and the number of possible contributions of p[i] to divisors of n! is thus k[i]+1. Multiply all those k[i]+1 factors together, and you end up with the total number of divisors.

The first part of the program finds primes using a simple but reasonably efficient algorithm, only checking divisors <= sqrt(p). Hence the O(n*sqrt(n)) part of my time estimate. Once a prime has been found, the repeated divisions take O(log(n)). Working this out more accurately would require dusting off my rusty calculus skills... Maybe later. :-)
09-30-2017, 04:29 PM
Post: #8
 Gerald H Senior Member Posts: 1,442 Joined: May 2014
RE: 50g Mini-Challenge: Number of positive divisors of x!
Thank you, Thomas.

Here a SysRPL programme following Thomas' suggestion which correctly processes 100 in

0.5604s

a definite improvement on my first attempt.

Size: 147.0000

CkSum: # FDBAh

Code:
::   CK1&Dispatch   # FF   ::     ZINT 1     SWAP     ZINT 0     BEGIN     FPTR2 ^Prime+     2DUP     Z>=     WHILE     ::       2DUP       ZINT 0       OVER       2SWAP       BEGIN       2DUP       FPTR2 ^ZQUOText       DUP       ZINT 0       Z>       WHILE       ::         5ROLL         FPTR2 ^RADDext         4UNROLL         3PICK         FPTR2 ^RMULText       ;       REPEAT       4DROP       ZINT 1       FPTR2 ^RADDext       4ROLL       FPTR2 ^RMULText       3UNROLL     ;     REPEAT     2DROP   ; ;
09-30-2017, 04:43 PM
Post: #9
 Gerald H Senior Member Posts: 1,442 Joined: May 2014
RE: 50g Mini-Challenge: Number of positive divisors of x!
Also the time characteristic is better, bestfit claims a linear relationship on the input.
09-30-2017, 08:17 PM
Post: #10
 Joe Horn Senior Member Posts: 1,629 Joined: Dec 2013
RE: 50g Mini-Challenge: Number of positive divisors of x!
(09-30-2017 04:29 PM)Gerald H Wrote:  Here a SysRPL programme following Thomas' suggestion which correctly processes 100 in 0.5604s, a definite improvement on my first attempt.
Size: 147.0000
CkSum: # FDBAh

Wow, very impressive! Two quick questions:

(1) FPTR 6 119 (^RADDext) and FPTR 6 118 (^QAdd) both resolve to address 6:5253B, so they are really identical functions. Is there a reason that you prefer using ^RADDext instead of ^QAdd in this program? Same question for the identical functions ^RMULText and ^QMul. (It might help if I knew what the "Q" and the "R" stand for in those function names.)

(2) Your second System RPL program evaluates an input of 100 in half a second, so when I keyed up the same program in User RPL (exactly the same, step-by-step, just in URPL), I expected it to run slower, but it ran MUCH slower than I anticipated. URPL is always slower than SysRPL, but this difference seems crazy:

320 SysRPL --> 1.5 seconds
320 URPL --> 18.6 seconds!

Why is this SysRPL program so much faster? Is it simply because SysRPL is always faster, especially when looping is involved?

<0|ɸ|0>
-Joe-
09-30-2017, 09:31 PM (This post was last modified: 09-30-2017 09:34 PM by Gerald H.)
Post: #11
 Gerald H Senior Member Posts: 1,442 Joined: May 2014
RE: 50g Mini-Challenge: Number of positive divisors of x!
The User programme below took

14.9141s

to process 320.

To question 1, ^RMULText etc is just habit, to q 2 I have surely less idea than you.

How great a speed improvement occurs is variable, loops do seem to go much faster in Sys.

Code:
« 1 SWAP 0   WHILE NEXTPRIME DUP2 ≥   REPEAT DUP2 0 OVER 4 ROLL 4 ROLL     WHILE DUP2 IQUOT DUP 0 >     REPEAT 5 ROLL + 4 ROLLD PICK3 *     END 4 DROPN 1 + 4 ROLL * UNROT   END DROP2 »
09-30-2017, 09:52 PM
Post: #12
 Arno K Senior Member Posts: 432 Joined: Mar 2015
RE: 50g Mini-Challenge: Number of positive divisors of x!
So for now I wrote my own in URPL, 15 takes 1.7614 seconds, 100 needs 15,2s and 320 nearly 75 seconds. seems I need to improve my code a bit
Code:
 %%HP: T(3)A(R)F(.); \<< 1 OVER NDUPN \->LIST \-> L   \<< 2     FOR K K FACTORS REVLIST OBJ\-> 2 / 1 SWAP       START L OVER GET ROT + L UNROT PUT 'L' STO       NEXT -1     STEP L DUP \PILIST   \>> \>>
moving the list to memory and back is the main slowdown I think.
Arno
09-30-2017, 10:52 PM
Post: #13
 DavidM Senior Member Posts: 768 Joined: Dec 2013
RE: 50g Mini-Challenge: Number of positive divisors of x!
(09-30-2017 08:17 PM)Joe Horn Wrote:  (2) Your second System RPL program evaluates an input of 100 in half a second, so when I keyed up the same program in User RPL (exactly the same, step-by-step, just in URPL), I expected it to run slower, but it ran MUCH slower than I anticipated. URPL is always slower than SysRPL, but this difference seems crazy:

320 SysRPL --> 1.5 seconds
320 URPL --> 18.6 seconds!

Why is this SysRPL program so much faster? Is it simply because SysRPL is always faster, especially when looping is involved?

In addition to the inherent loop structure speedup that SysRPL gives, I think if you look at what's inside those loops you'll see another reason for the speed advantage that Gerald's SysRPL version has: lots of "combo" commands that directly manipulate the stack without having to validate the arguments given. 5ROLL, 4UNROLL, 2DROP, etc. go about their business without bothering to check that the stack can accommodate their action or that the numeric constant applies. So in many of the "equivalent" UserRPL steps, you've not only got the usual overhead for argument count/type checking (and LASTARG/cached stack saving), you've also got additional RPL stream maintenance and stack validation occurring that aren't present when executing the SysRPL code.

Those combo commands for manipulating the stack can be a pain to remember in SysRPL but they really pay off sometimes in inner loops.
10-01-2017, 02:35 AM (This post was last modified: 10-01-2017 02:37 AM by Paul Dale.)
Post: #14
 Paul Dale Senior Member Posts: 1,616 Joined: Dec 2013
RE: 50g Mini-Challenge: Number of positive divisors of x!
This one is cheating but it is very very fast:

<< [ 1 2 4 8 16 30 60 96 160 270 540 792 1584 2592 4032 5376 10752 14688 29376 41040 60800 96000 192000 242880 340032 532224 677376 917280 1834560 2332800 4665600 5529600 7864320 12165120 16422912 19595520 39191040 60466176 85100544 102435840 204871680 258048000 516096000 677376000 819624960 1258709760 2517419520 2876670720 3698576640 4464046080 6210846720 8087040000 16174080000 18559756800 23984916480 28217548800 39016857600 59609088000 119218176000 137106432000 274212864000 418535424000 492139929600 543050956800 695105224704 850195906560 1700391813120 2190889451520 3012472995840 3543845437440 7087690874880 7848891187200 15697782374400 23878316851200 ] SWAP GET >>

And it works on a 28S. The next value in the sequence cannot be represented exactly, so I stopped.

Pauli
10-01-2017, 02:48 AM
Post: #15
 Joe Horn Senior Member Posts: 1,629 Joined: Dec 2013
RE: 50g Mini-Challenge: Number of positive divisors of x!
(09-30-2017 09:31 PM)Gerald H Wrote:  The User programme below took 14.9141s to process 320.

Code:
« 1 SWAP 0   WHILE NEXTPRIME DUP2 ≥   REPEAT DUP2 0 OVER 4 ROLL 4 ROLL     WHILE DUP2 IQUOT DUP 0 >     REPEAT 5 ROLL + 4 ROLLD PICK3 *     END 4 DROPN 1 + 4 ROLL * UNROT   END DROP2 »

Exactly the same program* on my 50g takes 18.6 seconds for an input of 320, so your 50g must be faster than mine. *"Exactly the same" ... well almost; I always use reals for the arguments to stack commands (e.g. 4. ROLL instead of 4 ROLL) because it's a tad faster that way. With integers in their place, it takes 0.3 seconds longer for an input of 320.

(09-30-2017 10:52 PM)DavidM Wrote:  In addition to the inherent loop structure speedup that SysRPL gives, I think if you look at what's inside those loops you'll see another reason for the speed advantage that Gerald's SysRPL version has: lots of "combo" commands that directly manipulate the stack without having to validate the arguments given. 5ROLL, 4UNROLL, 2DROP, etc. go about their business without bothering to check that the stack can accommodate their action or that the numeric constant applies. So in many of the "equivalent" UserRPL steps, you've not only got the usual overhead for argument count/type checking (and LASTARG/cached stack saving), you've also got additional RPL stream maintenance and stack validation occurring that aren't present when executing the SysRPL code.

Those combo commands for manipulating the stack can be a pain to remember in SysRPL but they really pay off sometimes in inner loops.

Good point. In his System RPL book, Jim Donnelly wrote, "The System-RPL example will run faster than the User-RPL program, because all the argument checking code has been bypassed." Nice to know that the speed increase can be this dramatic!

<0|ɸ|0>
-Joe-
10-01-2017, 05:09 AM
Post: #16
 Gerald H Senior Member Posts: 1,442 Joined: May 2014
RE: 50g Mini-Challenge: Number of positive divisors of x!
(10-01-2017 02:48 AM)Joe Horn Wrote:  Exactly the same program* on my 50g takes 18.6 seconds for an input of 320, so your 50g must be faster than mine. *"Exactly the same" ... well almost; I always use reals for the arguments to stack commands (e.g. 4. ROLL instead of 4 ROLL) because it's a tad faster that way. With integers in their place, it takes 0.3 seconds longer for an input of 320.

Yes, down from 14.9s to 14.6s with real arguments for the stack commands
10-01-2017, 02:09 PM
Post: #17
 John Keith Senior Member Posts: 541 Joined: Dec 2013
RE: 50g Mini-Challenge: Number of positive divisors of x!
(09-30-2017 10:52 PM)DavidM Wrote:  In addition to the inherent loop structure speedup that SysRPL gives, I think if you look at what's inside those loops you'll see another reason for the speed advantage that Gerald's SysRPL version has: lots of "combo" commands that directly manipulate the stack without having to validate the arguments given. 5ROLL, 4UNROLL, 2DROP, etc. go about their business without bothering to check that the stack can accommodate their action or that the numeric constant applies. So in many of the "equivalent" UserRPL steps, you've not only got the usual overhead for argument count/type checking (and LASTARG/cached stack saving), you've also got additional RPL stream maintenance and stack validation occurring that aren't present when executing the SysRPL code.

Those combo commands for manipulating the stack can be a pain to remember in SysRPL but they really pay off sometimes in inner loops.

A related issue is that User RPL stack commands that take a numeric argument (PICK, ROLL, etc) take about 7 times as long as simple stack commands such as SWAP or ROT. I go to great lengths to avoid using those slow commands inside loops.

Also, WHILE and UNTIL loops are slower than START loops in User RPL whereas WHILE is very fast in SysRPL.

John
10-01-2017, 03:47 PM
Post: #18
 pier4r Senior Member Posts: 2,019 Joined: Nov 2014
RE: 50g Mini-Challenge: Number of positive divisors of x!
(10-01-2017 02:09 PM)John Keith Wrote:  A related issue is that User RPL stack commands that take a numeric argument (PICK, ROLL, etc) take about 7 times as long as simple stack commands such as SWAP or ROT. I go to great lengths to avoid using those slow commands inside loops.

Also, WHILE and UNTIL loops are slower than START loops in User RPL whereas WHILE is very fast in SysRPL.

John

John it seems that you have some knowledge of speed of some commands (considering also the other thread where you today wrote that CEIL is faster than 1 + IP ). By chance do you have some measurements in time for each command tested by you?

Wikis are great, Contribute :)
10-02-2017, 05:30 AM
Post: #19
 Joe Horn Senior Member Posts: 1,629 Joined: Dec 2013
RE: 50g Mini-Challenge: Number of positive divisors of x!
(10-01-2017 03:47 PM)pier4r Wrote:  John it seems that you have some knowledge of speed of some commands (considering also the other thread where you today wrote that CEIL is faster than 1 + IP ). By chance do you have some measurements in time for each command tested by you?

Note that you can determine such timings yourself using the 50g's TEVAL command. For example, put << 1. 1000. START 1.5 DROP NEXT >> on the stack and execute TEVAL to time it. Now insert either CEIL or 1. + IP between the 1.5 and the DROP, and use TEVAL to time each. Divide the times by 1000 to see the difference. I keep << -40 CF MEM DROP 0.5 WAIT TEVAL -40 SF >> assigned to a key for accurate timings.

<0|ɸ|0>
-Joe-
10-02-2017, 06:43 AM
Post: #20
 Gerald H Senior Member Posts: 1,442 Joined: May 2014
RE: 50g Mini-Challenge: Number of positive divisors of x!
(10-02-2017 05:30 AM)Joe Horn Wrote:
(10-01-2017 03:47 PM)pier4r Wrote:  John it seems that you have some knowledge of speed of some commands (considering also the other thread where you today wrote that CEIL is faster than 1 + IP ). By chance do you have some measurements in time for each command tested by you?

Note that you can determine such timings yourself using the 50g's TEVAL command. For example, put << 1. 1000. START 1.5 DROP NEXT >> on the stack and execute TEVAL to time it. Now insert either CEIL or 1. + IP between the 1.5 and the DROP, and use TEVAL to time each. Divide the times by 1000 to see the difference. I keep << -40 CF MEM DROP 0.5 WAIT TEVAL -40 SF >> assigned to a key for accurate timings.

Why do you have MEM in the programme? TEVAL does garbage collection.
What's the point of 0.5 WAIT?
 « Next Oldest | Next Newest »

User(s) browsing this thread: 1 Guest(s)