Post Reply 
50g Mini-Challenge: Number of positive divisors of x!
09-29-2017, 11:49 PM
Post: #1
50g Mini-Challenge: Number of positive divisors of x!
The HP 50g program << ! DIVIS SIZE >> returns the number of positive divisors of the input factorial. However, this brute-force program becomes impossibly slow for medium-sized inputs, and runs out of memory for any input that's interestingly large. Just look how this slows down:

<< 12 ! DIVIS SIZE >> returns 792 in 6.1 seconds
<< 13 ! DIVIS SIZE >> returns 1584 in 17.5 seconds
<< 14 ! DIVIS SIZE >> returns 2592 in 39.3 seconds
<< 15 ! DIVIS SIZE >> returns 4032 in 86.2 seconds!

(Unrelated note: Prime's CAS returns 4032 for size(idivis(15!)) in 0.7 seconds)

The winner of this challenge will be the 49G/49g+/50g RPL program that returns the exact number of positive divisors of X! the fastest. Testing will focus on large inputs.

Although half the fun of this challenge will consist in thinking up different ways of doing it (obviously avoiding the factorial and DIVIS functions), some algorithmic hints can be found here if you totally get stuck: https://oeis.org/A027423

Many thanks to Gerald H for his many OEIS-related postings and programs which inspired this mini-challenge.

X<> c
-Joe-
Visit this user's website Find all posts by this user
Quote this message in a reply
09-30-2017, 01:48 AM (This post was last modified: 09-30-2017 01:49 AM by Thomas Okken.)
Post: #2
RE: 50g Mini-Challenge: Number of positive divisors of x!
I realize that this doesn’t qualify since it’s a 42S program, but I couldn’t resist:

Code:
00 { 67-Byte Prgm }
01▸LBL "PF"
02 STO 00
03 1
04 STO 01
05 STO 04
06 2
07 GTO 02
08▸LBL 00
09 2
10 STO+ 01
11 1
12 STO 02
13▸LBL 01
14 2
15 STO+ 02
16 RCL 02
17 X↑2
18 RCL 01
19 X<Y?
20 GTO 02
21 LASTX
22 MOD
23 X=0?
24 GTO 00
25 GTO 01
26▸LBL 02
27 STO 03
28 RCL 00
29 X<Y?
30 GTO 04
31 1
32 X<>Y
33▸LBL 03
34 RCL÷ 03
35 IP
36 STO+ ST Y
37 X≠0?
38 GTO 03
39 R↓
40 STO× 04
41 GTO 00
42▸LBL 04
43 RCL 04
44 END

Constant memory use; run time something like O(n*sqrt(n)), I think, but that analysis is harder than writing the program. :-)
Find all posts by this user
Quote this message in a reply
09-30-2017, 04:14 AM
Post: #3
RE: 50g Mini-Challenge: Number of positive divisors of x!
(09-30-2017 01:48 AM)Thomas Okken Wrote:  I realize that this doesn’t qualify since it’s a 42S program, but I couldn’t resist...

Wow, that's not only very fast, but Free42 (decimal version) even gets the full correct answers up to an input of 289 (of course, the SHOW key must be pressed to see the full answer). Good job!

X<> c
-Joe-
Visit this user's website Find all posts by this user
Quote this message in a reply
09-30-2017, 10:39 AM (This post was last modified: 09-30-2017 10:39 AM by Gerald H.)
Post: #4
RE: 50g Mini-Challenge: Number of positive divisors of x!
Very nice Thomas.

On real 33s input 66 correctly processed in < 4s.
Find all posts by this user
Quote this message in a reply
09-30-2017, 02:01 PM (This post was last modified: 10-01-2017 12:19 AM by Thomas Okken.)
Post: #5
RE: 50g Mini-Challenge: Number of positive divisors of x!
HP-67: 52 → 8087040000 in 69 s.

(I really should get that card reader restored. It’s been 40 years since the last time I used a calculator that would forget everything when you turned it off!)

UPDATE:

HP-25: 52 → 8087040000 in 54 s. :-)

Code:
01     23 00  STO 0
02        01  1
03     23 01  STO 1
04     23 04  STO 4
05        02  2
06     13 24  GTO 24
07        02  2
08  23 51 01  STO + 1
09        01  1
10     23 02  STO 2
11        02  2
12  23 51 02  STO + 2
13     24 02  RCL 2
14     15 02  x^2
15     24 01  RCL 1
16     14 41  x<y
17     13 24  GTO 24
18     14 73  LASTx
19        71  ÷
20     15 01  FRAC
21     15 71  x=0
22     13 07  GTO 07
23     13 11  GTO 11
24     23 03  STO 3
25     24 00  RCL 0
26     14 41  x<y
27     13 40  GTO 40
28        01  1
29     23 05  STO 5
30        22  R↓
31     24 03  RCL 3
32        71  ÷
33     14 01  INT
34  23 51 05  STO + 5
35     15 61  x≠0
36     13 31  GTO 31
37     24 05  RCL 5
38  23 61 04  STO × 4
39     13 07  GTO 07
40     24 04  RCL 4
41     13 00  GTO 00
Find all posts by this user
Quote this message in a reply
09-30-2017, 02:26 PM
Post: #6
RE: 50g Mini-Challenge: Number of positive divisors of x!
I have a 50g SysRPL programme that does 100 in 1.4s & bestfit gives the input time relationship as

t=0.0020*N^1.5191

I would love to see a 50g version of Thomas' programme but I can't get my head around the 42S version.

Thomas, could you please explain your algorithm?

Code:
::
  CK1&Dispatch
  # FF
  ::
    PTR 2EF44
    {
      ROTDROPSWAP
      %1+
      FPTR2 ^R>Z
      FPTR2 ^QMul
    }
    1LAMBIND
    ::
      FPTR2 ^ZAbs
      FPTR2 ^DupQIsZero?
      caseSIZEERR
      FPTR2 ^DupZIsOne?
      ?SEMI
      FPTR2 ^MZSQFF
      #2/
      ZINT 1
      SWAP
      ZERO_DO
      1GETLAM
      COMPEVAL
      LOOP
    ;
    ABND
  ;
;
Find all posts by this user
Quote this message in a reply
09-30-2017, 03:33 PM (This post was last modified: 09-30-2017 03:44 PM by Thomas Okken.)
Post: #7
RE: 50g Mini-Challenge: Number of positive divisors of x!
My algorithm is based on the prime factorization of n!. When a number has prime factorization Π(i=1,j,p[i]^k[i]), i.e. it is the product of j distinct primes and each prime p[i] occurs k[i] times in the factorization, then the number of divisors is Π(i=1,j,k[i]+1).

To find this prime factorization of n!, realize that it can only contain primes that are less than or equal to n, so the search space is pleasantly small. And how often does a prime p occur in n!? Answer: floor(n/p) of the factors 1, 2, 3, ... , n are divisible by p; floor(n/p^2) are divisible by p^2, etc. Keep dividing n by p and rounding toward zero, until you reach zero; the sum of all the intermediate results equals the exponent k[i] of the prime p[i], and the number of possible contributions of p[i] to divisors of n! is thus k[i]+1. Multiply all those k[i]+1 factors together, and you end up with the total number of divisors.

The first part of the program finds primes using a simple but reasonably efficient algorithm, only checking divisors <= sqrt(p). Hence the O(n*sqrt(n)) part of my time estimate. Once a prime has been found, the repeated divisions take O(log(n)). Working this out more accurately would require dusting off my rusty calculus skills... Maybe later. :-)
Find all posts by this user
Quote this message in a reply
09-30-2017, 04:29 PM
Post: #8
RE: 50g Mini-Challenge: Number of positive divisors of x!
Thank you, Thomas.

Here a SysRPL programme following Thomas' suggestion which correctly processes 100 in

0.5604s

a definite improvement on my first attempt.

Size: 147.0000

CkSum: # FDBAh

Code:
::
  CK1&Dispatch
  # FF
  ::
    ZINT 1
    SWAP
    ZINT 0
    BEGIN
    FPTR2 ^Prime+
    2DUP
    Z>=
    WHILE
    ::
      2DUP
      ZINT 0
      OVER
      2SWAP
      BEGIN
      2DUP
      FPTR2 ^ZQUOText
      DUP
      ZINT 0
      Z>
      WHILE
      ::
        5ROLL
        FPTR2 ^RADDext
        4UNROLL
        3PICK
        FPTR2 ^RMULText
      ;
      REPEAT
      4DROP
      ZINT 1
      FPTR2 ^RADDext
      4ROLL
      FPTR2 ^RMULText
      3UNROLL
    ;
    REPEAT
    2DROP
  ;
;
Find all posts by this user
Quote this message in a reply
09-30-2017, 04:43 PM
Post: #9
RE: 50g Mini-Challenge: Number of positive divisors of x!
Also the time characteristic is better, bestfit claims a linear relationship on the input.
Find all posts by this user
Quote this message in a reply
09-30-2017, 08:17 PM
Post: #10
RE: 50g Mini-Challenge: Number of positive divisors of x!
(09-30-2017 04:29 PM)Gerald H Wrote:  Here a SysRPL programme following Thomas' suggestion which correctly processes 100 in 0.5604s, a definite improvement on my first attempt.
Size: 147.0000
CkSum: # FDBAh

Wow, very impressive! Two quick questions:

(1) FPTR 6 119 (^RADDext) and FPTR 6 118 (^QAdd) both resolve to address 6:5253B, so they are really identical functions. Is there a reason that you prefer using ^RADDext instead of ^QAdd in this program? Same question for the identical functions ^RMULText and ^QMul. (It might help if I knew what the "Q" and the "R" stand for in those function names.)

(2) Your second System RPL program evaluates an input of 100 in half a second, so when I keyed up the same program in User RPL (exactly the same, step-by-step, just in URPL), I expected it to run slower, but it ran MUCH slower than I anticipated. URPL is always slower than SysRPL, but this difference seems crazy:

320 SysRPL --> 1.5 seconds
320 URPL --> 18.6 seconds!

Why is this SysRPL program so much faster? Is it simply because SysRPL is always faster, especially when looping is involved?

Always eager to learn! Thanks in advance for your insights!

X<> c
-Joe-
Visit this user's website Find all posts by this user
Quote this message in a reply
09-30-2017, 09:31 PM (This post was last modified: 09-30-2017 09:34 PM by Gerald H.)
Post: #11
RE: 50g Mini-Challenge: Number of positive divisors of x!
The User programme below took

14.9141s

to process 320.

To question 1, ^RMULText etc is just habit, to q 2 I have surely less idea than you.

How great a speed improvement occurs is variable, loops do seem to go much faster in Sys.

Code:
« 1 SWAP 0
  WHILE NEXTPRIME
DUP2 ≥
  REPEAT DUP2 0
OVER 4 ROLL 4 ROLL
    WHILE DUP2
IQUOT DUP 0 >
    REPEAT 5 ROLL +
4 ROLLD PICK3 *
    END 4 DROPN 1 +
4 ROLL * UNROT
  END DROP2
»
Find all posts by this user
Quote this message in a reply
09-30-2017, 09:52 PM
Post: #12
RE: 50g Mini-Challenge: Number of positive divisors of x!
So for now I wrote my own in URPL, 15 takes 1.7614 seconds, 100 needs 15,2s and 320 nearly 75 seconds. seems I need to improve my code a bit
Code:
 %%HP: T(3)A(R)F(.);
\<< 1 OVER NDUPN \->LIST \-> L
  \<< 2
    FOR K K FACTORS REVLIST OBJ\-> 2 / 1 SWAP
      START L OVER GET ROT + L UNROT PUT 'L' STO
      NEXT -1
    STEP L DUP \PILIST
  \>>
\>>
moving the list to memory and back is the main slowdown I think.
Arno
Find all posts by this user
Quote this message in a reply
09-30-2017, 10:52 PM
Post: #13
RE: 50g Mini-Challenge: Number of positive divisors of x!
(09-30-2017 08:17 PM)Joe Horn Wrote:  (2) Your second System RPL program evaluates an input of 100 in half a second, so when I keyed up the same program in User RPL (exactly the same, step-by-step, just in URPL), I expected it to run slower, but it ran MUCH slower than I anticipated. URPL is always slower than SysRPL, but this difference seems crazy:

320 SysRPL --> 1.5 seconds
320 URPL --> 18.6 seconds!

Why is this SysRPL program so much faster? Is it simply because SysRPL is always faster, especially when looping is involved?

In addition to the inherent loop structure speedup that SysRPL gives, I think if you look at what's inside those loops you'll see another reason for the speed advantage that Gerald's SysRPL version has: lots of "combo" commands that directly manipulate the stack without having to validate the arguments given. 5ROLL, 4UNROLL, 2DROP, etc. go about their business without bothering to check that the stack can accommodate their action or that the numeric constant applies. So in many of the "equivalent" UserRPL steps, you've not only got the usual overhead for argument count/type checking (and LASTARG/cached stack saving), you've also got additional RPL stream maintenance and stack validation occurring that aren't present when executing the SysRPL code.

Those combo commands for manipulating the stack can be a pain to remember in SysRPL but they really pay off sometimes in inner loops.
Find all posts by this user
Quote this message in a reply
10-01-2017, 02:35 AM (This post was last modified: 10-01-2017 02:37 AM by Paul Dale.)
Post: #14
RE: 50g Mini-Challenge: Number of positive divisors of x!
This one is cheating but it is very very fast:

<< [ 1 2 4 8 16 30 60 96 160 270 540 792 1584 2592 4032 5376 10752 14688 29376 41040 60800 96000 192000 242880 340032 532224 677376 917280 1834560 2332800 4665600 5529600 7864320 12165120 16422912 19595520 39191040 60466176 85100544 102435840 204871680 258048000 516096000 677376000 819624960 1258709760 2517419520 2876670720 3698576640 4464046080 6210846720 8087040000 16174080000 18559756800 23984916480 28217548800 39016857600 59609088000 119218176000 137106432000 274212864000 418535424000 492139929600 543050956800 695105224704 850195906560 1700391813120 2190889451520 3012472995840 3543845437440 7087690874880 7848891187200 15697782374400 23878316851200 ] SWAP GET >>

And it works on a 28S. The next value in the sequence cannot be represented exactly, so I stopped.


Pauli
Find all posts by this user
Quote this message in a reply
10-01-2017, 02:48 AM
Post: #15
RE: 50g Mini-Challenge: Number of positive divisors of x!
(09-30-2017 09:31 PM)Gerald H Wrote:  The User programme below took 14.9141s to process 320.

Code:
« 1 SWAP 0
  WHILE NEXTPRIME
DUP2 ≥
  REPEAT DUP2 0
OVER 4 ROLL 4 ROLL
    WHILE DUP2
IQUOT DUP 0 >
    REPEAT 5 ROLL +
4 ROLLD PICK3 *
    END 4 DROPN 1 +
4 ROLL * UNROT
  END DROP2
»

Exactly the same program* on my 50g takes 18.6 seconds for an input of 320, so your 50g must be faster than mine. *"Exactly the same" ... well almost; I always use reals for the arguments to stack commands (e.g. 4. ROLL instead of 4 ROLL) because it's a tad faster that way. With integers in their place, it takes 0.3 seconds longer for an input of 320.

(09-30-2017 10:52 PM)DavidM Wrote:  In addition to the inherent loop structure speedup that SysRPL gives, I think if you look at what's inside those loops you'll see another reason for the speed advantage that Gerald's SysRPL version has: lots of "combo" commands that directly manipulate the stack without having to validate the arguments given. 5ROLL, 4UNROLL, 2DROP, etc. go about their business without bothering to check that the stack can accommodate their action or that the numeric constant applies. So in many of the "equivalent" UserRPL steps, you've not only got the usual overhead for argument count/type checking (and LASTARG/cached stack saving), you've also got additional RPL stream maintenance and stack validation occurring that aren't present when executing the SysRPL code.

Those combo commands for manipulating the stack can be a pain to remember in SysRPL but they really pay off sometimes in inner loops.

Good point. In his System RPL book, Jim Donnelly wrote, "The System-RPL example will run faster than the User-RPL program, because all the argument checking code has been bypassed." Nice to know that the speed increase can be this dramatic!

X<> c
-Joe-
Visit this user's website Find all posts by this user
Quote this message in a reply
10-01-2017, 05:09 AM
Post: #16
RE: 50g Mini-Challenge: Number of positive divisors of x!
(10-01-2017 02:48 AM)Joe Horn Wrote:  Exactly the same program* on my 50g takes 18.6 seconds for an input of 320, so your 50g must be faster than mine. *"Exactly the same" ... well almost; I always use reals for the arguments to stack commands (e.g. 4. ROLL instead of 4 ROLL) because it's a tad faster that way. With integers in their place, it takes 0.3 seconds longer for an input of 320.

Yes, down from 14.9s to 14.6s with real arguments for the stack commands
Find all posts by this user
Quote this message in a reply
10-01-2017, 02:09 PM
Post: #17
RE: 50g Mini-Challenge: Number of positive divisors of x!
(09-30-2017 10:52 PM)DavidM Wrote:  In addition to the inherent loop structure speedup that SysRPL gives, I think if you look at what's inside those loops you'll see another reason for the speed advantage that Gerald's SysRPL version has: lots of "combo" commands that directly manipulate the stack without having to validate the arguments given. 5ROLL, 4UNROLL, 2DROP, etc. go about their business without bothering to check that the stack can accommodate their action or that the numeric constant applies. So in many of the "equivalent" UserRPL steps, you've not only got the usual overhead for argument count/type checking (and LASTARG/cached stack saving), you've also got additional RPL stream maintenance and stack validation occurring that aren't present when executing the SysRPL code.

Those combo commands for manipulating the stack can be a pain to remember in SysRPL but they really pay off sometimes in inner loops.

A related issue is that User RPL stack commands that take a numeric argument (PICK, ROLL, etc) take about 7 times as long as simple stack commands such as SWAP or ROT. I go to great lengths to avoid using those slow commands inside loops.

Also, WHILE and UNTIL loops are slower than START loops in User RPL whereas WHILE is very fast in SysRPL.

John
Find all posts by this user
Quote this message in a reply
10-01-2017, 03:47 PM
Post: #18
RE: 50g Mini-Challenge: Number of positive divisors of x!
(10-01-2017 02:09 PM)John Keith Wrote:  A related issue is that User RPL stack commands that take a numeric argument (PICK, ROLL, etc) take about 7 times as long as simple stack commands such as SWAP or ROT. I go to great lengths to avoid using those slow commands inside loops.

Also, WHILE and UNTIL loops are slower than START loops in User RPL whereas WHILE is very fast in SysRPL.

John

John it seems that you have some knowledge of speed of some commands (considering also the other thread where you today wrote that CEIL is faster than 1 + IP ). By chance do you have some measurements in time for each command tested by you?

Wikis are great, Contribute :)
Find all posts by this user
Quote this message in a reply
10-02-2017, 05:30 AM
Post: #19
RE: 50g Mini-Challenge: Number of positive divisors of x!
(10-01-2017 03:47 PM)pier4r Wrote:  John it seems that you have some knowledge of speed of some commands (considering also the other thread where you today wrote that CEIL is faster than 1 + IP ). By chance do you have some measurements in time for each command tested by you?

Note that you can determine such timings yourself using the 50g's TEVAL command. For example, put << 1. 1000. START 1.5 DROP NEXT >> on the stack and execute TEVAL to time it. Now insert either CEIL or 1. + IP between the 1.5 and the DROP, and use TEVAL to time each. Divide the times by 1000 to see the difference. I keep << -40 CF MEM DROP 0.5 WAIT TEVAL -40 SF >> assigned to a key for accurate timings.

X<> c
-Joe-
Visit this user's website Find all posts by this user
Quote this message in a reply
10-02-2017, 06:43 AM
Post: #20
RE: 50g Mini-Challenge: Number of positive divisors of x!
(10-02-2017 05:30 AM)Joe Horn Wrote:  
(10-01-2017 03:47 PM)pier4r Wrote:  John it seems that you have some knowledge of speed of some commands (considering also the other thread where you today wrote that CEIL is faster than 1 + IP ). By chance do you have some measurements in time for each command tested by you?

Note that you can determine such timings yourself using the 50g's TEVAL command. For example, put << 1. 1000. START 1.5 DROP NEXT >> on the stack and execute TEVAL to time it. Now insert either CEIL or 1. + IP between the 1.5 and the DROP, and use TEVAL to time each. Divide the times by 1000 to see the difference. I keep << -40 CF MEM DROP 0.5 WAIT TEVAL -40 SF >> assigned to a key for accurate timings.

Why do you have MEM in the programme? TEVAL does garbage collection.
What's the point of 0.5 WAIT?
Find all posts by this user
Quote this message in a reply
Post Reply 




User(s) browsing this thread: 2 Guest(s)