Post Reply 
(34S) Prime Factors
02-23-2015, 08:02 PM (This post was last modified: 02-23-2015 08:22 PM by Dirk E..)
Post: #13
RE: (WP-34S) Prime Factors - performance challenge
With the subroutine-driven MOD30 loop shown here I felt challenged to find some optimal between speed, resources used and rubbish left on the SBR-stack, but it's quite hard to decide...

From looking through the 'PF' in the library I decided to remain with the RPN stack registers. Since the WP 34S offers so much enhanced TEST commands, you just have to remember, when the stack lifts or drops and when not. It reminded me a little an ancient predecessor of the Rubik's cube called "Schiebefax" in german - 4x4 squares less one in a square frame which were to be put into a certain order.

Here's the very shortest (but also slowest) I found:
Code:
WP 34S (3.3 3678)   13.Feb.2015
 Example: 7 x 38447 x 62119 - 1.-2.Factor 1'35.3"

Find prime factors for (unknown)
- Version using stack (X, Y, L) only
- the shortest but slowest - trying every number from 2

usage: (unknown) XEQ 'Pr' - R/S [- R/S] ... '1'
- Registers:
RPN-stack

--------------------------------------------------------------------------------
001  LBL 'Pr'                             ; ( L );  X ,  Y
002  2          ;for even test            ; (   );  2 , 'X'
003  x><y                                 ; (   ); 'X', 'T'
004  SKIP 002
             ;main loop: 6(7) lines per 1(1) candidates
             ;i.e. 180(210) per 30(30)
005  INC Y      ;next candidate           ; ('X'); 'X/T0', 'T1'
006  x>< L                                ; (   ); 'X', 'T'
007  RCL/ Y                               ; ('X'); 'X/T', 'T'
008   x<y?      ;'X/T'<'T' i.e. 'T'^2>'X'
009   SKIP 005  ; -> end
010  FP?        ;remainder != 0 ?
011  BACK 006   ; -> next test factor

012  VIEW Y
013  STOP       ;show factor
014  BACK 007

015  x>< L                                ; (   ); 'X', 'T'
016  STOP
017  RCL/ X                               ; ('X');  1 , 'T'
018  END

Adding one line to the loop and a few for filtering out the even the required time lowers by a third:
Code:
WP 34S (3.3 3678)   13.Feb.2015
 Example: 7 x 38447 x 62119 - 1.-2.Factor 53.2"

Find prime factors for (unknown)
- Version using stack (X, Y, L) only, testing 2 and every odd number

usage: (unknown) XEQ 'Pr' - R/S [- R/S] ... '1'
- Registers:
RPN-stack

--------------------------------------------------------------------------------
001  LBL 'Pr'                             ; ( L );  X ,  Y
002  2          ;for even test            ; (   );  2 , 'X'

003  x><y                                 ; (   ); 'X', 'T'
004  RCL/ Y                               ; ('X'); 'X/T', 'T'
005  FP?
006  SKIP 004   ;remainder != 0 -> try '3'
007  VIEW Y
008  STOP       ;show factor 2
009  BACK 005   ;try factor again
             ;main loop: 7(8) lines per 1(2) candidates
             ;i.e. 105(120) per 15(30)
010  INC Y                                ; ('X'); 'X/T0', 'Tx'
011  INC Y      ;next odd candidate       ; ('X'); 'X/T0', 'T1'
012  x>< L                                ; (   ); 'X', 'T'
013  RCL/ Y                               ; ('X'); 'X/T', 'T'
014   x<y?      ;'X/T'<'T' i.e. 'T'^2>'X'
015   SKIP 005  ; -> end
016  FP?        ;remainder != 0 ?
017  BACK 007   ; -> next test factor

018  VIEW Y
019  STOP       ;show factor 2n+1
020  BACK 007

021  x>< L                                ; (   ); 'X', 'T'
022  STOP
023  RCL/ X                               ; ('X');  1 , 'T'
024  END

Still using only x, y and L, but leaving out divisions by multiples of 3, even more speed is gained, but on the cost of almost doubled line count:
Code:
WP 34S (3.3 3678)   13.Feb.2015
 Example: 7 x 38447 x 62119 - 1.-2.Factor 34"

Find prime factors for (unknown)
- Version using stack (X, Y, L) only, trying 2, 3, excluding their multiples

usage: (unknown) XEQ 'Prm' - R/S [- R/S] ... '1'
- Registers:
RPN-stack

--------------------------------------------------------------------------------
001  LBL 'Prm'                            ; ( L );  X ,  Y 
002  2          ;for even test            ; (   );  2 , 'X'

003  x><y                                 ; (   ); 'X', 'T'

004  RCL/ Y                               ; ('X'); 'X/T', 'T'
005  FP?
006  SKIP 003   ;remainder != 0
007  VIEW Y
008  STOP       ;show factor
009  BACK 005   ;try factor again

010  CLx                                  ; ('X');  0 , 'T'
011  3                                    ; ('X');  3 , 'T'
012  x=? Y
013  SKIP 003
014  x><y                                 ; ('X'); 'T',  3
015  x>< L                                ; ('T'); 'X',  3
016  BACK 012

017  STO/ Y                               ; ('X');  3 ,  1
                ;main Loop: 13(15) lines for 2(6) candidates
                ;i.e. 65(75) for 10(30)
018  CLx                                  ; ('X');  0 , 'T'
019  4                                    ; ('X');  4 , 'T0'
020  STO+ Y                               ; ('X');  4 , 'T1'
021  x>< L                                ; ( 4 ); 'X', 'T'
022  RCL/ Y                               ; ('X'); 'X/T', 'T'
023  INT?
024  SKIP 011

025  INC Y                                ; ('X');    , 'Tx'
026  INC Y                                ; ('X');    , 'T1'
027  x>< L                                ; (   ); 'X', 'T'
028  RCL/ Y                               ; ('X'); 'X/T', 'T'
029   x<? Y     ; 'X/T'<'T' i.e. 'T'^2>'X'
030   SKIP 008  ; goto End
031  FP?
032  BACK 014

033  VIEW Y
034  STOP       ;show factor
035  BACK 007

036  VIEW Y
037  STOP       ;show factor
038  BACK 016

039  x>< L                                ; ('X/T'); 'X', 'T'
040  STOP
041  RCL/ X                               ; ('X');  1 , 'T'
042  END

Now I became curious whether the MOD 30 algorithm, i.e. leaving out the multiples of 5, too, would be worth the effort. Formerly I was always convinced, the overhead to organize the loop would eat up the speed. Now I tried - and found the way that Dave presented here is definitely the fastest.

Here it is, adapted to use the RPN stack x, y, z, t and L only. Since for a faster end condition test the root of the 'unknown' is also maintained and integer division employed, it uses a few lines and registers more:
Code:
WP 34S (3.3 3678)   17.Feb.2015
 Example: 7 x 38447 x 62119 - 1.-2.Factor 28.2"

Find prime factors for (unknown)
- Version using stack (X, Y, Z, T, L) only,
 trying mod 30 chain via a subroutine, holding sqrt(x) for end condition

usage: (unknown) XEQ 'Pr' - R/S [- R/S] ... '1'
- Registers: RPN-stack (ssize4 dbloff recommended)

--------------------------------------------------------------------------------
001  LBL 'Pr'                             ; ( L );  X ,  Y ,  Z ,  T
002  SKIP 021

003  LBL 10                               ; ('T0'); 'S', 'X', 'X', [sqrt('X')]
004  RCL+ L                               ; ('S'); 'T1', 'X', 'X', [sqrt('X')]
005  RMDR                                 ; ('T');  ? , 'X', [sqrt('X')]
006  x=0?      ;factor found?
007  SKIP 003
008  CLx                                  ; ('T');  0 , 'X', [sqrt('X')]
009  RCL Y     ;duplicate 'X'             ; ('T'); 'X', 'X', [sqrt('X')]
010  RTN       ;-> try next
011  x>< L                                ; ( 0 ); 'T', 'X0', [sqrt('X')]
012  STO/ Y                               ; (   ); 'T', 'X1', [sqrt('X0')]
013  STOP      ;show factor
014  XEQ 11                               ; (   ); 'T', 'X', 'X', [sqrt('X')]
015  BACK 010  ;try factor again

016  LBL 11
017  x>< Z                                ; (   ); [sqrt('X0')], 'X1', 'T'
018  RCL Y                                ; (   ); 'X1', [sqrt('X0')], 'X1', 'T'
019  STO Y     ;duplicate 'X'             ; (   ); 'X1', 'X1, 'X1', 'T'
020  SQRT()                               ; (   ); sqrt('X'), 'X', 'X', 'T'
021  IP        ;end condition value     ;(sqrt('X')); [sqrt('X')], 'X', 'X', 'T'
022  x>< T                                ; (   );, 'T', 'X', 'X', [sqrt('X')]
023  RTN

024  RCL X                                ; (   ); 'X', 'X'
025  1                                    ; (   );  1 , 'X', 'X'
026  XEQ 11                               ; (   );  1 , 'X', 'X', [sqrt('X')]
027  STO L                                ; ( 1 );  1 , 'X', 'X', [sqrt('X')]
028  XEQ 10
029  1                                    ; ('T');  1 , 'X', 'X', [sqrt('X')]
030  XEQ 10
031  2                                    ; ('T');  2 , 'X', 'X', [sqrt('X')]
032  XEQ 10
033  2                                    ; ('T');  2 , 'X', 'X', [sqrt('X')]
034  XEQ 10                               ; ('T'); 'X', 'X', [sqrt('X')]
035  ENTER^                               ; ('T'); 'X', 'X', 'X', [sqrt('X')]

          ;main loop 8x 8(10) + 4 lines
          ;i.e. 68 (84 w.'LBL'+skipped) for 8 ("10"/30) tests
036  CLx                                  ; ('T');  0 , 'X', 'X', [sqrt('X')]
037  4                                    ; ('T');  4 , 'X', 'X', [sqrt('X')]
038  XEQ 10
039  2                                    ; ('T');  2 , 'X', 'X', [sqrt('X')]
040  XEQ 10
041  4                                    ; ('T');  4 , 'X', 'X', [sqrt('X')]
042  XEQ 10
043  2                                    ; ('T');  2 , 'X', 'X', [sqrt('X')]
044  XEQ 10
045  4                                    ; ('T');  4 , 'X', 'X', [sqrt('X')]
046  XEQ 10
047  6                                    ; ('T');  6 , 'X', 'X', [sqrt('X')]
048  XEQ 10
049  2                                    ; ('T');  2 , 'X', 'X', [sqrt('X')]
050  XEQ 10
051  6                                    ; ('T');  6 , 'X', 'X', [sqrt('X')]
052  XEQ 10                               ; ('T'); 'X', 'X', [sqrt('X')]
053  RCL L                                ; ('T'); 'T', 'X', 'X', [sqrt('X')]
054  x<? T     ;'X'>'T'^2 -> continue
055  BACK 019

056  x>< Y                                ; ('T'); 'X', 'T', 'X', [sqrt('X')]
057  STOP
058  RCL/ X                               ; ('X');  1 , 'T', 'X', [sqrt('X')]
059  END

The time used by the equivalent routine, using global registers instead, is the same.

While testing the speed I usually terminated after the second factor and started over from the beginning - soon I found out that the WP 34S keeps it's subroutine stack, differently from HP 35s and HP 42s that purge it upon a manual XEQ, GTO or OFF.
Thus I thought it would be preferrable to avoid subroutines during which the program could be interrupted and terminated - I tried to use indirect addressing for a very short subroutine returning just the step between the division 'candidates' 1 - 2 - 2 - 4 - 2 - 4 - 2 - 4 - 6 - 2 - 6 (then continuing repeating the 8 elements starting from the first '4'). And I tried another way doing just an indirect branch, since the step value would be needed always at the same position in the main loop. Both were much slower than the following MOD 6 approach, that saves some time by employing integer arithmetics and keeping the root for the end test:
Code:
WP 34S (3.3 3678)   17.Feb.2015
 Examples: 7 x 38447 x 62119 - 1.-2.Factor 29.5"
   or 33013x33013 1.Factor 25.7 s

Find prime factors for (unknown)
- Version holding root(x) for end condition
 using stack (X, Y, Z, T, L) only

usage: (unknown) XEQ 'Prm' - R/S [- R/S] ... '1'
- Registers: RPN-stack (ssize4 dbloff recommended)

--------------------------------------------------------------------------------
001  LBL 'Prm'
002  RCL X     ;push stack up       ; (   ), 'X', 'X'
003  2         ;test-factor 'T'     ; (   ), 'T', 'X', 'X' 

004  XEQ 01                         ; (   ), 'T', 'x', 'x', sqrt('x')
005  RMDR                           ; ('T'), ? , 'X', sqrt('X')
006  x!=0?
007  SKIP 012
008  x>< L                          ; ( 0 ), 'T', 'X0', sqrt('x0")
009  STO/ Y                         ; (   ), 'T', 'x1', sqrt('x1')
010  STOP      ;show factor 2 or 3
011  BACK 007

012  LBL 01                         ; (   ), 'T', 'X1', sqrt('X0')
013  x>< Z                          ; (   ), sqrt('X0'), 'X1', 'T'
014  RCL Y                          ; (   ), 'X1', sqrt('X0'), 'X1', 'T'
015  STO Y                          ; (   ), 'X1', 'X1', 'X1', 'T'
016  SQRT()                         ; ('x1'), sqrt('x1'), 'x1', 'x1', 'T'
017  IP                             ; (sqrt'x1'), [sqrt('x1')], 'x1', 'x1', 'T'
018  x>< T                          ; (   ), 'T', 'x1', 'x1', [sqrt('x1')]
019  RTN

020  CLx                            ; ('T'), 0 , 'X', sqrt('X')
021  3                              ; ('T0'), 3 , 'X', sqrt('X')
022  x!=? L
023  BACK 019
024  STO/ L                         ; ( 1 ), 3 , 'X', sqrt('X')
               ;main Loop: 14(16) lines for 2(6) candidates
               ;i.e. 70(80) for 10(30)
025  CLx                            ; ('T'), 0 , 'X', sqrt('X')
026  RCL Y     ;duplicate           ; ('T'), 'X', 'X', sqrt('X')
027  4                              ; ('T0'),  4 , 'X', 'X', sqrt('X')
028  RCL+ L    ;test-factor         ; ( 4 ), 'T1', 'X', 'X', sqrt('X')

029  RMDR                           ; ('T'), ? , 'x', sqrt('x')
030  x=0?
031  SKIP 014

032  CLx                            ; ('T'), 0 , 'x', sqrt('x')
033  RCL Y     ;duplicate           ; ('T'), 'x', 'x', sqrt('x')
034  2                              ; ('T0'),  2 , 'x', 'x', sqrt('x')
035  RCL+ L    ;test-factor         ; ( 2 ), 'T1', 'X', 'X', sqrt('X')

036   x>? T     ;'T'>sqrt('X')?
037   SKIP 013  ;-> End
038  RMDR                           ; ('T'), ? , 'x', sqrt('x'), sqrt('x')
039  x!=0?
040  BACK 015
041  x>< L                          ; ( 0 ), 'T', 'X0', sqrt('X0')
042  STO/ Y                         ; ( 0 ), 'T', 'X1', sqrt('X0')
043  STOP      ;factor 6n-1
044  XEC 01                         ; (   ), 'T', 'x1', 'x1', sqrt('x1')
045  BACK 009

046  x>< L                          ; ( 0 ), 'T', 'X0', sqrt('X0')
047  STO/ Y                         ; ( 0 ), 'T', 'X1', sqrt('X0')
048  STOP      ;factor 6n+1
049  XEQ 01                         ; (   ), 'T', 'x1', 'x1', sqrt('x1')
050  BACK 021

051  x><y      ;END                 ; (   ), 'X', 'T', 'X', sqrt('X')
052  STOP
053  RCL/ X                         ; ('X'),  1 , 'T', 'X', sqrt('X')
054  END

This is almost as fast as the MOD 30 above and even 5 lines shorter. It also demonstrates that despite the main loop running one line more than the MOD 6 program with RPN stack x, y, L only cited above, the integer arithmetics RMDR and comparison with an integer gain a measurable amount of time.

For the use of indirectly addressed local registers can be reviewed here, I'd like to show here my experiment using another indirect addressing for the factor chain step.
I found it not too clear in the manual, but I understand that the global registers are indirect 00-99, the 12 stack and helper registers indirect 100-111, and local registers starting .00 become indirect 112...
Here it is, the local registers are released before showing a factor that was found, then re-allocated; anyway, if the program is interrupted, they'd need to be freed manually:
Code:
WP 34S (3.3 3678)   8.Feb.2015
 Example: 7 x 38447 x 62119 - 1.-2.Factor 32"

Find prime factors for (unknown)
- Version using indirect local registers for MOD 30 chain
 and holding root(x) for end condition

usage: (unknown) XEQ 'PR' - R/S [- R/S] ... '1'
- Registers:
00 - unknown 'X', 01 - tester 'T', 02 - sqrt('X'), 03 - pointer
RPN-stack, Local registers .00-.10 - factor sequence steps

--------------------------------------------------------------------------------
001  LBL 'PR'
002  STO 00
003  # 111      ;loop end value
004  STO 03     ;counter/pointer
005  3
006  10^x
007  STO/ 03
008  # 122      ;loop start value
009  STO+ 03
010  2
011  STO 01     ;first test candidate
012  SKIP 004

013  RCL 01
014  PopLR      ;free local registers
015  STO/ 00    ;leave remaining nnn
016  STOP       ;show found factor
017  RCL 00
018  SQRT()
019  IP
020  STO 02     ;root(nnn) for end check
021  LocR 011   ;build factor sequence steps
022  1
023  STO .10
024  4
025  STO .07
026  STO .05
027  STO .03
028  2
029  STO .09
030  STO .08
031  STO .06
032  STO .04
033  STO .01
034  6
035  STO .02
036  STO .00

037  RCL 00     ;main loop 7x8(9)+12(14) lines
038  RCL 01     ;i.e. 68 for 8(10/30) candidates
039  RMDR
040  x=0?
041  BACK 028   ;factor found
042  RCL ->03
043  STO+ 01    ;next candidate
044  DSE 03
045  BACK 008
046  8
047  STO+ 03
048  RCL 01
049  x<=? 02    ;factor<=squrt(nnn)?
050  BACK 013   ;-> continue

051  RCL 00     ;last Factor 'X'
052  BACK 038
053  END

Maybe, this reply is a bit late, but I received my WP 34S only these weeks, and it was also my first program to do my own prime factorization - thank you for the interesting challenge with the 'PF' and the MOD 30 extention!
Hopefully someone finds something interesting in it, too!


On my somewhat slower android mobile on free42 I started exploring the WP34S's integer space and factorized 2^50+3 = 7 x 3 435 997 x 46 811 113.
This ran about 3 minutes(!) till the second factor on the mobile, while the test 7 x 38447 x 62119 ran rather 3 seconds on it. Thus I'd be a little afraid running out of batteries or patience factorizing any double precision integer on the WP 34S.
The 'PF' from the library ran more than 5 minutes factorizing 33013x33013 when I lost patience and stopped it before it found the first factor...

Cheers!
Dirk
Find all posts by this user
Quote this message in a reply
Post Reply 


Messages In This Thread
(34S) Prime Factors - Dave Britten - 05-30-2014, 01:53 AM
RE: (WP-34S) Prime Factors - Thomas Klemm - 05-30-2014, 03:30 AM
RE: (WP-34S) Prime Factors - Dave Britten - 05-30-2014, 03:54 AM
RE: (WP-34S) Prime Factors - Thomas Klemm - 05-30-2014, 08:24 AM
RE: (WP-34S) Prime Factors - Thomas Klemm - 05-30-2014, 07:06 PM
RE: (WP-34S) Prime Factors - Dave Britten - 05-30-2014, 08:07 PM
RE: (WP-34S) Prime Factors - Thomas Klemm - 05-30-2014, 08:58 PM
RE: (WP-34S) Prime Factors - Dave Britten - 05-30-2014, 09:42 PM
RE: (WP-34S) Prime Factors - Thomas Klemm - 05-30-2014, 10:57 PM
RE: (WP-34S) Prime Factors - Dave Britten - 05-30-2014, 11:58 PM
RE: (WP-34S) Prime Factors - Dave Britten - 05-31-2014, 01:09 AM
RE: (WP-34S) Prime Factors - Dave Britten - 06-01-2014, 06:11 PM
RE: (WP-34S) Prime Factors - performance challenge - Dirk E. - 02-23-2015 08:02 PM



User(s) browsing this thread: 1 Guest(s)