Re: HHC 2012 RPN Programming Contest Message #65 Posted by Paul Dale on 23 Sept 2012, 5:16 p.m., in response to message #1 by gene wright
Interesting problem. Speed and step count. The WP-34S will be hard to beat here unless there are some long constants involved in the solution. The only faster RPN device is the 15c LE and no other has such a rich suite of programmer friendly instructions.
My first thought is to calculate the distance from the points to an axis, figure the distance between the axis points out and sum these. Can we do better? Unsurprisingly, yes. Instead calculate the distance to the origin and subtract.
Each diagonal consists of a set of points (x, y) where x+y is constant. We want f(x,y) the distance from the origin of the point (x, y). Moving the (x, y) point down the diagonal back to the x axis takes y steps. Thus:
f(x, y) = f(x+y, 0) + y.
Now to work out f(n, 0). It is fairly easy to show that this is the nth triangular number. The first few values strongly indicate this:
n f(n, 0)
0 0
1 1
2 3
3 6
4 10
5 15
The formula for f(n, 0) is:
f(n, 0) = n (n+1) / 2 = (n^2 + n) / 2 = Combinations(n+1, 2)
This explains the maximum limit on the coordinates that was given. The limit squared doesn't overflow. Nice, we're likely on the right track.
The direct formula for f(x, y) is
f(x, y) = (x+y) (x+y+1) / 2 + y = ((x+y)^2 + (x+y) + 2y) / 2 = Combinations(x+y+1, 2)
We need to produce:
f(T, Z) - f(Y, X)
A subroutine that calculates the value of f(x, y) without disturbing the stack is required. I don't see how to use the final Combinations(n+1, 2) without stack destruction, plus this function is very slow on the 34S and a solution using this doesn't execute instantly -- still fast though. Instead we'll use the (n^2 + n) / 2 form.
The first quick program:
01 LBL A y2 x2 y1 x1
02 XEQ 00 f(x2 y2)
03 [<->] YZXT y1 x1 f(x2 y2)
04 XEQ 00 f(x1 y1) f(x2 y2)
05 - our answer
06 RTN
07 LBL 00 x y z t
08 x[<->] Y x y z t
09 RCL+ Y x+y y z t
10 x[^2] (x+y)^2 y z t
11 RCL+ L (x+y)^2+(x+y) y z t
12 RCL+ Y (x+y)^2+(x+y)+y y z t
13 + (x+y)^2+(x+y)+2y z t t
14 # 002 2 sum z t
15 / ((x+y)^2+(x+y)+2y)/2 z t t
Can we do shorter and faster? Of course. Well faster is difficult, the program executes essentially instantly even in SLOW mode (which isn't the default). Factor out the "2 /" so they are only executed once only and change the divide to a multiply will save some milliseconds but that is about it without profiling individual instructions. Remove the label A since the given example requires R/S to start execution. This means a GTO . 000 is required before execution but that's liveable. The 41 series might have problems getting repeatability here with its weird RTN behaviour.
Replace the x[<->] Y in the subroutine with a [<->] YXTZ at the start of the program -- this doesn't save any instructions but executes one less. This will definitely not be noticeable.
01 [<->] YXTZ
02 XEQ 00
03 [<->] YZXT
04 XEQ 00
05 -
06 # 1/2
07 [times]
08 RTN
09 LBL 00
10 RCL+ Y
11 x[^2]
12 RCL+ L
13 RCL+ Y
14 +
Timing this using TICKS generally gives zero as the result. I.e. a tenth of a second or less. Step count becomes important since it is the final decision criteria. How to reduce steps further?
Pity Walter insisted the BSRF instruction was removed from the 34S or we could have saved another step. Hang on, the instruction is there so the assembler can be tricked into producing it for us, probably against the rules or at least the spirit of them. I've got a 34S running older firmware somewhere, I knew I kept it for a reason :-) Yes, found it. Drat, it doesn't have the 1/2 constant. Oh well, speed really isn't important here. A version that works on that calculator saving one more step:
01 [<->] YXTZ
02 BSRF 006
03 [<->] YZXT
04 BSRF 004
05 -
06 # 002
07 /
08 RTN
09 RCL+ Y
10 x[^2]
11 RCL+ L
12 RCL+ Y
13 +
Thirteen steps, zero execution time.
As a complete aside and mostly for the record, a version using the combinations form is possible:
01 [<->] YXTZ
02 [cmplx]z[<->] J
03 XEQ 00
04 [cmplx]RCL J
05 XEQ 00
06 -
07 RTN
08 LBL 00
09 RCL+ Y
10 INC X
11 # 002
12 COMB
13 +
Again, on the older device, one step can be spared:
01 [<->] YXTZ
02 [cmplx]z[<->] J
03 BSRF 004
04 [cmplx]RCL J
05 BSRF 002
06 -
07 RTN
08 RCL+ Y
09 INC X
10 # 002
11 COMB
12 +
Unfortunately, these programs take 8 - 10 TICKS to execute (i.e. about a second). To my mind, this doesn't qualify as nearly identical but if it does, one fewer step is possible. They also don't deal with the origin as one of the points, so these two aren't worthy of further consideration.
Finally, add one step to all of these if the mandatory END is included -- this isn't really an instruction, the firmware fakes it and it doesn't occupy a step of RAM if it is the final END in RAM.
- Pauli
Edited: 23 Sept 2012, 6:08 p.m. after one or more responses were posted
|