Threaded Mode | Linear Mode

Joe Horn · (This post was last modified: 07-09-2018 01:29 PM by Joe Horn.)

All the HILBERT programs for the HP 48 series which I've seen, and the HILBERT function in the HP 49/50 series, use nested FOR loops to generate every element of a Hilbert matrix. This is slow and inefficient, since a much faster method exists which uses no nested loops.

Your Mini-Challenge, should you choose to accept it, is to write a better, faster HILBERT in User RPL. It should take any whole number n as its input and return an n×n Hilbert matrix much faster than the brute-force nested-loop method.

HP 49/50 users: Bonus points if your program returns a real matrix in approximate mode and a symbolic matrix in exact mode, as the HILBERT function does.

Reality check: The HP 50g's HILBERT function takes about 90 seconds to create a 50×50 Hilbert matrix of reals. A User-RPL HILBERT program can do the same job over 100 times faster (less than one second). My current attempt does so, and is 114.5 bytes long.

Disclaimer: This is a mini-challenge rather than simply an entry in the software library because nobody needs fast Hilbert matrices. Figuring out the faster method and then coding it up is presented here merely as a fun challenge.

Thomas Okken · (This post was last modified: 07-09-2018 04:21 PM by Thomas Okken.)

I'm guessing the point of the exercise is to avoid unnecessary calculations. This will do so, at the cost of putting n^2 objects on the stack:

Code:

<< DUP 1 - -> n n1

   << 1 n FOR i i INV NEXT

      n 1 + n n1 + FOR i n1 DUPN i INV NEXT

      n DUP 2 ->LIST ->ARRY

   >>

>>

119.5 bytes.
I don't have a real RPL calculator so I can't really time this, unfortunately.

**rprosperi** · 07-09-2018, 04:32 PM

(07-09-2018 03:16 PM)Thomas Okken Wrote: I'm guessing the point of the exercise is to avoid unnecessary calculations. This will do so, at the cost of putting n^2 objects on the stack:

Code:

<< DUP 1 - -> n n1 << 1 n FOR i i INV NEXT n 1 + n n1 + FOR i n1 DUPN i INV NEXT n DUP 2 ->LIST ->ARRY >> >>

119.5 bytes.
I don't have a real RPL calculator so I can't really time this, unfortunately.

On a 48GX, this clocks in at 2.741 Sec for n=50.

ttw · (This post was last modified: 07-09-2018 05:34 PM by ttw.)

I think I can speed it up so I'll post some code later. HP50g user RPL for Hilbert matrix order 50 took 52 seconds first attempt. I might bring it down to the low 40s. This was integer matrix (symbolic).

I just noticed that my next code version was essentially the same as Thomas Okken's above so I'll bow out. Tom's is very good.

Joe Horn · 07-09-2018, 05:25 PM

(07-09-2018 03:16 PM)Thomas Okken Wrote: I'm guessing the point of the exercise is to avoid unnecessary calculations.

Exactly.

(07-09-2018 03:16 PM)Thomas Okken Wrote: This will do so, at the cost of putting n^2 objects on the stack:

Code:

<< DUP 1 - -> n n1 << 1 n FOR i i INV NEXT n 1 + n n1 + FOR i n1 DUPN i INV NEXT n DUP 2 ->LIST ->ARRY >> >>

119.5 bytes.

I like your idea of pre-calculating n-1 and storing it in a local variable (n1) rather than doing "n 1 -" over and over inside the loop (which is what I did). Your program is therefore faster than mine: 50×50 in 0.71 seconds instead of my 0.8 seconds on an HP 50g in approx mode. And I was surprised that your "n DUP 2 ->LIST ->ARRY" is shorter than my "{ n n } ->ARRY". Very cool! Smile

Valentin Albillo · 07-09-2018, 09:46 PM

(07-09-2018 05:25 PM)Joe Horn Wrote:
(07-09-2018 03:16 PM)Thomas Okken Wrote: I'm guessing the point of the exercise is to avoid unnecessary calculations.

Exactly.

Ah well, I thought that the point of the execise was to avoid unnecessary loops, nested loops in particular.

This can be cleverly accomplished using vector operations but I can't elaborate the concept further as (1) I don't indulge in cryptolanguages, and (2) I don't know if the various cryptodialects do include the necessary vector operations in their respective instruction sets.

Regards.
V.
.

Thomas Okken · 07-09-2018, 11:03 PM

(07-09-2018 09:46 PM)Valentin Albillo Wrote:
(07-09-2018 05:25 PM)Joe Horn Wrote: Exactly.

Ah well, I thought that the point of the execise was to avoid unnecessary loops, nested loops in particular.

This can be cleverly accomplished using vector operations but I can't elaborate the concept further as (1) I don't indulge in cryptolanguages, and (2) I don't know if the various cryptodialects do include the necessary vector operations in their respective instruction sets.

Regards.
V.
.

I started out thinking about this problem in terms of vector operations, HP-42S style, and what I came up with was this:

1. Create a row vector containing the numbers 1 through n
2. Use the 10^X function to obtain a row vector containing the numbers 10 through 10^n
3. Calculate the outer product of said row vector and the corresponding column vector
4. Use the LOG function to obtain a matrix where each element with indices (m, n) equals m+n
5. Subtract 1 from each element
6. Take the reciprocal of each element

This is pretty compact and reasonably efficient: after creating the initial [ 1 2 ... n ] vector (step 1), all the following steps are basic 42S logic: 10^X TRANS LASTX * LOG 1 - 1/X. This works because applying 10^X, LOG, and 1/X to a vector or matrix all work element-wise, and so does the subtraction of a scalar from a matrix. The LOG and 10^X operations are efficient, too, in this case, because we're applying 10^X to integers and applying LOG to integral powers of 10.

However, the challenge that was presented here was about speed, and the above algorithm fails in that regard, just like the obvious algorithm with two nested loops, because they use O(n^2) floating-point operations.

Since the Hilbert matrix contains only 2n-1 distinct numbers, O(n^2) FLOPS is really bad, at least when the cost of a FLOP is significant compared to the cost of allocating elements.

If you have a vector-based solution that outperforms my code, please share!

Joe Horn · 07-09-2018, 11:43 PM

(07-09-2018 09:46 PM)Valentin Albillo Wrote: Ah well, I thought that the point of the execise was to avoid unnecessary loops, nested loops in particular.

Also correct. A great way to avoid unnecessary calculations is by avoiding unnecessary nested loops. Thomas' program accomplishes both.

Thomas Okken · (This post was last modified: 07-10-2018 01:32 AM by Thomas Okken.)

(07-09-2018 11:43 PM)Joe Horn Wrote:
(07-09-2018 09:46 PM)Valentin Albillo Wrote: Ah well, I thought that the point of the execise was to avoid unnecessary loops, nested loops in particular.

Also correct. A great way to avoid unnecessary calculations is by avoiding unnecessary nested loops. Thomas' program accomplishes both.

Yes and no. Depending on the coding style, you may have two nested loops, with n^2 iterations of the inner loop, or two sequential loops with n iterations each (as in the RPL code that we wrote), or one loop with n iterations (as in the 42S code that I described), but no matter which coding style you use, you're still creating and populating an n*n matrix, so there will always be O(n^2) operations somewhere.

What the algorithms we came up with accomplish is to keep the number of floating-point divisions to a minimum, computing only 2n-1 reciprocals, rather than n^2 reciprocals as in the nested-loop algorithm, but they still perform O(n^2) operations overall because the DUPN calls are O(n) each.

Thomas Okken · (This post was last modified: 07-10-2018 11:55 AM by Thomas Okken.)

To elaborate a bit...

A naive Java implementation would be like this:

Code:

double[][] hilbert(int n) {

    double[][] h = new double[n][n];

    for (int i = 0; i < n; i++)

        for (int j = 0; j < n; j++)

            h[i][j] = 1.0 / (i + j + 1);

    return h;

}

while the division-minimizing approach could also be coded with nested loops, like this:

Code:

double[][] hilbert(int n) {

    double[][] h = new double[n][n];

    for (int j = 0; j < n; j++)

        h[0][j] = 1.0 / (j + 1);

    for (int i = 1; i < n; i++) {

        for (int j = 0; j < n - 1; j++)

            h[i][j] = h[i - 1][j + 1];

        h[i][n - 1] = 1.0 / (i + n);

    }

    return h;

}

Werner · 07-10-2018, 07:20 AM

Hi, Joe!
Unless I'm mistaken, this was a MC (Mini Challenge) ages ago, and I have this in my 48/49 ever since:

Code:

\<<

  \-> N

  \<<

    1 N DUP2 - -

    FOR I

      I N > { N 1 - DUPN } IFT

      I INV

    NEXT

    N DUP 2 \->LIST

    \->ARRY

  \<<

\>>

98.5 bytes, # 5FDAh

Cheers, Werner

Gerald H · (This post was last modified: 07-10-2018 09:52 AM by Gerald H.)

(07-09-2018 03:16 PM)Thomas Okken Wrote: I'm guessing the point of the exercise is to avoid unnecessary calculations. This will do so, at the cost of putting n^2 objects on the stack:

Code:

<< DUP 1 - -> n n1 << 1 n FOR i i INV NEXT n 1 + n n1 + FOR i n1 DUPN i INV NEXT n DUP 2 ->LIST ->ARRY >> >>

119.5 bytes.
I don't have a real RPL calculator so I can't really time this, unfortunately.

Faster for integers on 50g if you replace

INV

by

-1 ^

but slower for reals.

John Keith · 08-05-2018, 02:05 PM

(07-09-2018 01:28 PM)Joe Horn Wrote: Disclaimer: This is a mini-challenge rather than simply an entry in the software library because nobody needs fast Hilbert matrices.

Perhaps not every day Smile

, but the same method can be used to generate any Hankel or Cauchy matrix, of which the Hilbert matrix are sub-types.

As an example, the following variation on Thomas's program returns the N x N Catalan Hankel matrix as described here.
It pre-calculates a list of Catalan numbers using an efficient recurrence.

The 50 x 50 matrix is returned in about 5.7 seconds (exact mode). Note that the resulting matrix is about 48K bytes in size.

Code:

\<< DUP 1 - DUP2 + \-> n n1 n2

  \<< 1 DUPDUP n2 2 -

    FOR n n DUP 4 * 2 + PICK3 * SWAP 2 + /

    NEXT n2 \->LIST \-> c

    \<< c 1. n SUB EVAL n DUP 1 + SWAP n1 +

      FOR i n1 DUPN c i GET

      NEXT n DUP 2. \->LIST \->ARRY

    \>>

  \>>

\>>

Of course, I am equally sure that nobody needs a 50 x 50 matrix of Catalan numbers either. Smile

Juan14 · 08-05-2018, 06:31 PM

A solution using DOSUBS:

«
→ n
«
{} 1 n 2 * 1 - FOR j j INV + NEXT
n « n →LIST » DOSUBS AXL
»
»

The first line generates a list of the inverse of numbers from 1 to 2*n-1.
The second line creates a list of lists, the first lists takes n elements starting at the first element, the second list take n elements too, but starts at the second element and so on until all the lists of n elements are exhausted, finally AXL converts the list of lists into a matrix. I can't check the running time.

John Keith · 08-05-2018, 07:40 PM

On my hp 50g:

Approx. mode: 4.09 seconds
Exact mode: 5.46 seconds.