Threaded Mode | Linear Mode

pier4r · (This post was last modified: 02-14-2019 12:57 PM by pier4r.)

Spin off of the thread "processing lists!" (somewhen there will be also processing strings)

In particular due to the observation that editing a pre allocated vector on the 50g (RPL) emulator is much faster than editing lists.

Now for the little that I know the data structure with the most expressive tools on the RPL platform are lists. Especially with the additional libraries LSORT, goferlist, listExt plus the very useful built in RPL commands (DOSUBS, DOLIST, STREAM, \GSLIST (sigma list), ETC). (if one knows more, please contribute here).

The following challenges can be extended also to other systems (prime, 71 basic, RPN models and what not) if those lack functions to achieve the wanted result.

The basic approach is to "fill the gap" when compared to list operations.

Note(s)
- array = normal vector. Row or column vector are actually matrices.
- If possible, avoid going through an operation using lists as manipulative mean.
- presume that the input has at least 100 elements (unless unlikely or otherwise stated).

#1 Sum the elements in a vector
Input: a vector.
Output: the sum of the elements in the vector.
Inspiration: sigma list or \GSLIST

Example(s):
[ 1 2 3 ] -> 6

#2 Difference between adjacent values of the vector
Input: a vector.
Output: a vector of where the element in position i is equal to the difference of inputVector[i+1] - inputVector[i]
Inspiration: delta list or \GDLIST

Example(s):
[ 1 3 5 ] -> [ 2 2 ]

#3 Product of the elements
Input: a vector.
Output: the product of the elements in the vector
Inspiration: pi list or \GPLIST

Example(s):
[ 1 3 5 ] -> 15

#4 reverse the elements of a vector
Input: a vector.
Output: the input vector with the elements reversed
Inspiration: REVLIST

Example(s):
[ 1 3 5 ] -> [ 5 3 1 ]

#5 (hard) iterating through a vector and apply a certain logic through it
Input:
L3: a vector.
L2: number of elements to process together
L1: logic to apply (a program)
Output:
L1: another vector result from the logic applied. If the result is only one number, a vector that contains it.
Inspiration: STREAM, DOSUBS (without NSUB, ENDSUB), MAP, List parallel processing (50g AUR section F)

#6 Max in a vector
input:
L1: a vector.
output
L1: the maximum element in it.
Inspiration: LMAX (listExt). Also AUR 50g 2-16 as possible comparison

#7 Min in a vector
input:
L1: a vector.
output
L1: the minimum element in it.
Inspiration: LMIN (listExt). Also AUR 50g 2-16 as possible comparison

#8 Median of a vector
input:
L1: a vector.
output
L1: the median element in it
Inspiration: http://www.hpmuseum.org/forum/thread-114...#pid104441

#8.1 Percentile of a vector
input:
L1: a vector.
L2: a value between 0 and 100
output
L1: the n-th percentile in the vector.
Inspiration: http://www.hpmuseum.org/forum/thread-114...#pid104441

#9 SORT a vector
input:
L1: a vector.
output
L1: the vector sorted with the values ascending.
Inspiration: SORT, LSORT

#10 Create a vector given an integer sequence
input
L3: minValue
L2: maxValue
L1: step
output
L1: a vector
Inspiration: LSEQR (listExt)

#11 Create a vector given a function
input
L4: minValue
L3: maxValue
L2: step
L1: program that implements f(x) with x in [minValue, maxvalue]
output
L1: a vector in the form [ f(minValue), f(minvalue+step), f(minValue+2*step), ..., f(maxValue)]
Inspiration: SEQ

# Hopefully many more will be added.

grsbanks · (This post was last modified: 02-14-2019 12:11 PM by grsbanks.)

Given what a vector actually represents I'm not sure what practical applications these things could have, but on the 50g at least you could always use AXL to convert to a list, do whatever you need on the resulting list and then use AXL to convert back to a vector.

Paul Dale · 02-14-2019, 12:25 PM

These are without use. I can think of justification for each of these operations:

Sum of elements is one path to the arithmetic mean.
Differences of adjacent elements is the method of differences (Babbage differenct engine anyone).
Produce of elements leads to the geometric mean.
REVLIST might do fun ñ' ñ is a square matrix. ñ' REV(ñ) likewise.

Pauli

pier4r · (This post was last modified: 02-14-2019 01:24 PM by pier4r.)

(02-14-2019 12:10 PM)grsbanks Wrote: Given what a vector actually represents I'm not sure what practical applications these things could have, but on the 50g at least you could always use AXL to convert to a list, do whatever you need on the resulting list and then use AXL to convert back to a vector.

Well exactly that is to be avoided (that is, actually ending up working with lists).
With large vectors it may not pay off.

All this is born from the observation that manipulating large vectors may be faster (orders of magnitude faster) than manipulating large lists. If you do such operations over end over, the difference in time is marked.

For small number of elements (hundreds?), the list data structure is still the best - I think - due to the large amount of functions available.

And in general filling the gap for missing functions on vectors and matrices is a nice challenge.

edit: also the "practical applications" appears once one has the tools. If the same operation makes sense on a list, then it makes sense on the same sequence of number in general, being it represented as a list, a vector/matrix or a string (are there any other built-in data structures in the RPL/RPN world?).

edit2: I also remember to have said "I don't mind if a process is slow, if it is faster than my pace in processing the solutions". It is still the case. If I know I won't be able to process the result of a procedure in the next weeks, the procedure can as well last a week (I have many 50g exactly for this). Nonetheless in the long run striving to optimize this or that bit of code is not bad (in terms of speed, features, reliability, ease of maintenance, ease to read, accuracy, searchability, whateverbility).

Albert Chan · 02-14-2019, 02:16 PM

Bjarne Stroustrup: Why you should avoid linked lists

ttw · 02-14-2019, 07:00 PM

One problem (on the HP50g at least) with "arrays" is that these are limited to one and two dimensions. They cannot contain binary integers (why aren't binary integers treated as ordinary integers?).

In "theory" arrays are much faster than lists especially in more than one dimension. Arrays come with some structure. For example, each dimension has the same length. In two dimensions, each column vector (subarray) is independent of the others; same for rows; all columns are dependent on all rows and vice versa. This feature allows parallel operation on arrays or on scalar processors, one need not check for as many dependences. An array should not be treated as a list of lists.

One can march down any dimension of an array using constant increments; this need not be the case for lists.

Matrices are two-dimensional arrays with additional structure; addition, subtraction, multiplication and sometimes inverse (thus division) are defined.

Indices in arrays are dummy. In Fortran, upper and lower indices can be any numbers with lower <= upper.

John Keith · 02-14-2019, 07:02 PM

For vectors, CNRM returns the sum of the absolute values of the elements, and RNRM returns the element with the largest absolute value. Similar to \GSLIST and \<< MAX \>> STREAM respectively as long as all elements of the vector are non-negative.

John Keith · 02-14-2019, 07:19 PM

(02-14-2019 07:00 PM)ttw Wrote: One problem (on the HP50g at least) with "arrays" is that these are limited to one and two dimensions. They cannot contain binary integers (why aren't binary integers treated as ordinary integers?).

In "theory" arrays are much faster than lists especially in more than one dimension. Arrays come with some structure. For example, each dimension has the same length. In two dimensions, each column vector (subarray) is independent of the others; same for rows; all columns are dependent on all rows and vice versa. This feature allows parallel operation on arrays or on scalar processors, one need not check for as many dependences. An array should not be treated as a list of lists.

One can march down any dimension of an array using constant increments; this need not be the case for lists.

Matrices are two-dimensional arrays with additional structure; addition, subtraction, multiplication and sometimes inverse (thus division) are defined.

Indices in arrays are dummy. In Fortran, upper and lower indices can be any numbers with lower <= upper.

As explained in recent posts in the processing lists thread, arrays are only faster than lists if they contain real or complex numbers. For arrays of exact integers there is no speed advantage.

pier4r · (This post was last modified: 02-17-2019 08:57 PM by pier4r.)

Interesting, it seems that "parallel processing" between vectors is not possible (or I am doing it wrong).

For example this is possible between lists in RPL.
{ 0 1 2 } { 1 2 3 } <

instead
[ 0 1 2 ] [ 1 2 3 ] <
seems not directly possible. Maybe there is another way but at the moment I do not know it.

In the case it wouldn't be there, that would be a major obstacle to pass as the parallel processing (available for lists) is quite neat.

John Keith · 02-17-2019, 09:20 PM

No, "automatic list processing" as it is also called, is only for lists, not arrays.

However, the AXL command which converts lists to arrays and vice versa, is quite fast. This allows you to have the best of both worlds.

ttw · (This post was last modified: 02-17-2019 10:23 PM by ttw.)

(02-14-2019 12:10 PM)grsbanks Wrote: Given what a vector actually represents I'm not sure what practical applications these things could have, but on the 50g at least you could always use AXL to convert to a list, do whatever you need on the resulting list and then use AXL to convert back to a vector.

This is what I do. Many things in the HP50g are strange Vectors can be reals or integers but REPL doesn't work with reals. Vectors cannot handle binary integers (although lists can) so I have to do lots of real (actually integers) to binary converts and the like to do logic.

I haven't checked if two dimensional lists work better or worse than two dimensional arrays. Of course, once I go to 3 or 4 dimensions I have to use lists. Sub only works on one-dimensional lists.

On the other hand, 2-d lists do allow extraction of a "column" (or row) easily; one doesn't get to go both ways without being able to do TRAN or TRN (neither of which work on all types.) It would be nice to get a single row or column of a matrix; currently I just convert the matrix to rows (or columns) and make a list of these. This can take lots of stacksmanship.

DavidM · 02-18-2019, 12:43 PM

(02-17-2019 10:18 PM)ttw Wrote: It would be nice to get a single row or column of a matrix; currently I just convert the matrix to rows (or columns) and make a list of these. This can take lots of stacksmanship.

ROW- and COL- provide you with both the modified source array as well as the specified row or column that was removed. So something like OVER SWAP COL- NIP could be used to return a specified column of an array without having to do a lot of conversions.

ttw · 02-18-2019, 03:15 PM

Good point. I'll give the COL- and ROW- a try. It looks like I can use COL+ etc. to put things back. The only trick left to get is to get a block out of a matrix (of course, this is not so easy to do efficiently in any language; it's easy to program in high level languages but the access pattern is tough.) There are some irregular patterns too (T or + shaped in various orientations), not much one can do with these.

pier4r · 02-18-2019, 05:07 PM

if I am not mistaken SUB could be used to access sub matrices.

John Keith · (This post was last modified: 02-18-2019 05:23 PM by John Keith.)

The ROW and COL type commands tend to be slow, although some are worse than others. I find that lists are faster to work with overall than arrays, and more flexible as well. I really only use arrays when I need linear algebra functions like inverse , determinant, etc.

If you have ListExt, the following short program is faster than TRAN, even if you use AXL twice to convert between matrix and list:

Code:

\<< DUP LXIL SWAP SIZE LDIST

\>>

pier4r · (This post was last modified: 02-18-2019 05:32 PM by pier4r.)

John I am sure that counting development time the lists are faster thanks to their available functions. Considering only running time could be that lists are fast in your case because you use small sizes? (less than 250 elements)

Up to a certain length I have no doubt that arrays and lists for real numbers are practically the same in computational cost. And of course if one needs a complex data structure lists are the way to go. Arrays can hold only few numeric types.
I am a bit less sure when you work with a large amount of elements.

Also I think that some commands for matrices, vectors (and not only) are not that optimized. So they do the work reliably, but not in the fastest way possible. That leaves room for improvements as it was done for lists commands.

pier4r · 12-27-2022, 10:39 PM

Is there a way, built in or with additional libraries, to have the equivalent of POS (and also MPOS from ListExt) for arrays? Of course I can have it converting the array to a list, but I would like to avoid the conversion overhead, especially for long arrays.

toml_12953 · 12-27-2022, 11:58 PM

(02-14-2019 02:16 PM)Albert Chan Wrote: Bjarne Stroustrup: Why you should avoid linked lists

He deliberately ignored binary searches and direct searches and then complains that linear searches take up the lion's share of the time. Why? This reminds me of a particular math class. The prof put up an equation with no exponents. A student said it was an exponential equation. The prof said it was not to which the student replied, "Yes it is. If you add an exponent, it's exponential." We all stared at him! The professor said, "If my Grandmother had wheels, she'd be a bike." We all cracked up.

John Keith · 12-28-2022, 08:16 PM

I am wary of anyone saying "Avoid X" or "Always use Y". There are good reasons for having many different data structures (or programming languages, etc.). It is much more useful to say something along the lines of "think carefully, and choose the best tool for the job" Unfortunately, bombastic declarations get more clicks and views than calm, rational and thoughtful expositions.

John Keith · 12-28-2022, 08:32 PM

(12-27-2022 10:39 PM)pier4r Wrote: Is there a way, built in or with additional libraries, to have the equivalent of POS (and also MPOS from ListExt) for arrays?

Not unless David manages to add that functionality to ListExt. Big Grin

Seriously though, AXL n MPOS is quite fast and efficient. I tend to use lists rather than arrays unless I need linear algebra operations (matrix inversion, determinants, etc.).

There is one issue to watch out for when using AXL however. AXL always converts a list into a type 29 (symbolic) array. The command AXM will convert a type 29 array into an approximate (type 3 or 4) array but it is slow. If you need to convert a list of reals into a type 3 array, it may be better just to use OBJ-> and ->ARRY.