HP Forums - newRPL: Simplified matrices proposal

userRPL historically had 4 different objects for vectors and matrices:
Vectors (1D)
Matrices (numeric only)
Symbolic Vectors (implemented as lists)
Symbolic Matrices (implemented as lists of lists)

Here we propose to simplify all this into a single object type: Matrix.

A matrix may contain real or complex numbers, or symbolic expressions mixed in. There is no difference between a symbolic matrix and a numeric one. Operations in newRPL follow generic overloadable operators, so addition and multiplication of individual elements can be done through the same generic operators and will automatically handle cross-type operations.
Vectors are not distinguishable from matrices in newRPL. They share the same object type and same properties and operations. When working with matrices, the following intuitive rules apply:

Operations on matrices follow the standard mathematical rules for addition/multiplication of matrices.
All 1xN and Nx1 matrices are also considered one-dimensional vectors.
One-dimensional vectors can be considered as row or column vectors (1xN and Nx1 matrices respectively) when operating with a matrix, whichever orientation makes the operation possible.
Vector-vector additions are performed as matrix additions, but always producing a 1xN (row vector) when orientations don't match.
Vector-vector multiplications are performed as matrix multiplications, and when orientations don't allow a valid operation, one vector is transposed to produce a (1xN)*(Nx1) = (1x1) multiplication.

This proposal wouldn't break existing code, except that all conversions between matrix/vector and algebraic/numeric matrix become unnecessary.
In userRPL, multiplying a matrix by a vector works as described above, so it's nothing revolutionary but takes this concept a step further.
In newRPL a row vector (1xN) can be transposed to get a column vector (Nx1) (unlike userRPL, where it has to be converted to a matrix first, then transposed).

Multiplying two vectors of the same size becomes a dot product of vectors in newRPL (unless you specifically transpose one to make a (Nx1)x(1xN) multiplication, producing the expected (NxN) result).

The storage model for unified matrices is already implemented, but the matrix/vector fusion is not (yet), and I think a little feedback could help refine the idea.

Thoughts?

Vectors and matrices in RPL are the same object type DOARRY. One of the first parameters in its definition is the number of dimensions - you can create tensors if you like (in ML).
Accessing numeric arrays is much faster than accessing symbolic ones, because the elements are all the same length. For that reason alone, I'd vote against the Grand Unification.
There IS another array type that is not exposed to RPL: the linked array. The type of all elements is still the same, but their lengths may vary (eg decimal integers), and the arry starts with the NxM offsets of all elements, followed by all element 'bodies';

My 2 cents ;-)

Cheers, Werner

This approach is likely to run into difficulties when you start implementing the matrix functions. What algorithms will you use to invert a matrix? There are lots of them for numeric matrices with a variety of different characteristics (speed, stability, easy reuse). For symbolic, far less so.

Then consider things like Eigen solutions and singular value decompositions? Nice numeric solutions, pretty horrible symbolically.

You are going to have to distinguish numeric matrices somewhere along the line, if not the various special form numerics (row, tri-diagonal, triangular, blocked).

Still, delaying these decisions so the user is oblivious to them might be beneficial.

- Pauli

(06-01-2015 08:48 PM)Werner Wrote: [ -> ]Vectors and matrices in RPL are the same object type DOARRY. One of the first parameters in its definition is the number of dimensions - you can create tensors if you like (in ML).

OK, yes, sort of, same prolog maybe, but it's not the same object by the fact it's a one-dimensional matrix. Commands that work on vectors are different than for matrices. And from the user perspective, matrices use double [[ ]] and vectors use single, so they even appear different too.

(06-01-2015 08:48 PM)Werner Wrote: [ -> ]Accessing numeric arrays is much faster than accessing symbolic ones, because the elements are all the same length. For that reason alone, I'd vote against the Grand Unification.

You hit it right in the nail. In newRPL, real numbers are arbitrary precision and therefore different length, so that advantage is gone even on numeric matrices.

(06-01-2015 08:48 PM)Werner Wrote: [ -> ]There IS another array type that is not exposed to RPL: the linked array. The type of all elements is still the same, but their lengths may vary (eg decimal integers), and the arry starts with the NxM offsets of all elements, followed by all element 'bodies';

Yes, newRPL matrices are already (by necessity) implemented as linked arrays, because of the variable size of the objects. The offsets is the only way to get some decent speed with such a format. I even went a step further, allowing the bodies to be shared by more than one element. For instance, an identity matrix only needs to store one zero and one 1, then the links point to the same objects to save space.

(06-01-2015 08:48 PM)Werner Wrote: [ -> ]My 2 cents ;-)

Your comments are always worth a lot more than that...

(06-01-2015 10:24 PM)Paul Dale Wrote: [ -> ]This approach is likely to run into difficulties when you start implementing the matrix functions. What algorithms will you use to invert a matrix? There are lots of them for numeric matrices with a variety of different characteristics (speed, stability, easy reuse). For symbolic, far less so.

Then consider things like Eigen solutions and singular value decompositions? Nice numeric solutions, pretty horrible symbolically.

Yes, yes, and yes. It could get nasty, I know.

(06-01-2015 10:24 PM)Paul Dale Wrote: [ -> ]You are going to have to distinguish numeric matrices somewhere along the line, if not the various special form numerics (row, tri-diagonal, triangular, blocked).

Yes, I thought that checking if a matrix is numeric is relatively quick and only needs to be done once before starting the solver. For diagonal, trangular, etc. having a different prolog won't help me either, so there's no advantage, I'll have to scan element by element.
It should be possible to do the more generic methods, and simply check if the matrix is numeric, and use more optimized methods then, without too much overhead.

Even if I used different types, then I would be trading the speed of checking if it's numeric for the mess of the "cross" routines, that need support for 2 types of objects and all their cross-operations between them (vector * matrix, vector * vector, matrix * vector, etc. would all be different routines dealing with the difference in internal format between objects).

(06-01-2015 10:24 PM)Paul Dale Wrote: [ -> ]Still, delaying these decisions so the user is oblivious to them might be beneficial.

That's what I'm thinking. It's easier for the user, and also easier for me to code the algorithms knowing all matrices and all vectors have the exact same in-memory format.
There's 2 main things in my proposal:
a) Having vectors and matrices mixed, which I think is very beneficial for both user and developer. For example, the link array has the offsets stored by rows, so a 1xN vector has offsets 1..N stored contiguous and ordered in memory. A column vector (Nx1 matrix) has also 1..N offsets stored in the same order, contiguous in memory, so when I say "one vector will be transposed..." it's only to give it some math feeling, the routine just multiplies elements 1 through N without caring, which is both fast and easy to implement.

b) Having multiple object types within the same matrix. The main reason NOT to do this (as Werner pointed out) was to have all objects of uniform size. Once that falls apart, it no longer matters if they have different types. Also, in newRPL the basic operations between different object types are already dealt with overloaded operators, so algorithms will be relatively clean and oblivious of object type (at least that's what I hope).

Now besides implementation issues (which I'm sure I'll run into several), are there any objections from the RPL language point of view? or math point of view?

(06-02-2015 12:19 PM)Claudio L. Wrote: [ -> ]Now besides implementation issues (which I'm sure I'll run into several), are there any objections from the RPL language point of view? or math point of view?

Aside from the virtues and possible issues with this approach, which I honestly can't address, I've a short question. One of NewRPL's original goals, I think/thought, was to be compatible with a large percentage of existing RPL code. It seems that a change such as this would break an awful lot of existing code wouldn't it?

I'm not really for or against this proposal, just wondering if my understanding was correct, and if so, how does this change impact that goal?

(06-03-2015 01:27 AM)rprosperi Wrote: [ -> ]Aside from the virtues and possible issues with this approach, which I honestly can't address, I've a short question. One of NewRPL's original goals, I think/thought, was to be compatible with a large percentage of existing RPL code. It seems that a change such as this would break an awful lot of existing code wouldn't it?

I'm not really for or against this proposal, just wondering if my understanding was correct, and if so, how does this change impact that goal?

Good point, let's analyze it.
Programs using vectors [ 1 2 3 ] will remain compatible, since the new compiler accepts the single brackets as equivalent to [[ 1 2 3 ]] (as a matter of fact, it removes the double braces and leaves only single when there's a single row).
Commands that convert vectors into matrices can be implemented as a No-Op to keep compatibility with old sources.
Programs using numeric matrices will remain compatible as well.
The new auto-transpose feature only kicks in for operations that would otherwise issue an "Invalid Dimension" error. So if your existing code didn't have errors, it will behave identically.

The only compatibility problem I see is with programs using lists of lists for symbolic matrices, which now should use square brackets instead. This represents relatively minimum portability effort, and if it bothers a lot of people (I doubt it), it could be handled transparently, for instance using the AXL command when a list is detected as argument. The command AXL could be implemented to always output a matrix, regardless of the input (in other words, leave matrices as is, and convert lists to matrices).

Another consequence (very minor in my opinion), is that when polar coordinates are set, 2 and 3 element vectors are displayed and edited as polar. In classic RPL, you could bypass this using double brackets. In newRPL, I think polar coordinates should be like numbers with a base: retain the coordinate system that the user used to create the vector. Operations should retain the coordinate system of the first argument. So the coordinate system becomes a property of the vector, rather than a system-wide setting.
As far as symbolic polar vectors... to be determined, I didn't think of that yet.

Overall, I think the compatibility impact on existing code is very minimum. Worst case, the user will have to manually change all symbolic matrices to square brackets, but even that could be solved transparently (with a speed penalty, of course).

(06-03-2015 01:21 PM)Claudio L. Wrote: [ -> ]Overall, I think the compatibility impact on existing code is very minimum. Worst case, the user will have to manually change all symbolic matrices to square brackets, but even that could be solved transparently (with a speed penalty, of course).

Thanks for the feedback. From the upper discussion, it appeared to me that the impacts would be more significant. Hopefully, the conflict is limited to the cases you described. It would be best if NewRPL could suggest the appropriate change when encountering the original syntax, as most folks will no doubt simply try to feed existing code to NewRPL without manually pre-screening and modifying the code first.

(06-05-2015 11:41 PM)rprosperi Wrote: [ -> ]Thanks for the feedback. From the upper discussion, it appeared to me that the impacts would be more significant.

Internally, it is significant. But from the user perspective the focus still remains on high compatibility.

(06-05-2015 11:41 PM)rprosperi Wrote: [ -> ]Hopefully, the conflict is limited to the cases you described. It would be best if NewRPL could suggest the appropriate change when encountering the original syntax, as most folks will no doubt simply try to feed existing code to NewRPL without manually pre-screening and modifying the code first.

I think you are bringing a good idea to the table. Perhaps instead of trying to have compatibility by brute-force, by creating dummy commands like AXL, etc. or affecting performance by converting lists to matrices, the simulator should include an option to load source code from classic RPL and do an automatic translation to newRPL (or, as you suggest, simply analyze the code and provide suggestions to the user in a code editor with syntax highlighting, etc). The changes would be relatively small: removing the dots after the numbers, for example, and changing symbolic lists of lists into matrices. But that converter should be either a separate application or an option in the simulator, rather than having the newRPL core trying to keep compatibility during runtime on-calc.

I see you already covered vectors in cylindrical or spherical modes (which I was going to voice concerns about). Your solution looks pretty good.
But I found another potential source of trouble: the SIZE command behaves differently on matrices and vectors: on matrices it obviously returns both dimensions as a list with two elements, on a vector or any other array it returns a list with just a single element. This may cause compatibility problems when vectors cannot be distinguished from single row or single-column matrices anymore.
I'd vote for a distinction between 1D and 2D arrays. Keeping the old "hack" for vectors should be simple (vectors and other 1D arrays are stored as 2D arrays with 0 as second dimension; this is how they can share the DOARRY prolog). Other than the SIZE issue, that's more of a personal preference, though.

(06-08-2015 04:39 PM)3298 Wrote: [ -> ]the SIZE command behaves differently on matrices and vectors: on matrices it obviously returns both dimensions as a list with two elements, on a vector or any other array it returns a list with just a single element.

You are right, SIZE would change. I can't say it would break, because it was probably already broken...

[ 1 2 3 ] SIZE --> { 3 }
[[ 1 2 3 ]] SIZE --> { 1 3 }
{ 1 2 3 } SIZE --> 3
{{ 1 2 3 }} SIZE --> 1

So it works for numeric vectors, but for symbolic vectors it returns a number, not a list. For symbolic matrices doesn't even give a hint.
So now the question becomes: Is this a behavior worth keeping?

EDIT: I should clarify the first 2 examples also apply to the MATRIX symbolic matrices, so it's consistent as long as you don't use lists (49g only, not backwards compatible).
So it looks like vectors should be a separate type to avoid compat. issues. I can't see a way around this one.