Matrix Division on the 28C/S (and later)
03-04-2020, 01:41 PM
Post: #21
 J-F Garnier Senior Member Posts: 565 Joined: Dec 2013
RE: Matrix Division on the 28C/S (and later)
(03-04-2020 09:23 AM)Werner Wrote:
Code:
@ dot test 1: input a, result = DOT([a 1 -1],[1 1 1]) >LBL "DT1"   3  1  NEWMAT  EDIT  Rv  \->  1  \->  -1  EXITALL  ENTER  SIGN  ABS  DOT  END

Slightly changing the test to DOT( [ a -1 1 ] , [ 1 1 1] ), I noticed this difference between the 71B and the 28S/42S (and later RPL machines I guess):
Using a=1E-20 (or any values <5E-16):
71B: result is 1E-15 !
28S/42S: result is correctly rounded to 0.

Something I will investigate for the HP71 Math ROM 2 :-)

J-F
03-04-2020, 04:39 PM (This post was last modified: 03-04-2020 07:00 PM by Albert Chan.)
Post: #22
 Albert Chan Senior Member Posts: 1,659 Joined: Jul 2018
RE: Matrix Division on the 28C/S (and later)
(03-04-2020 12:29 PM)Thomas Okken Wrote:  I'll write the new version of DOT next week, I think. I'll have to add FMA to the "phloat" functions, using bid128_fma() in the decimal version, and (I think) std::fma() in the binary version, and then implement the functions from your citations.
It looks pretty straightforward.

DotFMA probably will not gain much accuracy.
Example, if we fuse for every step, with tiny a, such that a+1=1

DotFMA([a 1 -1], [1 1 1]) = (a+1) - 1 = 1-1 = 0

However, fusing result at the end help (essentially "double" precision until final sum)
see CompDot, CompDotFMA, http://rnc7.loria.fr/louvet_poster.pdf

(03-04-2020 01:41 PM)J-F Garnier Wrote:  Slightly changing the test to DOT( [ a -1 1 ] , [ 1 1 1] ), I noticed this difference between the 71B and the 28S/42S
(and later RPL machines I guess):
Using a=1E-20 (or any values <5E-16):
71B: result is 1E-15 !
28S/42S: result is correctly rounded to 0.

It seems 71B internal 15-digits precision is using Round-to-zero (chopping excess digits)
Say, a = 1.00000000001e-15

DOT([a 1 -1], [1 1 1]) = (1.0000 00000 00000 10000 00000 01) - 1 ≈ (1.0000 00000 00000) - 1 = 0

DOT([a -1 1], [1 1 1]) = (-0.99999 99999 99998 99999 999999) + 1 ≈ (-0.99999 99999 99998) + 1 = 2E-15
03-05-2020, 06:41 AM
Post: #23
 Werner Senior Member Posts: 648 Joined: Dec 2013
RE: Matrix Division on the 28C/S (and later)
(03-04-2020 04:39 PM)Albert Chan Wrote:  DotFMA probably will not gain much accuracy.
Example, if we fuse for every step, with tiny a, such that a+1=1

DotFMA([a 1 -1], [1 1 1]) = (a+1) - 1 = 1-1 = 0

However, fusing result at the end help (essentially "double" precision until final sum)
see CompDot, CompDotFMA, http://rnc7.loria.fr/louvet_poster.pdf

Well, yes, that's the idea: don't use the FMA for the dot product itself, but for the correction term, a bit like Kahan summation, but for dot products.

Cheers, Werner
03-06-2020, 05:48 AM (This post was last modified: 03-06-2020 06:12 AM by Thomas Okken.)
Post: #24
 Thomas Okken Senior Member Posts: 1,374 Joined: Feb 2014
RE: Matrix Division on the 28C/S (and later)
First stab at compensated DOT, real-real case only:

Code:
-        for (i = 0; i < size; i++) -            dot += rm1->array->data[i] * rm2->array->data[i]; +        phloat *x = rm1->array->data; +        phloat *y = rm2->array->data; +        phloat s = x[0] * y[0]; +        phloat c = fma(x[0], y[0], -s); +        for (i = 1; i < size; i++) { +            phloat p = x[i] * y[i]; +            phloat pp = fma(x[i], y[i], -p); +            phloat xx = s + p; +            phloat zz = xx - s; +            phloat ss = (s - (xx - zz)) + (p - zz); +            s = xx; +            c = c + (pp + ss); +        } +        phloat dot = s + c;

I guess Werner's test case is a bit too easy, since it's returning a for arbitrarily low values, not just down to 1e-67. But maybe my implementation is wrong!

I uploaded a test version for Windows here. Decimal only, since VC++ 2008 doesn't appear to support FMA in <math.h>; I guess I'll have to switch to a newer version if I'm going through with this. The code does build without complaint for MacOS and Linux; I haven't tried Android or iOS yet.

I optimistically pushed this initial code to my master branch.
03-06-2020, 12:05 PM
Post: #25
 Werner Senior Member Posts: 648 Joined: Dec 2013
RE: Matrix Division on the 28C/S (and later)
Hello Thomas, no that is correct, the FMA 'compensated dot' algorithm is actually not the same as just using 68 digits ;-) As long as we're just performing a DOT product with one argument being a vector with all 1's, we're performing additions only, and we can actually do the same in RPN:
(based on Dekker's double precision algorithms, but modified a bit to work for decimal floating point arithmetic):

Code:
extended precision summation In: [x] Out: Sum(x[i]) >LBL "\S"  EDIT  ABS  CLST  LASTX >LBL 10  XEQ DS+  J+  RCLEL  FC? 77  GTO 10  EXITALL  CLX  +  + @ leave this off to see the full double-precision result  END @ add a number to a double-precision number @        L    X    Y    Z    T >LBL "DS+"        x    ts    hs  X<>Y  X<> ST Z        hs    x    ts  XEQ "S+"        tr    hr    ts  X<>Y  X<> ST Z        ts    tr    hr  + >LBL "Q+" @ 'quick add 2 numbers'  X<>Y  RCL+ ST Y  STO- ST L  X<>Y  RCL+ ST L  RTN >LBL "S+" @ add 2 numbers, result in double precision  LSTO "."  RCL+ ST Y  RCL- ST Y  STO- "."  X<> ST L  STO- ST L  X<>Y  RCL+ ST L  RCL+ "."  END

I can probably do the same for the full-fledged DOT.. I have the 'multiply two numbers and obtain a double-precision result' routines as well. The only thing I don't have for now is time ;-)

Cheers, Werner
03-06-2020, 05:20 PM
Post: #26
 Albert Chan Senior Member Posts: 1,659 Joined: Jul 2018
RE: Matrix Division on the 28C/S (and later)
(03-06-2020 05:48 AM)Thomas Okken Wrote:  I uploaded a test version for Windows here. Decimal only, since VC++ 2008 doesn't appear to support FMA in <math.h>;
I guess I'll have to switch to a newer version if I'm going through with this.

For machines that does not have hardware FMA, we can implement it in software.
see Adaptive Precision Floating-Point Arithmetic and Fast Robust Geometric Predicates, page 18 to 20

lua> function split(x) -- assume IEEE double
: ﻿ ﻿ ﻿ ﻿ ﻿ ﻿local hi = (2^27+1) * x
: ﻿ ﻿ ﻿ ﻿ ﻿ ﻿hi = hi - (hi - x)
: ﻿ ﻿ ﻿ ﻿ ﻿ ﻿return hi, x - hi
: ﻿ ﻿ ﻿ ﻿ ﻿ ﻿end

lua> function product(a, b)
: ﻿ ﻿ ﻿ ﻿ ﻿ ﻿local x = a*b
: ﻿ ﻿ ﻿ ﻿ ﻿ ﻿local a1, a2 = split(a)
: ﻿ ﻿ ﻿ ﻿ ﻿ ﻿local b1, b2 = split(b)
: ﻿ ﻿ ﻿ ﻿ ﻿ ﻿return x, -x + a1*b1 + a2*b1 + a1*b2 + a2*b2
: ﻿ ﻿ ﻿ ﻿ ﻿ ﻿end

lua> product(1.1, 1.2)
1.32 ﻿ ﻿ ﻿ ﻿ -4.44089209850063e-018
03-07-2020, 04:08 PM
Post: #27
 Thomas Okken Senior Member Posts: 1,374 Joined: Feb 2014
RE: Matrix Division on the 28C/S (and later)
I implemented the real-complex and complex-complex cases, and moved all three cases to separate functions, so they can be easily reused later in other functions. Those other functions have yet to be done, but the improved DOT is now complete.

I uploaded test versions for Android, Windows, Mac, and Linux to https://thomasokken.com/free42/download/test/
and the source code is in the master branch in GitHub.
For the Windows build, I switched from Visual C++ 2008 to 2019, so I was able to build the Binary version again.
03-08-2020, 06:11 PM (This post was last modified: 03-08-2020 06:18 PM by J-F Garnier.)
Post: #28
 J-F Garnier Senior Member Posts: 565 Joined: Dec 2013
RE: Matrix Division on the 28C/S (and later)
(03-04-2020 04:39 PM)Albert Chan Wrote:
(03-04-2020 01:41 PM)J-F Garnier Wrote:  Slightly changing the test to DOT( [ a -1 1 ] , [ 1 1 1] ), I noticed this difference between the 71B and the 28S/42S
(and later RPL machines I guess):
Using a=1E-20 (or any values <5E-16):
71B: result is 1E-15 !
28S/42S: result is correctly rounded to 0.

It seems 71B internal 15-digits precision is using Round-to-zero (chopping excess digits)
Say, a = 1.00000000001e-15

DOT([a 1 -1], [1 1 1]) = (1.0000 00000 00000 10000 00000 01) - 1 ≈ (1.0000 00000 00000) - 1 = 0

DOT([a -1 1], [1 1 1]) = (-0.99999 99999 99998 99999 999999) + 1 ≈ (-0.99999 99999 99998) + 1 = 2E-15

There is another similar truncating effect, instead of proper rounding with complex numbers:
>(1+1E-10,1)*(1-1E-10,1)
(-1.E-15,2)

Contrary to the previous case, it is still present in the 42S, 28S, 48 series up to the 49G+.
So I will not attempt to fix this effect (I don't call it bug...).

J-F
 « Next Oldest | Next Newest »

User(s) browsing this thread: 1 Guest(s)