(42) Double precision summation and DOT
04-14-2020, 08:38 AM
Post: #1
 Werner Senior Member Posts: 635 Joined: Dec 2013
(42) Double precision summation and DOT
works in all flavors of the 42, mainly as test cases for Thomas' FMA-rewrite of all internal DOT functions. For the 42S, change all LSTO instructions to STO.

It works using subroutines that will:
- add two single-precision numbers and produce a double-precision number (S+)
- multiply two single-precision numbers and produce a double-precision number (Sx)
- add two double-precision numbers (D+)
These routines are bundled in the file double.raw
I've added my notes on these, for those who want more details.

The double precision summation and DOT functions are in SUMDOT.raw. They will return double-precision numbers (heads in Y, tails in X)

listings:

Code:
00 { 93-Byte Prgm } 01▸LBL "Σ" 02 EDIT 03 ABS 04 CLST 05 LASTX 06▸LBL 10 07 0 08 XEQ "D+" 09 J+ 10 RCLEL 11 FC? 77 12 GTO 10 13 EXITALL 14 CLX 15 + 16 RTN 17▸LBL "DOT2" 18 LSTO "REGS" 19 R↓ 20 EDIT 21 CLST 22 LSTO "S" 23 LSTO "I" 24▸LBL 11 25 COMPLEX 26 STO "S" 27 RCLEL 28 RCL IND "I" 29 XEQ "S×" 30 RCL "S" 31 COMPLEX 32 XEQ "D+" 33 J+ 34 ISG "I" 35 X<>Y 36 FC? 77 37 GTO 11 38 RCLEL 39 EXITALL 40 R↓ 41 END

Code:
00 { 154-Byte Prgm } 01▸LBL "D-" 02 +/- 03 X<>Y 04 +/- 05 X<>Y 06▸LBL "D+" 07 X<> ST T 08 XEQ "S+" 09 X<>Y 10 R↓ 11 + 12 + 13▸LBL "Q+" 14 X<>Y 15 RCL+ ST Y 16 STO- ST L 17 X<>Y 18 RCL+ ST L 19 RTN 20▸LBL "S+" 21 LSTO "." 22 RCL+ ST Y 23 RCL- ST Y 24 STO- "." 25 X<> ST L 26 STO- ST L 27 X<>Y 28 RCL+ ST L 29 RCL+ "." 30 RTN 31▸LBL "S×" 32 RCL× ST Y 33 LSTO "." 34 LASTX 35 XEQ 99 36 R↑ 37 XEQ 99 38 RCL× ST Z 39 R↑ 40 RCL× ST L 41 R↓ 42 R↓ 43 STO× ST Y 44 RCL× ST L 45 RCL- "." 46 + 47 + 48 + 49 RCL "." 50 X<>Y 51 RTN 52▸LBL 99 53 100000000000000001 54 ABS 55 R↓ 56 ENTER 57 RCL× ST L 58 STO- ST L 59 RCL+ ST L 60 X<>Y 61 RCL- ST Y 62 END

Cheers, Werner

Attached File(s)
04-14-2020, 02:35 PM
Post: #2
 Albert Chan Senior Member Posts: 1,558 Joined: Jul 2018
RE: (42) Double precision summation and DOT
Quote:0.9 SQRT X^2 0.9 - produces 10^-n, where n=12 for a real 42S and 34 for Free42 Decimal.

This is just lucky. For decimal precision P, many times it just return 0.
Code:
P    "0.9 SQRT X^2 0.9 -" 10   1E-10 11   0 12   1E-12 13   0 14   -1E-14 15   0 16   0 17   0 18   1E-18 19   0 20   0 21   1E-21 22   1E-22 23   0 24   0 25   1E-25 26   -1E-26 27   -1E-27 28   -1E-28 29   0 30   0 31   0 32   1E-32 33   1E-33 34   1E-34 35   0 36   0 37   0 38   1E-38 39   0 40   1E-40

Kahan suggest a better way, "Matlab’s Loss is Nobody’s Gain", page 5

ULP = abs((4/3 - 1)*3 – 1)

round(4/3 - 1) has error of 1/3 ULP (exactly), under both decimal and binary version

Example:

Free42Binary.exe: ﻿ ﻿ULP = 2^-52 ≈ 2.22E-16 ﻿ → precision = 53 bits
Free42Decimal.exe: ULP = 1E-33 ﻿ ﻿ ﻿ ﻿ ﻿ ﻿ ﻿ ﻿ ﻿ ﻿ ﻿ ﻿ ﻿→ precision = 34 decimals
04-14-2020, 03:07 PM
Post: #3
 Albert Chan Senior Member Posts: 1,558 Joined: Jul 2018
RE: (42) Double precision summation and DOT
Double precision trick assumed round-to-nearest (so sign bit can hold a bit), halfway-to-even.

However, for decimal version, this also meant sign can only hold 1 bit.
Thus, decimal version with odd digits precision cannot be split accurately this way ...

Also, no internal extra precision (unless it were rounded with half-way-to-odd), to avoid double-rounding issues.

Code:
#include <stdio.h> int main() // test.c {     double x = 1e16;     double y = x + 2.9999;    // hit by double-rounding     printf("%g %g %g", 2.9999+1e16-1e16, 2.9999+x-x, y-x);     return 0; }

Above code demonstrated problems of double-rounding.
Compile code with gcc v.7.1.0 (included with Strawberry Perl), Win7, no optimization

> gcc -otest test.c
> test
2 3 4

The reason for the discrepancy can be seen from the bits:

b = 1e16 + 2.9999 = 0b100011100001101111001001101111110000010000000000000010.1111111111111001 ...

round(b, 53-bits) = 1e16 + 2
round(b, 64-bits) = 1e16 + 3
round(round(b, 64-bits), 53-bits) = 1e16 + 4

Reference:
To SSE, or not to SSE, that is the question
GNU Linux and the Extended Precision on x86 Processors
04-14-2020, 04:20 PM
Post: #4
 Werner Senior Member Posts: 635 Joined: Dec 2013
RE: (42) Double precision summation and DOT
(04-14-2020 02:35 PM)Albert Chan Wrote:
Quote:0.9 SQRT X^2 0.9 - produces 10^-n, where n=12 for a real 42S and 34 for Free42 Decimal.

This is just lucky. For decimal precision P, many times it just return 0.
Kahan suggest a better way, "Matlab’s Loss is Nobody’s Gain", page 5

ULP = abs((4/3 - 1)*3 – 1)

Hello Albert.
No, it isn't lucky ;-) I specifically looked for a single invertable function that worked for n=12 and n=34. It doesn't have to work for anything else (well, to work for the binary version as well I had to take something else, still 'simple'.
Kahan's example is longer to code.. that is important to me ;-)
Cheers, Werner
04-15-2020, 12:54 AM (This post was last modified: 04-15-2020 06:13 PM by Albert Chan.)
Post: #5
 Albert Chan Senior Member Posts: 1,558 Joined: Jul 2018
RE: (42) Double precision summation and DOT
For decimal version, we can get precision P with less code

3 1/X 3 × 1 − ﻿ ﻿ ﻿ ﻿ ﻿ ﻿ ﻿ ﻿ ﻿ ﻿ ﻿ ﻿ ﻿ ﻿ ﻿ ﻿ ﻿ ﻿ ﻿ ﻿ ﻿ ﻿ ﻿ ﻿ ﻿ ﻿ ﻿ ﻿ ﻿ ﻿ ﻿ ﻿ ﻿ ﻿﻿// -10^(-P), or 0 if binary version (*)

this is longer, but will work for binary version too

3 1/X Enter Enter Enter 1 − + + ﻿ ﻿ ﻿ ﻿ ﻿ ﻿ ﻿ ﻿// -10^(-P), or (-2)^(-P) for binary version

Example, on python 2.6:

>>> x = 1/3.
>>> print (x-1+x+x).hex()
-0x1.0000000000000p-53

(*) with binary, x = 1/3 has error of ±1/3 ULP
2^P ULP = 1/2 → 3x = 1 ± 1/2^(P+1), which always rounds back to 1
04-15-2020, 06:50 AM
Post: #6
 Werner Senior Member Posts: 635 Joined: Dec 2013
RE: (42) Double precision summation and DOT
Hello Albert
first off, Kahan's 4/3 formula results in 10^-11 and 10^-33, not 10^-12 and 10^-34.
And your last 1/3 entry gives 2^-53 for the binary version, while I need 2^-54.
If you can beat 0.44 ENTER 1/X 1/X -, let me know ;-)
Cheers, Werner
04-15-2020, 01:26 PM (This post was last modified: 04-15-2020 05:38 PM by Albert Chan.)
Post: #7
 Albert Chan Senior Member Posts: 1,558 Joined: Jul 2018
RE: (42) Double precision summation and DOT
(04-15-2020 06:50 AM)Werner Wrote:  If you can beat 0.44 ENTER 1/X 1/X -, let me know ;-)

Nice find
For decimal version, with P decimal precision, above returned (-10)^(-P)

Prove:

Let x = 0.44 ﻿ (exactly) ﻿ ﻿ ﻿ ﻿ ﻿ ﻿ ﻿// this implied P ≥ 2

y = round(1/x, P-decimals) = c/x, where $$c = 1 + 1.2/(-10)^P$$

$$x-1/y = x(1-1/c) = x(c-1)/c = \large {0.528 \over 1.2 + (-10)^P}$$

$$\text{ulp_error} = (x-1/y)×10^P = \large {0.528 \over 1.2/10^P + (-1)^P}$$

For P = 2, 3, 4, 5, 6, ..., ulp_error ≈ +0.5217, -0.5286, +0.5279, -0.5280, +0.5280 ...

limit(|ulp_error|, P=∞) = |0.528/(0±1)| = 0.528 > 0.5

→ IROUND(ulp_error) = (-1)^P

→ round(x-1/y, P-decimals) = (-1)^P ÷ 10^P = (-10)^(-P)

P.S. for round-to-nearest, halfway-to-even, above work even with P=1
x - 1/(1/x) → 0.4 - 0.5 = -0.1 = (-10)^(-1)
 « Next Oldest | Next Newest »

User(s) browsing this thread: 2 Guest(s)