(42) Double precision summation and DOT

04142020, 08:38 AM
Post: #1




(42) Double precision summation and DOT
works in all flavors of the 42, mainly as test cases for Thomas' FMArewrite of all internal DOT functions. For the 42S, change all LSTO instructions to STO.
It works using subroutines that will:  add two singleprecision numbers and produce a doubleprecision number (S+)  multiply two singleprecision numbers and produce a doubleprecision number (Sx)  add two doubleprecision numbers (D+) These routines are bundled in the file double.raw I've added my notes on these, for those who want more details. The double precision summation and DOT functions are in SUMDOT.raw. They will return doubleprecision numbers (heads in Y, tails in X) listings: Code: 00 { 93Byte Prgm } Code: 00 { 154Byte Prgm } Cheers, Werner 

04142020, 02:35 PM
Post: #2




RE: (42) Double precision summation and DOT
Quote:0.9 SQRT X^2 0.9  produces 10^n, where n=12 for a real 42S and 34 for Free42 Decimal. This is just lucky. For decimal precision P, many times it just return 0. Code: P "0.9 SQRT X^2 0.9 " Kahan suggest a better way, "Matlab’s Loss is Nobody’s Gain", page 5 ULP = abs((4/3  1)*3 – 1) round(4/3  1) has error of 1/3 ULP (exactly), under both decimal and binary version Example: Free42Binary.exe: ULP = 2^52 ≈ 2.22E16 → precision = 53 bits Free42Decimal.exe: ULP = 1E33 → precision = 34 decimals 

04142020, 03:07 PM
Post: #3




RE: (42) Double precision summation and DOT
Double precision trick assumed roundtonearest (so sign bit can hold a bit), halfwaytoeven.
However, for decimal version, this also meant sign can only hold 1 bit. Thus, decimal version with odd digits precision cannot be split accurately this way ... Also, no internal extra precision (unless it were rounded with halfwaytoodd), to avoid doublerounding issues. Code: #include <stdio.h> Above code demonstrated problems of doublerounding. Compile code with gcc v.7.1.0 (included with Strawberry Perl), Win7, no optimization > gcc otest test.c > test 2 3 4 The reason for the discrepancy can be seen from the bits: b = 1e16 + 2.9999 = 0b100011100001101111001001101111110000010000000000000010.1111111111111001 ... round(b, 53bits) = 1e16 + 2 round(b, 64bits) = 1e16 + 3 round(round(b, 64bits), 53bits) = 1e16 + 4 Reference: To SSE, or not to SSE, that is the question GNU Linux and the Extended Precision on x86 Processors 

04142020, 04:20 PM
Post: #4




RE: (42) Double precision summation and DOT
(04142020 02:35 PM)Albert Chan Wrote:Quote:0.9 SQRT X^2 0.9  produces 10^n, where n=12 for a real 42S and 34 for Free42 Decimal. Hello Albert. No, it isn't lucky ;) I specifically looked for a single invertable function that worked for n=12 and n=34. It doesn't have to work for anything else (well, to work for the binary version as well I had to take something else, still 'simple'. Kahan's example is longer to code.. that is important to me ;) Thanks for your insights.. always a joy to read. Cheers, Werner 

04152020, 12:54 AM
(This post was last modified: 04152020 06:13 PM by Albert Chan.)
Post: #5




RE: (42) Double precision summation and DOT
For decimal version, we can get precision P with less code
3 1/X 3 × 1 − // 10^(P), or 0 if binary version (*) this is longer, but will work for binary version too 3 1/X Enter Enter Enter 1 − + + // 10^(P), or (2)^(P) for binary version Example, on python 2.6: >>> x = 1/3. >>> print (x1+x+x).hex() 0x1.0000000000000p53 (*) with binary, x = 1/3 has error of ±1/3 ULP 2^P ULP = 1/2 → 3x = 1 ± 1/2^(P+1), which always rounds back to 1 

04152020, 06:50 AM
Post: #6




RE: (42) Double precision summation and DOT
Hello Albert
first off, Kahan's 4/3 formula results in 10^11 and 10^33, not 10^12 and 10^34. And your last 1/3 entry gives 2^53 for the binary version, while I need 2^54. If you can beat 0.44 ENTER 1/X 1/X , let me know ;) Cheers, Werner 

04152020, 01:26 PM
(This post was last modified: 04152020 05:38 PM by Albert Chan.)
Post: #7




RE: (42) Double precision summation and DOT
(04152020 06:50 AM)Werner Wrote: If you can beat 0.44 ENTER 1/X 1/X , let me know ;) Nice find For decimal version, with P decimal precision, above returned (10)^(P) Prove: Let x = 0.44 (exactly) // this implied P ≥ 2 y = round(1/x, Pdecimals) = c/x, where \(c = 1 + 1.2/(10)^P\) \(x1/y = x(11/c) = x(c1)/c = \large {0.528 \over 1.2 + (10)^P}\) \(\text{ulp_error} = (x1/y)×10^P = \large {0.528 \over 1.2/10^P + (1)^P}\) For P = 2, 3, 4, 5, 6, ..., ulp_error ≈ +0.5217, 0.5286, +0.5279, 0.5280, +0.5280 ... limit(ulp_error, P=∞) = 0.528/(0±1) = 0.528 > 0.5 → IROUND(ulp_error) = (1)^P → round(x1/y, Pdecimals) = (1)^P ÷ 10^P = (10)^(P) P.S. for roundtonearest, halfwaytoeven, above work even with P=1 x  1/(1/x) → 0.4  0.5 = 0.1 = (10)^(1) 

« Next Oldest  Next Newest »

User(s) browsing this thread: 2 Guest(s)