Threaded Mode | Linear Mode

pito · (This post was last modified: 06-04-2014 03:18 PM by pito.)

@Pauli: Assuming my 27x x^2 and 27x POWPOW based tests give the same results as the wp34s, can I derive from that my trig implementation is a little bit more precise than yours?

Claudio L. · 06-04-2014, 04:31 PM

(06-01-2014 11:57 AM)Tugdual Wrote:
(06-01-2014 07:32 AM)Thomas Klemm Wrote: The WP-34S however uses:

Code:

// Now Taylor series // tan(x) = x(1-x^2/3+x^4/5-x^6/7...)

Interesting, I wonder what justified the choice.
A matter of patent?
What is the most efficient approach in terms of speed and accuracy between CORDIC and Taylor?
Or may be CORDIC was the right choice for old calculators with little ROM?

I recently implemented from scratch all the transcendental functions using CORDIC, and I tested a few simple cases using Taylor as well. I also used Taylor to compute the constants needed for the CORDIC method.

CORDIC is almost constant speed, regardless of the argument. Taylor needs more terms for larger arguments, and as you approach zero you need less terms. In the end, if you don't adaptively stop Taylor when you reach the desired accuracy, they should be about the same. For very small numbers, stopping Taylor properly will be much faster than CORDIC, while for large numbers CORDIC will likely win.
The other main difference comes with the hardware: if you have hardware multiply in the CPU, then Taylor will be fast, otherwise it will be much slower than the shifts in CORDIC.
If you are implementing Decimal CORDIC like I did, Taylor would be even faster since Decimal CORDIC needs a lot more shifts than the binary one (4 times more).

Regarding ROM memory usage, I think Taylor uses much less memory than CORDIC. CORDIC requires storage of a significant number of constants to operate properly in the entire range of numbers.
I thought I was going to be able to use only about 300kbytes of constants, but they ended up being about 1 MByte (half our total ROM space!). The additional constants were needed to obtain full precision in bad corner cases (like COSH(1e-500)).

For normal Trig functions, many of the 'quirks' of both methods can be resolved with argument reduction. Paul's example of COS(Pi/2-ULP) can easily be calculated as SIN(ULP) and only a few terms of Taylor will give you a good answer. For hyperbolics you are mostly out of luck.

I think CORDIC is still a very good method to implement in modest hardware or very small CPU's. Other methods start to shine when the CPU is more powerful. If the CPU can multiply in 1 cycle, and a shift takes also 1 cycle, then you'll get a lot more work done by using that multiplication than splitting it into many shifts and adds.

Still, it's quite competitive. My CORDIC implementation was about 3 to 5 times faster than the method implemented in mpdecimal for ln() and about 6 to 10 times for exp(). Granted, CORDIC was limited to 2016 digits precision, while the algorithms in mpdecimal are for unlimited arbitrary precision, so they can't use any precomputed constants to speed up calculations. If you were to specialize your algorithm for the same amount of digits you might be able to optimize it and match the speed.

The main reason to choose CORDIC is versatility: SIN,COS, TAN, ASIN, ACOS, ATAN, EXP, LN, SQRT, SINH, COSH, TANH, ATANH, ACOSH, ASINH, all use the same method. I had to code 5 variants of the method to keep it optimized: 2 for "normal" trig, 2 for hyperbolics, and one for exp() since it can use half of the operations, so it's twice as fast as the normal algorithm.

By the way, I'd like to share the best paper I found so far on Decimal CORDIC, which helped me a lot in my implementation:

http://www.ac.usc.es/arith19/sites/defau...paper1.pdf

Claudio

Claudio L. · 06-04-2014, 05:57 PM

(06-03-2014 02:02 PM)pito Wrote: I've flashed my CM3 with the old stuff I've found - when running below code with 34digits I get (I've added your result for comparison):

Code:

TRIG9 TRIG9= 9.000 000 000 000 000 000 000 000 000 000 227 yours: 8.999 999 999 999 999 999 999 999 999 937 535 Elapsed x1 =201 ms

Code:

#define _PI "3.1415926535897932384626433832795028841971693993751" // TRIG9 PRECISION TEST cout<<"TRIG9"<<endl; decfp tmp, PI; PI = _PI; tmp = "9.0"; // 9 DEGREE // deg to rad timer = millis; tmp = tmp * PI / "180.0"; tmp = arcsin ( arccos ( arctan ( tan ( cos ( sin ( tmp ) ) ) ) ) ) ; // rad to deg tmp = tmp * "180.0" / PI; timer = millis - timer; cout<<"TRIG9= "<<tmp<<endl; cout<<"Elapsed x1 ="<<timer<<" ms"<<endl;

But I've learned you do rounding after each calculation so maybe that is what makes the diff..

Does anyone know of a table of correct values for this?
newRPL gave me (for 34 digits):

8.999999999999999999999999999999979

I ran a loop from 9 to 900 digits, and the difference is always only the last 2 digits, being the worst difference 85 ULP.

But how do we know right from wrong?

Claudio

pito · (This post was last modified: 06-04-2014 06:06 PM by pito.)

Quote:I recently implemented from scratch all the transcendental functions using CORDIC

On which hw platform your implementation runs (cpu)?

Quote:But how do we know right from wrong?

The 9 degree test:
The windows calculator (most probably decNumber lib 34+) gives you 9.0
Wolfram's alfa online gives you 9.0

Claudio L. · 06-04-2014, 06:21 PM

(06-04-2014 06:06 PM)pito Wrote:
Quote:I recently implemented from scratch all the transcendental functions using CORDIC
On which hw platform your implementation runs (cpu)?

Right now it's running on x86 (actually amd64).

(06-04-2014 06:06 PM)pito Wrote:
Quote:But how do we know right from wrong?
The 9 degree test:
The windows calculator (most probably decNumber lib 34+) gives you 9.0
Wolfram's alfa online gives you 9.0

That's my point, it's hard to find the right result. Obviously the 9.0 is incorrect if we accept that rounding is a fact.
What I did to check:
I got all my partial results at 34 digits precision on the stack, for each operation.
Then I set the precision to 9 digits more.
Then I picked each partial result and applied the operation (sin, cos, etc), and compared the result with higher precision to the rounded one to verify correct rounding (so I took a previous result, rounded to 34 digits, and used it as argument to a function with 43 digits). newRPL did the correct rounding every time. I found also that doing * PI / 180 is not the same as / 180 * PI (difference of 1 in the last digit).
So I guess the "correct" answer will also depend on the order of the operations.
I also tried preccalc (from sourceforge), which uses adaptive precision and also gives the useless 9.0.

Claudio

pito · (This post was last modified: 06-04-2014 06:41 PM by pito.)

WP34s calc and various other arbitrary precision decimal fp libs/wrappers are using the IBM's decNumber library (which sets the standards). There are some settings for the rounding inside (context something), I never messed with them in detail, though. Maybe the difference between the results is related to the proper rounding setting's "know-how" with the library..

mpdecimal for example:

Quote:libmpdec - C/C++ library
libmpdec is a complete implementation of the General Decimal Arithmetic Specification. The specification, written by Mike Cowlishaw from IBM, defines a general purpose arbitrary precision data type together with rigorously specified functions and rounding behavior. As described in the scope section of the specification, libmpdec will - with minor restrictions - also conform to the IEEE 754-2008 Standard for Floating-Point Arithmetic, provided that the appropriate context parameters are set...

Claudio L. · 06-04-2014, 06:37 PM

(06-04-2014 06:30 PM)pito Wrote: WP34s calc and various other arbitrary precision decimal fp libs/wrappers are using the IBM's decNumber library (which sets the standards). There are some settings for the rounding inside (context something), I never messed with them in detail, though. Maybe the difference between the results is related to the proper rounding settings know how within the library..

Yes, but decNumber doesn't have sin, cos, tan, so everybody has their own implementations. We should all have correctly rounded results, so in theory all programs using a fixed precision should give the exact same number (not 9.0).
The ones that use adaptive precision or use higher internal precision and the rounding is for display only (not for intermediate results) would be able to show 9.0.

What we can do, is use Wolfram Alpha to do all the steps, but typing the 34 digit rounded result from the previous answer to get the final result with rounding. Same as I did in the stack.

Claudio

pito · (This post was last modified: 06-04-2014 06:46 PM by pito.)

Caution: I've seen the situation where when 9 degree test had been compiled under C (crappy double precision) it gave 9.0. The C had optimized the math off, as it is symmetrical. Hopefully wolfram's alfa does not do the same when the 9 degree formula is entered in Smile

pito · (This post was last modified: 06-04-2014 07:19 PM by pito.)

This is the entire sequence from wolfram's alfa (120digits):

Code:

157079632679489661923132169163975144209858469968755291048747229615390820314310​44993140174126710585339910740432566411533

156434465040230869010105319467166892313899892085660790084641346057758793305623​57933669587267684868837514030084768044693

987789061484321026592245971856886355356443043866049621785751391173229221558883​85891561054646005957655720326772574833859

516357552591207213695274523631412761647072139174478472148435993625758334427960​23036122220616360602112239375025366785886

987789061484321026592245971856886355356443043866049621785751391173229221558883​85891561054646005957655720326772574833858

156434465040230869010105319467166892313899892085660790084641346057758793305623​57933669587267684868837514030084768044700

157079632679489661923132169163975144209858469968755291048747229615390820314310​44993140174126710585339910740432566411539

000000000000000000000000000000000000000000000000000000000000000000000000000000​00000000000000000000000000000000000000330

pito · (This post was last modified: 06-04-2014 07:42 PM by pito.)

This is 34digits not rounded

Code:

157079632679489661923132169163975

156434465040230869010105319467166

987789061484321026592245971856886

516357552591207213695274523631411

987789061484321026592245971856885

156434465040230869010105319467175

157079632679489661923132169163983

000000000000000000000000000000450

and rounded

Code:

157079632679489661923132169163975

156434465040230869010105319467167

987789061484321026592245971856886

516357552591207213695274523631412

987789061484321026592245971856886

156434465040230869010105319467169

157079632679489661923132169163977

000000000000000000000000000000106

Manual rounding at 34th digit. Not sure the proper one.

Claudio L. · (This post was last modified: 06-04-2014 07:37 PM by Claudio L..)

(06-04-2014 07:16 PM)pito Wrote: This is the entire sequence from wolfram's alfa (120digits):

Code:

0.15707963267948966192313216916397514420985846996875529104874722961539082031431044993140174126710585339910740432566411533 0.15643446504023086901010531946716689231389989208566079008464134605775879330562357933669587267684868837514030084768044693 0.98778906148432102659224597185688635535644304386604962178575139117322922155888385891561054646005957655720326772574833859 1.51635755259120721369527452363141276164707213917447847214843599362575833442796023036122220616360602112239375025366785886 0.98778906148432102659224597185688635535644304386604962178575139117322922155888385891561054646005957655720326772574833858 0.15643446504023086901010531946716689231389989208566079008464134605775879330562357933669587267684868837514030084768044700 0.15707963267948966192313216916397514420985846996875529104874722961539082031431044993140174126710585339910740432566411539 9.00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000330

You just beat me to it, but here's my sequence for 34 digits:

Code:

round(asin(round(acos(round(atan(round(tan(round(cos(round(sin(round(round(9*rou​nd(pi*10^33))/18)*10^-34)*10^34)*10^-34)*10^34)*10^-34)*10^34)*10^-34)*10^34)*10^-34)*10^34)*10^-34)*10^34)*10^-34

It's incomplete because the free version won't allow me to make it any longer, but all it's missing is the / pi * 180:

Typing in the previous result:

Code:

round(round((0.1570796326794896619231321691639748/round(pi*10^33)*10^33)*10^35)*10^-35 * 180 * 10^33)*10^-33

Gives the correct answer for 34 digits:

8.999999999999999999999999999999980

This is exactly what I get with newRPL (previously I reported 79 with the /pi*180 reversed, which is also correct).

This can be used for any digits, changing the 34/33/35 accordingly for prec/(prec-1)/(prec+1).

Claudio

Claudio L. · 06-04-2014, 07:41 PM

(06-04-2014 07:30 PM)pito Wrote: This is 34digits not rounded
[code]0.157079632679489661923132169163975
0.156434465040230869010105319467166
0.987789061484321026592245971856886
1.516357552591207213695274523631411

You got a problem here. Either the number above has one extra digit or all the others are missing one.

Claudio

pito · (This post was last modified: 06-04-2014 08:22 PM by pito.)

Yea, it seems it is an extra, but the 1 at the end is not a big diff, maybe Smile

Do you mean (recalculated, rounded):

Code:

157079632679489661923132169163975

156434465040230869010105319467167

987789061484321026592245971856886

51635755259120721369527452363141

987789061484321026592245971856886

156434465040230869010105319467169

157079632679489661923132169163977

00000000000000000000000000000011

pito · (This post was last modified: 06-04-2014 08:21 PM by pito.)

When I put 9-1e-35 I get

Code:

8.99999999999999999999999999999999999

9-1e-55

Code:

8.9999999999999999999999999999999999999999999999999999999

It does not round to 9.0, so it seems wolfram likes the 9degree results above 9.0 (when you put the 9degree formula in)..

pito · (This post was last modified: 06-04-2014 08:48 PM by pito.)

I did following experiment when entering the 9degree formula in the wolfram - I added a small epsilon to the 9:

Code:

180/pi*(asin(acos(atan(tan(cos(sin((9.0+epsilon)*pi/180)))))))

epsilon = -1e-36

8.999999999999999999999999999999999999

epsilon = -1e-35

8.99999999999999999999999999999999999

epsilon = -1e-34

8.9999999999999999999999999999999999

epsilon = 0

9.0

epsilon = 1e-34

9.0000000000000000000000000000000001

epsilon = 1e-35

9.00000000000000000000000000000000001

epsilon = 1e-36

9.000000000000000000000000000000000001

Based on this result I think wolfram optimizes out the formula entered.

Claudio L. · 06-04-2014, 09:05 PM

(06-04-2014 07:51 PM)pito Wrote: Yea, it seems it is an extra, but the 1 at the end is not a big diff, maybe
Do you mean (recalculated, rounded):

Code:

0.157079632679489661923132169163975 0.156434465040230869010105319467167 0.987789061484321026592245971856886 1.51635755259120721369527452363141 0.987789061484321026592245971856886 0.156434465040230869010105319467169 0.157079632679489661923132169163977 9.00000000000000000000000000000011

This one seems to be correct for 33 significant digits. WP34S uses 34, though.

Claudio

Claudio L. · 06-04-2014, 09:21 PM

(06-04-2014 08:34 PM)pito Wrote: I did following experiment when entering the 9degree formula in the wolfram - I added a small epsilon to the 9:

Code:

180/pi*(asin(acos(atan(tan(cos(sin((9.0+epsilon)*pi/180))))))) epsilon = -1e-36 8.999999999999999999999999999999999999 epsilon = -1e-35 8.99999999999999999999999999999999999 epsilon = -1e-34 8.9999999999999999999999999999999999 epsilon = 0 9.0 epsilon = 1e-34 9.0000000000000000000000000000000001 epsilon = 1e-35 9.00000000000000000000000000000000001 epsilon = 1e-36 9.000000000000000000000000000000000001
Based on this result I think wolfram optimizes out the formula entered.

Wolfram has a very complex engine, either it symbolically cancels out the functions, or (more likely) uses adaptive precision. Basically, they internally keep recalculating the expression, adding as many precision digits as needed until the last digits they want to display don't change anymore.
If they want to display a result with 34 digits, they start maybe with a 34-digit calculation, then they redo the math with let's say 39 digits, and round only the final result to 34 digits. If it is different, they add 5 more digits of precision and redo the calculation with 44, until the rounded final result doesn't change at all.
So we see 34 digits, but all intermediate rounding was done at 39, 44 or higher, as needed to get a consistent final result.

And in general, we should do this on the calculator as well: if WP34S has 34 digits for calculations, set your display to 30 digits or less, and in your mind, think that you are using only 30 digits.
Then your 9 degree trig test will return exactly 9.0 with 30 digits, and the positive psychological effect of that "perfection" will put a smile on your face.

Claudio

pito · (This post was last modified: 06-04-2014 09:29 PM by pito.)

Wolfram alfa, 34digits (hopefully), rounded:

Code:

1570796326794896619231321691639751

1564344650402308690101053194671668

9877890614843210265922459718568864

516357552591207213695274523631413

9877890614843210265922459718568864

1564344650402308690101053194671666

1570796326794896619231321691639748

999999999999999999999999999999980

pito · (This post was last modified: 06-04-2014 09:34 PM by pito.)

For users who are not happy with the trigo precision of their lovely calculators:

http://www.rskey.org/~mwsebastian/miscprj/results.htm

Claudio L. · (This post was last modified: 06-04-2014 09:40 PM by Claudio L..)

(06-04-2014 09:21 PM)pito Wrote: Wolfram alfa, 34digits (hopefully), rounded:

Code:

0.1570796326794896619231321691639751 0.1564344650402308690101053194671668 0.9877890614843210265922459718568864 1.516357552591207213695274523631413 0.9877890614843210265922459718568864 0.1564344650402308690101053194671666 0.1570796326794896619231321691639748 8.999999999999999999999999999999980

Yes! We finally have the same answer, and since 2 matching answers is enough to define a mathematical truth, we can say that this is the correct number.

Now I wonder what happened to your results:

Quote:TRIG9
TRIG9= 9.000 000 000 000 000 000 000 000 000 000 227
yours: 8.999 999 999 999 999 999 999 999 999 937 535
Elapsed x1 =201 ms

In particular the second one seems to be off by a lot (5 digits).
Try to get partial results for each operation and compare with your latest list of partials from Wolfram to see what's going on.

Claudio