Threaded Mode | Linear Mode

Tugdual · 05-31-2014, 08:00 PM

Hi Thomas, thanks for your posting.
This was a huge discovery for me a few years ago thanks to the internet I finally understood the secrets of calculators. Before that I thought there was some tricky use of limited developments...
This little algorithm is truly amazing, apparently the Prime still uses it today. Wonder if math coprocessors also rely on it.

Could be a good idea to extend the article to all the possibilities of CORDIC?

brouhaha · 06-01-2014, 02:27 AM

(05-31-2014 08:00 PM)Tugdual Wrote: Wonder if math coprocessors also rely on [CORDIC]

Some of the early ones (Intel 8087, Motorola MC68881) did. It's my understanding that more recent systems, both hardware and software, tend to use other algorithms to get faster convergence. See _Elementary Functions: Algorithms and Implementation, 2nd Ed._ by Jean-Michel Muller.

Thomas Klemm · 06-01-2014, 07:32 AM

(05-31-2014 08:00 PM)Tugdual Wrote: This little algorithm is truly amazing, apparently the Prime still uses it today.

The WP-34S however uses:

Code:

    // Now Taylor series

    // tan(x) = x(1-x^2/3+x^4/5-x^6/7...)

Quote:Could be a good idea to extend the article to all the possibilities of CORDIC?

I left that as an exercise for the interested reader. Smile

Cheers
Thomas

Tugdual · 06-01-2014, 11:57 AM

(06-01-2014 07:32 AM)Thomas Klemm Wrote: The WP-34S however uses:

Code:

// Now Taylor series // tan(x) = x(1-x^2/3+x^4/5-x^6/7...)

Interesting, I wonder what justified the choice.
A matter of patent?
What is the most efficient approach in terms of speed and accuracy between CORDIC and Taylor?
Or may be CORDIC was the right choice for old calculators with little ROM?

Paul Dale · (This post was last modified: 06-01-2014 12:14 PM by Paul Dale.)

(06-01-2014 11:57 AM)Tugdual Wrote: Interesting, I wonder what justified the choice.
A matter of patent?

I honestly don't remember why I chose this approach over others. My guess would be a matter of expediency, I had a lot to implement quickly. A taylor expansion is easy to write and generally quite small. They aren't always the most numerically stable methods. There aren't patent issues for these series that I'm aware of -- too well known to an expert in the field.

Quote:What is the most efficient approach in terms of speed and accuracy between CORDIC and Taylor?

Taylor series generally can be made as accurate as you want by adding additional terms -- this being important early on when the working precision wasn't set in stone. CORDIC requires a bit more effort to increase its accuracy, more iterations and more constants. Speed wise, I doubt there is much difference a lot of the time -- at the least, I don't have a good feel for the differences. However, remember the 34S CPU can do integer multiplies which helps a lot. CORDIC assumes only additions and shifts are possible and wouldn't directly benefit from multiplication.

If I was going to go back and revisit these, I'd be looking harder at specially designed rational approximations that gave results roughly at the desired working precision.

Quote:Or may be CORDIC was the right choice for old calculators with little ROM?

CORDIC has the big advantage that is can produce trigonometric, inverse trigonometric, logarithmic, exponential, hyperbolic and interse hyperbolic functions using essentially the same algorithm. I.e. very code space efficient. This was vital for the older devices with tiny amounts of ROM.

It really is difficult to foresee any other algorithms fitting into so few resources yet still providing such functionality.

Pauli

Thomas Klemm · (This post was last modified: 07-31-2022 10:31 AM by Thomas Klemm.)

(06-01-2014 12:12 PM)Paul Dale Wrote: CORDIC has the big advantage that is can produce trigonometric, inverse trigonometric, logarithmic, exponential, hyperbolic and inverse hyperbolic functions using essentially the same algorithm.

In addition to that it can also produce the square root:

Quote:\((w)^{1/2}=(x^2-y^2)^{1/2}\) where \(x=w+\frac{1}{4}\) and \(y=w-\frac{1}{4}\)

It would never have occurred to my mind that this could be used. Solely for this idea it was worth reading the paper [1].

Cheers
Thomas

[1] A Unified Algorithm for Elementary Functions

Tugdual · 06-01-2014, 07:48 PM

I did a few checks here and there on the net. Cordic is indeed the right choice for its small foot print in memory and also very little CPU cost - some say it costs as much as a multiplication while other methods would require a lot of multiplications...

pito · (This post was last modified: 06-01-2014 10:15 PM by pito.)

Quote:What is the most efficient approach in terms of speed and accuracy between CORDIC and Taylor?
Or may be CORDIC was the right choice for old calculators with little ROM?

The cordic was the choice with early calculators which had 1-4kBytes of rom (and fully different architecture as it is today, ie. my HP-25 with 2kx10bit of rom - see the disassembled rom at http://www.jacques-laporte.org/Woodstock...s/hp25.txt). I did some benchmarks in past on cortex M3 (32bit cordic vs. single precision) where I set the cordic's N-iterations such I get similar results as with single precision and the timing difference was not dramatic. The Taylor does not need too many terms to get required precision.

You may try the 32bit cordic (ready code to run):
http://www.dcs.gla.ac.uk/~jhw/cordic/

Today the cordic algorithms are mostly used in FPGA or ASIC designs - there are sources in VHDL/Verilog available - it is the implementation in hardware which is _very_ fast.

Also mind the WP34s is using decNumber library with arbitrary lenght decimal floating point (set to maybe 34 digits or more for trigo), which is not the same as the IEEE double precision (or single precision).

On the ARM a single precision (32bit, 6-7digits precision) calculation is typically twice faster than the double precision (64bit, 14-15digits), and a 34digits decimal floating point is maybe 10-15x slower than the double precision..

Thomas Klemm · 06-02-2014, 06:39 PM

(05-31-2014 08:00 PM)Tugdual Wrote: Could be a good idea to extend the article to all the possibilities of CORDIC?

Due to your request I have attached a Python program with examples from the paper to the article.

Cheers
Thomas

Marcus von Cube · 06-02-2014, 08:47 PM

(06-01-2014 08:28 PM)pito Wrote: Also mind the WP34s is using decNumber library with arbitrary lenght decimal floating point (set to maybe 34 digits or more for trigo), which is not the same as the IEEE double precision (or single precision).

We're using 39 digits as the standard internal precision, and, as you correctly point out, many more digits for some modulo operations used in trig functions. XROM code (internal key stroke programs) are limited to 34 digits.

pito · (This post was last modified: 06-02-2014 10:14 PM by pito.)

Quote:We're using 39 digits as the standard internal precision..

Therefore I've ordered the 30b today - I've always been interested to have a pocket calculator which returns 9.0 with the 9 degree trigo test Smile

Paul Dale · (This post was last modified: 06-02-2014 10:26 PM by Paul Dale.)

(06-02-2014 10:13 PM)pito Wrote: Therefore I've ordered the 30b today - I've always been interested to have a pocket calculator which returns 9.0 with the 9 degree trigo test

The 34S returns 9.000000000029361 in single precision mode which is the proper correctly rounded result. It returns 8.999999999999999999999999999937535 in double precision, I don't know if this is correctly rounded or not.

- Pauli

Tugdual · 06-02-2014, 10:53 PM

As seen on Wikipedia

Quote:For most numerical calculations involving π, a handful of digits provide sufficient precision. According to Jörg Arndt and Christoph Haenel, thirty-nine digits are sufficient to perform most cosmological calculations, because that is the accuracy necessary to calculate the volume of the known universe with a precision of one atom.

We're still missing 5 digits to rule the universe with the 34s :-(

pito · (This post was last modified: 06-02-2014 11:09 PM by pito.)

You have to be able to calculate the volume of the visible Universe with a precision of the Planck lenght, I would say.. Or better..

pito · 06-02-2014, 11:05 PM

Quote:It returns 8.999 999 999 999 999 999 999 999 999 937 535 in double precision, I don't know if this is correctly rounded or not.

With 34-39digits (or more with trigo) precision set in decNumber library you have to get better result with 9deg test.

Paul Dale · 06-02-2014, 11:08 PM

(06-02-2014 11:05 PM)pito Wrote: With 34-39digits (or more with trigo) precision set in decNumber library you have to get better result with 9deg test.

Why?

Rounding in this case is to 34 digits at each step.

Feel free to recode the 34S trigonometric functions if you think you can do better. We accept code donations Smile

- Pauli

Tugdual · 06-02-2014, 11:13 PM

(06-02-2014 11:08 PM)Paul Dale Wrote:
(06-02-2014 11:05 PM)pito Wrote: With 34-39digits (or more with trigo) precision set in decNumber library you have to get better result with 9deg test.

Why?

Rounding in this case is to 34 digits at each step.

Feel free to recode the 34S trigonometric functions if you think you can do better. We accept code donations

- Pauli

So you accept code but not PDF donation?
OOps sorry, I stop before Katie bans me :-)

pito · 06-02-2014, 11:20 PM

Maybe it depends on how you calculate the trig functions - do you iterate Taylor against an epsilon smaller than set precision? I did it that way with decNumber (long time back though) and afaik the results were precise to the last digit.

Paul Dale · 06-02-2014, 11:42 PM

(06-02-2014 11:20 PM)pito Wrote: Maybe it depends on how you calculate the trig functions - do you iterate Taylor against an epsilon smaller than set precision? I did it that way with decNumber (long time back though) and afaik the results were precise to the last digit.

Yes, this is what is used for SIN, COS. I will change this in the 43S, naïve Taylor series expansions aren't ideal for fixed precision results.

It won't necessarily be precise to the last digit however. Consider COS(PI/2-ULP). The Taylor series for COS(x) starts 1 - x^2/2 + ... The import thing is the leading 1. Since COS(PI/2-ULP) is very very close to zero, the Taylor expansion is subtracting almost 1 from 1 which causes quite a number of lost digits. This can be mitigated of course but you have to realise it is happening.

- Pauli

pito · (This post was last modified: 06-03-2014 02:03 PM by pito.)

I've flashed my CM3 with the old stuff I've found - when running below code with 34digits I get (I've added your result for comparison):

Code:

TRIG9

TRIG9= 9.000 000 000 000 000 000 000 000 000 000 227

yours: 8.999 999 999 999 999 999 999 999 999 937 535

Elapsed x1 =201 ms

Code:

#define _PI  "3.1415926535897932384626433832795028841971693993751"

//  TRIG9 PRECISION TEST

cout<<"TRIG9"<<endl;

decfp tmp, PI;

PI = _PI;

tmp = "9.0";  // 9 DEGREE

// deg to rad

timer = millis;

tmp = tmp *  PI / "180.0";

tmp = arcsin ( arccos ( arctan ( tan ( cos ( sin ( tmp ) ) ) ) ) ) ;

// rad to deg

tmp = tmp * "180.0" / PI;

timer = millis - timer;

cout<<"TRIG9= "<<tmp<<endl;

cout<<"Elapsed x1 ="<<timer<<" ms"<<endl;

But I've learned you do rounding after each calculation so maybe that is what makes the diff..