[WP34S] Improving the Incomplete Gammas Message #1 Posted by Les Wright on 14 May 2012, 6:13 a.m.
I welcomed the nonregularized incomplete gammas, and was eager to see how they would compare with my own user code, ported from my HP41 routine, itself ported from the very Numerical Recipes routines that inspired your code.
I must admit that I have found the performance of the upper nonregularized incomplete gamma underwhelming, and I have been boggled as to why, given that the whole thing is computed in native code to 39 digits before getting returned to the stack.
Take the upper nonregularized gamma for the parameter pair (5,7). My own routine gives ALL 34 digits correct. The result is roughly 4.15, and the last four digits are 5464.
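For integer a, the upper incomplete gamma has the closed form Γ(a,x) = (a-1)! e^(-x) Σ_{k=0}^{a-1} x^k/k!, which gives an independent check on values like the one above. A quick throwaway sketch in Python's decimal module (my own code, not anything from the WP34S source):

```python
from decimal import Decimal, getcontext
from math import factorial

def upper_gamma_int(a, x, digits=40):
    # Gamma(a, x) = (a-1)! * exp(-x) * sum_{k=0}^{a-1} x^k / k!
    # Valid only for integer a >= 1; evaluated to `digits` decimal digits.
    getcontext().prec = digits
    xd = Decimal(x)
    s = Decimal(0)
    term = Decimal(1)          # running value of x^k / k!
    for k in range(a):
        s += term
        term = term * xd / (k + 1)
    return Decimal(factorial(a - 1)) * (-xd).exp() * s

print(upper_gamma_int(5, 7))   # roughly 4.1518...
```

This is only a sanity check on the leading digits, of course; verifying the full 34-digit tail takes a reference computation at even higher precision.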
However, the newly minted GAMMAxy gives 5493 in the last four digits, off by a whopping 27 ULP. I say "whopping" since all of this function is computed natively with 39 digits, none of it in XROM with less working precision.
At first I thought it had something to do with taking logarithms and then re-exponentiating at the end of the routine. However, when I replicated this in my user code I lost a few ULP, but was still a lot closer than the built-in routine.
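The log-and-re-exponentiate effect is easy to demonstrate in ordinary double precision (an illustration only, not the WP34S code path): exponentiating a log-domain result turns absolute error in the log into relative error in the answer, typically costing a few ULP.

```python
import math

a = 30
direct = math.gamma(a)              # 29! computed directly
via_log = math.exp(math.lgamma(a))  # same value through log-gamma
rel_err = abs(via_log - direct) / direct
# rel_err is generally nonzero: the rounding error in lgamma(a),
# though tiny in absolute terms, is magnified by exp().
print(rel_err)
```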
Then I looked at your implementation of the Lentz algorithm, and I wondered about your choice for the "tiny" parameter. You choose 1e-32, which is small enough for machine double precision of roughly 16 digits, but may be somewhat too large when working with 39-digit decNumbers. In my own analogue of gcf(), I use 1e-256, computed on the fly as 8 2^x 10^x 1/x. Even on the HP41, I somewhat arbitrarily use 1e-25, given that we have only 10 digits to deal with.
Running with this hypothesis, I changed my code to use 1e-32, and guess what? This time I get a result with 5484 in the last four digits, off by 20 ULP.
In the modified Lentz algorithm, the purpose of the "tiny" parameter is to permit the routine to proceed should a division-by-zero error threaten to derail things. I think choosing its size is a matter of art, not science, but I understand it should be small enough that it simply rescues one from a division-by-zero error but does not, in the end, throw off the computation of the continued fraction.
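For reference, here is the Numerical Recipes-style gcf() continued fraction for the nonregularized upper gamma, with the modified Lentz rescue, sketched in double-precision Python (the names and the default tiny value are my own choices, not the WP34S source):

```python
import math

def upper_gamma_cf(a, x, tiny=1e-300, eps=1e-15, itmax=200):
    """Nonregularized Gamma(a, x) via its continued fraction (good for
    x > a + 1), evaluated with the modified Lentz algorithm.  `tiny`
    only steps in when a denominator lands near zero; it should be far
    below any value the fraction legitimately produces."""
    b = x + 1.0 - a
    c = 1.0 / tiny
    d = 1.0 / b
    h = d
    for i in range(1, itmax + 1):
        an = -i * (i - a)
        b += 2.0
        d = an * d + b
        if abs(d) < tiny:      # rescue a vanishing denominator
            d = tiny
        c = b + an / c
        if abs(c) < tiny:
            c = tiny
        d = 1.0 / d
        delta = d * c
        h *= delta
        if abs(delta - 1.0) < eps:
            break
    return math.exp(-x + a * math.log(x)) * h

print(upper_gamma_cf(5.0, 7.0))   # roughly 4.1518...
```

In double precision the rescue almost never fires, which is exactly the point: the tiny value exists to catch a degenerate step, not to participate in the answer, so erring on the small side costs nothing.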
I would be interested to see how gcf() performs if you substitute, say, 1e-400, a number already among the available constants, as the tiny parameter. But if you do decide to keep the existing 1e-32, you need to correct line 2122 of decn.c to read "decNumberCopy(&d, &const_1e_32);". The underscore is missing in the original. However, based on the experiment with my own user code, I opt for moving to a much tinier tiny parameter (and, correspondingly, a much bigger "big" parameter where it is used at the start of the gcf routine).
I hope this is a worthwhile petition. If I am right, we will see improved accuracy for the cumulative normal, erf, chi-square, and Poisson functions.
Les
