Faster inverse gamma and factorial for the WP 34S

12042014, 06:14 AM
(This post was last modified: 02062015 05:30 AM by Bit.)
Post: #1




Faster inverse gamma and factorial for the WP 34S
I noticed that the inverse gamma library routine included with the 34S (discussed here) was quite slow for small inputs. I've added a very simple second algorithm that uses the secant method and converges faster for small inputs: < ~2.3 in single precision mode and < ~1e17 in double precision mode. For example, 1.6 vs 5.3 seconds in single precision mode or 2.1 vs 13.5 seconds in double precision mode with 0.89 as the input. While it's a noticeable improvement, it's probably far from optimal since I'm not an expert on numerical algorithms, so I'd be interested in better versions.
Code: LBL'![^1]' // Inverse factorial 

02032015, 05:05 PM
(This post was last modified: 02032015 05:07 PM by BarryMead.)
Post: #2




RE: Faster inverse gamma and factorial for the WP 34S
(12042014 06:14 AM)Bit Wrote: While it's a noticeable improvement, it's probably far from optimal since I'm not an expert on numerical algorithms, so I'd be interested in better versions.Bit: When I was helping Torsten Manz improve his HP15C simulator, I did a lot of research on good numerical algorithms, and one of the better examples of good coding and well selected numerical algorithms was the "Python" math library. The guys who wrote the Python math library used some really good algorithms. I also found that the Wolfram Alpha web site is a nice reference to check the accuracy of your algorithms. I don't know that your particular functions "Inverse Gamma" & "Inverse Factorial" are included in the Python math library, but if they are, they will probably be fast, accurate and interesting to look at as examples. Hope this helps, Barry 

02062015, 01:39 AM
Post: #3




RE: Faster inverse gamma and factorial for the WP 34S
(02032015 05:05 PM)BarryMead Wrote:Thank you for the advice. Unfortunately the inverse functions don't seem to be included in this library, but if you were referring to another math library, please let me know.(12042014 06:14 AM)Bit Wrote: While it's a noticeable improvement, it's probably far from optimal since I'm not an expert on numerical algorithms, so I'd be interested in better versions.Bit: When I was helping Torsten Manz improve his HP15C simulator, I did a lot of research on good numerical algorithms, and one of the better examples of good coding and well selected numerical algorithms was the "Python" math library. The guys who wrote the Python math library used some really good algorithms. I also found that the Wolfram Alpha web site is a nice reference to check the accuracy of your algorithms. I don't know that your particular functions "Inverse Gamma" & "Inverse Factorial" are included in the Python math library, but if they are, they will probably be fast, accurate and interesting to look at as examples. Hope this helps, Barry 

02062015, 03:33 AM
(This post was last modified: 02062015 03:40 AM by BarryMead.)
Post: #4




RE: Faster inverse gamma and factorial for the WP 34S
(02062015 01:39 AM)Bit Wrote: If you were referring to another math library, please let me know.Bit I did a google search for "python math libraries inverse gamma" and found this code snippet here. Without knowledge of your intended goal, and qualitative evaluation of the function in operation, I wouldn't be able to tell you if it is of any value or not. Code:


02062015, 03:57 AM
Post: #5




RE: Faster inverse gamma and factorial for the WP 34S
(02062015 03:33 AM)BarryMead Wrote:(02062015 01:39 AM)Bit Wrote: If you were referring to another math library, please let me know.Bit I did a google search for "python math libraries inverse gamma" and found this code snippet here. That piece of code implements the same initial approximation that's employed in the 34S algorithm but doesn't go further. Here's some explanation and some more. That approximation seems quite good (but I'm not an expert). What I've been trying to do is make the subsequent iterations converge faster. 

02062015, 04:16 AM
Post: #6




RE: Faster inverse gamma and factorial for the WP 34S
(02062015 03:57 AM)Bit Wrote: That approximation seems quite good (but I'm not an expert). What I've been trying to do is make the subsequent iterations converge faster.I haven't tried your program or confirmed it's accuracy with wolfram alpha, but it seems like you have already improved upon the widely accepted and well established algorithms in math geek circles. Nice Job! 

02062015, 04:38 AM
(This post was last modified: 02062015 04:40 AM by BarryMead.)
Post: #7




RE: Faster inverse gamma and factorial for the WP 34S
Bit: I have a question related to testing out your inverse gamma function. When I used cut and paste to
grab your code into a text file, and tried to compile it with the wp34s_asm.pl program I couldn't get it to make a "Ram" image to test the program out in ram. Do you know of a way to get the assembler to generate a ram image that can be copied to the wp34s.dat file for testing of a program? If I try to generate a "ROM" image then I have to manipulate whole libraries and run into library management complexities. I know I could manually key the program into RAM, but that is prone to human typo errors. I have never gotten the assembler to do that simple task have you? 

02062015, 05:28 AM
Post: #8




RE: Faster inverse gamma and factorial for the WP 34S
(02062015 04:38 AM)BarryMead Wrote: Bit: I have a question related to testing out your inverse gamma function. When I used cut and paste to What you need is the wp34s_lib program, a wp34s.dat memory file and the code saved in a text file (I've just made a few minor changes so copypaste it again). The following command will add the contents of 'sourcefile' to your memory file: tools/wp34s_lib.pl sourcefile ilib path_to_wp34s.dat f state 

02062015, 06:09 AM
(This post was last modified: 02062015 06:09 AM by BarryMead.)
Post: #9




RE: Faster inverse gamma and factorial for the WP 34S
(02062015 05:28 AM)Bit Wrote: tools/wp34s_lib.pl sourcefile ilib path_to_wp34s.dat f stateThanks, Bit. I had gotten part of this to work in the past, but for some reason if your original program from RAM that you disassemble contains any "HOTKEY" labels A .. D then the assembler chokes on the program and you get nothing when you try to reassemble the program. I think this is a design error in the assembler personally. I think people should be able to keystroke any program they want into ram and be able to disassemble it into a text file and reassemble it into a ram image without error. 

02062015, 07:39 AM
(This post was last modified: 02062015 07:51 AM by BarryMead.)
Post: #10




RE: Faster inverse gamma and factorial for the WP 34S
Inverse Gamma Accuracy Issue.
Bit: When I enter the following into wolfram alpha: 1.5=gamma(x) solve for x, I get an answer: 2.6627663453201472954412987 When I enter 1.5 and use the compiled inverse gamma function on the wp34sgui_Bit.exe eumulator, in double precision mode I get 2.705794548255370. If I use the same compiled wp34s.dat image on the QT emulator, or the wp34sgui.exe (non _Bit) emulator, I get the same answer as wolfram alpha. Also if I use the _Bit emulator in single precision mode I seem to get the same answer (less digits) as wolfram alpha. Why is that? 

02062015, 07:58 AM
Post: #11




RE: Faster inverse gamma and factorial for the WP 34S
(02062015 07:39 AM)BarryMead Wrote: Inverse Gamma Accuracy Issue. My builds support a few extra instructions and because of that the instruction codes are different and wp34s.dat files that contain programs aren't compatible with the official versions. You'd need another wp34s.op file to be able to compile the source correctly for my builds. There's no secret sauce: You can apply my patches to a source tree, recompile everything and generate a wp34s.op file. But I'll include it in my next release to make it easy. 

02062015, 08:09 AM
(This post was last modified: 02062015 02:36 PM by BarryMead.)
Post: #12




RE: Faster inverse gamma and factorial for the WP 34S
(02062015 07:58 AM)Bit Wrote: You'd need another wp34s.op file to be able to compile the source correctly for my builds.Thanks again Bit. I was puzzled by the discrepancy. I look forward to your next build with the custom opcode file. In the mean time, I stepped through the compiled program to find where it differed from the source listing and corrected the offending opcode. It was the "DBL?". Which got miscompiled into "DBLOFF". When I fixed that one bad instruction, the program works to full accuracy. 

02062015, 02:45 PM
Post: #13




RE: Faster inverse gamma and factorial for the WP 34S
(02062015 07:58 AM)Bit Wrote: You'd need another wp34s.op file to be able to compile the source correctly for my builds. Bit: Does this mean that the wp34slib.dat file is special for your compiles as well. If so could you please include that file in the next build as well as the custom wp34s.op opcode file? With both of these files included we can run the wp34sgui_Bit.exe program and it will behave exactly like the calc_xtal_full.bin realbuild. And we would be able to add/remove library routines to/from the standard Bit build. Eventually I would LOVE to duplicate your full toolchain setup on my Linux system. I know that is a big undertaking, but with it I could contribute more to the community. So far I have not been able to get the yagartobu2.20.1_gcc4.6.0cc++_nl1.19.0_gdb7.2_eabi_20110328.exe which installs to do anything under wine on my linux system once installed. I am not sure what have done wrong. Thanks ever so much for all of your valuable time and patience. Barry 

02072015, 04:57 PM
Post: #14




RE: Faster inverse gamma and factorial for the WP 34S
(02062015 02:45 PM)BarryMead Wrote:Yes and yes: I'll include those two files in my next builds. I didn't in the past because I didn't think the emulator would be used for anything but quickly testing the new features.(02062015 07:58 AM)Bit Wrote: You'd need another wp34s.op file to be able to compile the source correctly for my builds. (02062015 02:45 PM)BarryMead Wrote: Eventually I would LOVE to duplicate your full toolchain setup on my Linux system. I know that is a big undertaking, but with it I could contribute more to the community.It's not very difficult actually, but this is not the right thread to discuss it. I've sent you a message. 

02072015, 07:34 PM
(This post was last modified: 02072015 07:40 PM by Dieter.)
Post: #15




RE: Faster inverse gamma and factorial for the WP 34S
(12042014 06:14 AM)Bit Wrote: I noticed that the inverse gamma library routine included with the 34S (discussed here) was quite slow for small inputs. I've added a very simple second algorithm that uses the secant method and converges faster for small inputs: < ~2.3 in single precision mode and < ~1e17 in double precision mode. For example, 1.6 vs 5.3 seconds in single precision mode or 2.1 vs 13.5 seconds in double precision mode with 0.89 as the input. While it's a noticeable improvement, it's probably far from optimal since I'm not an expert on numerical algorithms, so I'd be interested in better versions. The standard lib that comes with the 34s firmware contains a digamma routine  so why not use the obvious and setup a simple Newton solver for Γ^{1}(a): x := x + (a/Γ(x) – 1) / Ψ(x) The idea of using Ψ was already mentioned in the 2013 thread, but the solution posted there replaced Ψ with an approximation. Using "the real thing" seems to work better, at least for small arguments. Since the digamma routine does not preserve the stack (at least the current version doesn't) two registers are used. R_{0} stores the input a, R_{1} holds the current approximation. The code is simple and straightforward: Code: LBL"IGM" Timings: An input of a = 0,89 returns the result in 1,9 (SP) resp. 2,7 (DP) seconds. The above code is a first quick and dirty version, so take care. But it seems to run almost as fast as your version while it's much more compact and needs no separate handling of small or large arguments. Dieter 

02082015, 01:51 AM
Post: #16




RE: Faster inverse gamma and factorial for the WP 34S
(02072015 07:34 PM)Dieter Wrote: The standard lib that comes with the 34s firmware contains a digamma routine  so why not use the obvious and setup a simple Newton solver for Γ^{1}(a): That's a good idea, thank you. When I have the time, I'll see if this method is any faster if used with the builtin digamma routine (which is a compile time option, not available by default). 

02082015, 02:07 AM
(This post was last modified: 02082015 02:11 AM by BarryMead.)
Post: #17




RE: Faster inverse gamma and factorial for the WP 34S
(02082015 01:51 AM)Bit Wrote: That's a good idea, thank you. When I have the time, I'll see if this method is any faster if used with the builtin digamma routine (which is a compile time option, not available by default).If you combine the source code for the Digamma function with Dieter's Inverse Gamma function, the combined program is longer than Bit's original Inverse Gamma function (in RAM memory steps). It might be worth the extra memory space if it is significantly faster. But since it is not part of the standard flash library, it is not really fair to say it is a shorter program. 

02082015, 07:02 AM
Post: #18




RE: Faster inverse gamma and factorial for the WP 34S
If the digamma route is interesting and custom images are acceptable, I've implemented this function both in xrom (trunk/xrom/digamma.wp34s) and native C (trunk/unused/digamma.c).
 Pauli 

02082015, 03:34 PM
(This post was last modified: 02082015 05:26 PM by Dieter.)
Post: #19




RE: Faster inverse gamma and factorial for the WP 34S
(02082015 02:07 AM)BarryMead Wrote: If you combine the source code for the Digamma function with Dieter's Inverse Gamma function, the combined program is longer than Bit's original Inverse Gamma function (in RAM memory steps). Yes, that's true of course. But you can also include the relevant parts of the current digamma function into the inverse gamma routine. As far as I can tell, only positive arguments occur, and also some overhead in digamma can be removed. The final result may look like this: Code: LBL"IGM" This version has 69 lines and it runs even faster. Γ^{1}(0,89) now is returned in 1,6 (SP) resp. 2,3 (DP) seconds. The code also addresses a problem for arguments that are too close to the Gamma minimum at 0,88560 31944 10888 70027 88159 00582 58873 32... Here the current inverse Gamma implementation throws an error (Lambert's W is undefined for x<–1/e). This is handled by the two RSDs. Arguments very close to the mentioned Gamma minimum still take longer, but they return a result instead of an error message. That is: they do, except in some extremely close cases. Γ^{1}(0,88560 31944 10888 70027 88159 1) still works, but accuracy may be limited. Arguments even closer will not converge within reasonable time, so the code is limited to 86 iterations. Otherwise a "No root found" error is generated. In this case X and Y hold the two last approximations and the last adjustment is in Z, so that the user at least gets a result along with a rough impression of its accuracy. Dieter 

02082015, 05:04 PM
(This post was last modified: 02082015 05:04 PM by BarryMead.)
Post: #20




RE: Faster inverse gamma and factorial for the WP 34S  
« Next Oldest  Next Newest »

User(s) browsing this thread: 2 Guest(s)