Hi floppy,
more than 30 years ago (without the knowledge we have today, in 2024) I programmed the gamma function, for real arguments only, on a computer with a 386 processor and a 387 coprocessor.
The only interesting point is that I used different expressions for different argument ranges:
1) for positive x
\[x \ge 10: \quad \Gamma(x) \approx \sqrt{\frac{2\pi}{x}}\cdot \exp\left(x\ln(x) + (S_8 - x)\right)\]
This is Stirling's formula, but with the infinite sum truncated to 8 terms (which is possible because x ≥ 10):
\[S_8 = \sum_{k=1}^{8} \frac{B_{2k}}{2k(2k-1)x^{2k-1}}\]
The B(2k) are the Bernoulli numbers, and the sum is calculated with the Horner scheme:
\[S_8 = (((((((c_8x^{-2} + c_7)x^{-2} + c_6)x^{-2} + c_5)x^{-2} + c_4)x^{-2} + c_3)x^{-2} + c_2)x^{-2} + c_1)x^{-1} \]
The c(k) are the precalculated expressions:
\[ c_k = \frac{B_{2k}}{2k(2k-1)} \]
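For anyone who wants to try this today, here is a minimal Python sketch of the formulas above. The names (`BERNOULLI_EVEN`, `C`, `stirling_sum`, `gamma_stirling`) are chosen only for illustration and are not taken from the original program; the Bernoulli numbers are the standard values for B(2) to B(16).

```python
import math
from fractions import Fraction

# Standard Bernoulli numbers B_2, B_4, ..., B_16.
BERNOULLI_EVEN = [
    Fraction(1, 6), Fraction(-1, 30), Fraction(1, 42), Fraction(-1, 30),
    Fraction(5, 66), Fraction(-691, 2730), Fraction(7, 6), Fraction(-3617, 510),
]

# c_k = B_2k / (2k (2k - 1)) for k = 1..8, computed exactly, stored as floats.
C = [float(b / (2 * k * (2 * k - 1)))
     for k, b in enumerate(BERNOULLI_EVEN, start=1)]


def stirling_sum(x):
    """S_8, evaluated with the Horner scheme in powers of x**-2."""
    u = 1.0 / (x * x)
    s = 0.0
    for c in reversed(C):  # start with c_8, end with c_1
        s = s * u + c
    return s / x           # the final factor x**-1


def gamma_stirling(x):
    """Gamma(x) for x >= 10 from the truncated Stirling series."""
    if x < 10:
        raise ValueError("this sketch assumes x >= 10")
    return math.sqrt(2 * math.pi / x) * math.exp(x * math.log(x) + (stirling_sum(x) - x))
```

A quick check is `gamma_stirling(10.0)`, which should be close to 9! = 362880.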
For smaller positive arguments, let us say x = 7.32, my old program calculated the value for 10.32 and then worked back down in a loop:
\[ \Gamma(9.32) = \frac{\Gamma(10.32)}{9.32 } \qquad \Gamma(8.32) = \frac{\Gamma(9.32)}{8.32 } \qquad \Gamma(7.32) = \frac{\Gamma(8.32)}{7.32 }\]
My test of whether this works with good accuracy was:
\[ \Gamma(0.5) = \sqrt{\pi} \]
Unfortunately I didn't document how many digits were correct.
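The shift-and-divide loop can be sketched in Python as follows. This is only an illustration under my own naming (`gamma_large`, `gamma_positive`), with the eight coefficients c(k) written as in-line constants; it is not the original coprocessor code.

```python
import math

# c_k = B_2k / (2k(2k-1)), k = 1..8, as in-line constants.
C = (1/12, -1/360, 1/1260, -1/1680, 1/1188, -691/360360, 1/156, -3617/122400)


def gamma_large(x):
    """Truncated Stirling series (8 terms), intended for x >= 10."""
    u = 1.0 / (x * x)
    s = 0.0
    for c in reversed(C):  # Horner scheme, c_8 first
        s = s * u + c
    return math.sqrt(2 * math.pi / x) * math.exp(x * math.log(x) + (s / x - x))


def gamma_positive(x):
    """Gamma(x) for x > 0: shift the argument up to >= 10, then divide back down."""
    if x <= 0:
        raise ValueError("this sketch handles x > 0 only")
    shift = 0
    while x + shift < 10:
        shift += 1
    g = gamma_large(x + shift)
    for j in range(shift - 1, -1, -1):
        g /= x + j  # Gamma(x + j) = Gamma(x + j + 1) / (x + j)
    return g
```

Comparing `gamma_positive(0.5)` with `math.sqrt(math.pi)` repeats the test mentioned above and shows how many digits agree in double precision.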
2) Negative numbers x < 0 (excluding the negative integers) were calculated with the expression:
\[ \Gamma(x) = \frac{\pi}{\sin(\pi\cdot x)\Gamma(1-x)} \]
Let us say we have x = -3.6; then we calculate:
\[ \Gamma(-3.6) = \frac{\pi}{\sin(\pi\cdot (-3.6))\Gamma(4.6)} \]
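A small Python sketch of this reflection step is shown here. The function name `gamma_reflect` and its `gamma_pos` parameter are hypothetical; Python's built-in `math.gamma` is used only as a stand-in for whatever routine handles positive arguments.

```python
import math


def gamma_reflect(x, gamma_pos=math.gamma):
    """Gamma(x) for non-integer x < 0 via the reflection formula.

    gamma_pos is any routine for positive arguments; math.gamma is
    only a stand-in here.
    """
    if x >= 0:
        raise ValueError("this sketch is for x < 0")
    if x == math.floor(x):
        raise ValueError("Gamma has poles at the negative integers")
    return math.pi / (math.sin(math.pi * x) * gamma_pos(1.0 - x))
```

For x = -3.6 this evaluates the positive-argument routine at 4.6, exactly as in the formula above.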
All of this was only a self-educational exercise in programming the coprocessor. Maybe someone else who reads this remembers or knows a more up-to-date procedure.
Very interesting. Thanks.
And: using multiprocessing for math calculations (I have already done this with Python, for separate calculations like Result = Series A / Series B where the two series are independent) is something I had in mind for the HP71, spreading tasks to other HP71s in a loop (just for fun; passing the tasks via HP-IL is probably too slow, but it could simulate parallel processing).
More information on Stirling's approximation can be found here and here. More details here under "Implementation". Assuming that a reasonable number of terms are sufficient for 12-digit accuracy, it would be faster and easier to use in-line constants for the coefficients in your code.
The HP 49 and 50 have the Gamma function for real and complex arguments. I haven't tried to examine the internal code myself but I would guess that it is rather involved.