Binomial Probability Distribution - Printable Version +- HP Forums (https://www.hpmuseum.org/forum) +-- Forum: HP Software Libraries (/forum-10.html) +--- Forum: HP-41C Software Library (/forum-11.html) +--- Thread: Binomial Probability Distribution (/thread-3963.html) Binomial Probability Distribution - Dave Britten - 05-24-2015 05:58 PM This program calculates the odds of a specified number of successes from independent trials with a known probability of success. This requires a combinations function named COMB that calculates nCr. If you need one, see here for a few options. Inputs: t: Number of trials z: Individual probability of success y: Minimum number of successes x: Maximum number of successes Probability will be returned to x. Example: What is the probability of obtaining at least two even numbers when rolling five standard 6-sided dice? 5 ENTER .5 ENTER 2 ENTER 5 XEQ BINPR Result: 0.8125 (81.25%) Edit: Rewritten with Dieter's suggested optimizations. It's a bit longer, but faster. I'm sure there's still room for more elegant stack/register use here. I'll wait until those shake out before I pencil the new listing into my Moleskine. Running it with inputs 120, 1/6, 40, 120 yields 6.419629920E-6 in just over a minute. Compare to the version on my 48 using a direct summation of terms and repeated COMB calls returning 6.4196298769E-6. That's an error of 6.71E-7%, which isn't a terrible sacrifice for having the program finish inside of a lunar cycle (or possibly solar orbit) now. Code: ```t: Number of trials z: Individual probability of success y: Minimum number of successes x: Maximum number of successes 01 LBL "BINPR" 02 X<>Y 03 - 04 STO 00 05 RDN 06 STO 01 07 X<>Y 08 STO 02 09 LASTX 10 STO 03 11 XEQ "COMB" 12 RCL 01 13 RCL 03 14 Y^X 15 * 16 1 17 RCL 01 18 - 19 RCL 02 20 RCL 03 21 - 22 Y^X 23 * 24 RCL 00 25 X>0? 26 GTO 00 27 RDN 28 RTN 29 LBL 00 30 RDN 31 STO 04 32 RCL 01 33 ENTER 34 ENTER 35 1 36 X<>Y 37 - 38 / 39 X<>Y 40 LBL 01 41 RCL Y 42 * 43 RCL 03 44 1 45 + 46 / 47 RCL 02 48 RCL 03 49 - 50 * 51 ST+ 04 52 1 53 ST+ 03 54 RDN 55 DSE 00 56 GTO 01 57 RCL 04 58 RTN 59 END``` RE: Binomial Probability Distribution - Dieter - 05-25-2015 09:51 PM (05-24-2015 05:58 PM)Dave Britten Wrote:  This program calculates the odds of a specified number of successes from independent trials with a known probability of success. This requires a combinations function named COMB that calculates nCr. If you need one, see here. Dave, it looks like your program evaluates COMB in every single loop so that it should not run particulary fast. But there is a simple fix: once nCr is known, the following nCr+1 can be evaluated very easily from this. Even the whole Binomial PDF for x+1 can be calculated from the PDF of the previous x, so that the slow power function in the loop can be avoided as well. Dieter RE: Binomial Probability Distribution - Dave Britten - 05-25-2015 10:11 PM (05-25-2015 09:51 PM)Dieter Wrote:   (05-24-2015 05:58 PM)Dave Britten Wrote:  This program calculates the odds of a specified number of successes from independent trials with a known probability of success. This requires a combinations function named COMB that calculates nCr. If you need one, see here. Dave, it looks like your program evaluates COMB in every single loop so that it should not run particulary fast. But there is a simple fix: once nCr is known, the following nCr+1 can be evaluated very easily from this. Even the whole Binomial PDF for x+1 can be calculated from the PDF of the previous x, so that the slow power function in the loop can be avoided as well. Dieter Thanks for the tip. I'll see if I can incorporate that. I've just always been doing the looped nCr approach. RE: Binomial Probability Distribution - Dieter - 05-25-2015 10:38 PM (05-25-2015 10:11 PM)Dave Britten Wrote:  Thanks for the tip. I'll see if I can incorporate that. I've just always been doing the looped nCr approach. I just tried it and it works well. Let n and p be the distribution parameters, and k1 resp. k2 the lower/upper bound of the random variable. Then this is the basic idea: Calculate Comb(n, k1). Use this to calculate the probability B(k1,n,p). This is the initial value of the sum you want to get. Repeat the following loop k2–k1 times: Calculate B(k+1) = B(k) * (n–k)/(k+1) * p/(1–p) Add this to the sum, increase k by 1. You can even prestore the constant p/(1–p) so that it has to be calculated only once. Dieter RE: Binomial Probability Distribution - Dave Britten - 05-25-2015 10:51 PM (05-25-2015 10:38 PM)Dieter Wrote:   (05-25-2015 10:11 PM)Dave Britten Wrote:  Thanks for the tip. I'll see if I can incorporate that. I've just always been doing the looped nCr approach. I just tried it and it works well. Let n and p be the distribution parameters, and k1 resp. k2 the lower/upper bound of the random variable. Then this is the basic idea: Calculate Comb(n, k1). Use this to calculate the probability B(k1,n,p). This is the initial value of the sum you want to get. Repeat the following loop k2–k1 times: Calculate B(k+1) = B(k) * (n–k)/(k+1) * p/(1–p) Add this to the sum, increase k by 1. You can even prestore the constant p/(1–p) so that it has to be calculated only once. Dieter That appears to mesh with what I came up with: COMB(n, r+1)=COMB(n, r)/(r+1)*(n-r). RE: Binomial Probability Distribution - Dave Britten - 05-26-2015 03:47 PM First post updated with a new listing. Thanks for the algebra optimization tips. RE: Binomial Probability Distribution - Dieter - 05-26-2015 10:38 PM (05-24-2015 05:58 PM)Dave Britten Wrote:  Running it with inputs 120, 1/6, 40, 120 yields 6.419629920E-6 in just over a minute. Compare to the version on my 48 using a direct summation of terms and repeated COMB calls returning 6.4196298769E-6. That's an error of 6.71E-7%, which isn't a terrible sacrifice for having the program finish inside of a lunar cycle (or possibly solar orbit) now. The HP41 works with 10 digits, the HP48 with 12. For the former 1/6 equals 0,166 666 666 7, for the latter it's 0,166 666 666 667. This means: The correct result for p=1/6 is 6,4196298765918... E–6 The correct 10-digit result for p=0,1666666667 is 6,419629908 E–6. So the HP41 is pretty close. The correct 12-digit result for p=0,166666666667 is 6,41962987691 E–6. So the HP48 is very close. Since COMB is an internal function and thus most probably exact in all 12 digits, this could be expected. Compared to the true result with p exactly 1/6, both calculators offer similar accuracy: 8 out of 10 digits resp. 10 out of 12. ;-) BTW my program returns the same result as yours. It has 70 steps including the COMB routine and the example runs in 53 seconds. Dieter RE: Binomial Probability Distribution - Dieter - 05-27-2015 12:51 PM (05-26-2015 03:47 PM)Dave Britten Wrote:  First post updated with a new listing. Thanks for the algebra optimization tips. Great. Here's my solution. It has 74 steps including its own COMB routine. If you want to use your own external function, simply replace steps 16...32 with your COMB command. The program then is down to 58 steps. The code below implements another feature that I consider very useful especially for large n resp. wide ranges between k1 and k2. Take for instance this example: n=100, p=0,1, k=10...90 The result is 0,5487 and requires the summation of 80 single PDFs. The previous version of my program required 43,5 seconds for this. However, there is not much sense in adding the PDFs for k=50, 60 or even 90. They are so small that they do not change the final result. At k=40 the PDF already has dropped to 2,47 E–15, so the addition loop can stop here, or even earlier. This is checked by the program version below: if an additional PDF does not change the CDF sum, the program exits. In the example above the last added term is the PDF for k=36, which was 1,16 E–11 so that it did not change the final result of 0,5487. Which of course is returned much faster now: instead of 43,5 s only 18,5 s are required – less than half the time. Finally, here is the code: Code: ```01 LBL"BINCDF" 02 xy 04 INT 05 STO 04 06 RDN 07 INT 08 STO 03 09 RDN 10 STO 02 11 RDN 12 INT 13 STO 01 14 RCL 01 15 RCL 03 16 - 17 RCL 03 18 xy     // leave min(n, n-k1) in Y 20 SIGN     // set x:=1 21 x>y?     // min(n, n-k1) = 0 ? 22 GTO 03   // then Comb=1, skip Comb loop 23 LBL 01 24 RCL Z    // calculate Comb(n, k1) iteratively 25 * 26 RCL Y 27 / 28 DSE Z 29 LBL 00   // dummy label = NOP 30 DSE Y 31 GTO 01 32 LBL 03 33 RCL 02 34 RCL 03 35 y^x 36 * 37 1 38 RCL 02 39 - 40 ST/ 02   // prestore p/(1-p) 41 RCL 01 42 RCL 03 43 - 44 STO 01 45 y^x 46 * 47 RCL 04 48 RCL 03 49 - 50 x=0?     // # PDF loops = 0 ? 51 GTO 04   // then quit 52 STO 04 53 x<>y 54 ENTER 55 LBL 02   // stack: PDF(k) CDF(k) z  t 56 RCL 01 57 RCL 02 58 * 59 *        // this way CDF(k) is pushed to T 60 SIGN 61 ST- 01 62 ST+ 03 63 x<> L 64 RCL 03 65 /        // stack: X=PDF(k), Y=CDF(k), Z=CDF(k-1), T=CDF(k-1) 66 +        // PDF in L, current and previous CDF in X and Y 67 x=y?     // result same as before? 68 GTO 04   // then quit 69 LastX    // else recall last PDF 70 DSE 04   // decrement counter 71 GTO 02   // and continue with next k 72 LBL 04 73 RDN 74 END``` Now try your example: Code: ```120 [ENTER] 6 [1/x] 40 [ENTER] 120 XEQ"BINCDF"   =>   6,419629920 E–6``` The result now is returned in 25 seconds. That's more than twice as fast as the standard approach since only PDFs between k=41 and k=59 had to be summed up. That's merely 19 loops instead of 80(!): Code: ```LastX  => 1,7254 E–16  (last summed = first negligible PDF(k)) RCL 03 => 59  (last k) RCL 04 => 62  (number of saved loops + 1)``` Dieter RE: Binomial Probability Distribution - Dave Britten - 05-28-2015 01:44 AM Good idea with the early exit. I'll have to play around with that a bit. RE: Binomial Probability Distribution - Dieter - 05-28-2015 11:11 AM (05-28-2015 01:44 AM)Dave Britten Wrote:  Good idea with the early exit. I'll have to play around with that a bit. Here is an update with the Comb loop counting up the denominator: Code: ```.. ... 21 x>y?     // min(n, n-k1) = 0 ? 22 GTO 03   // then Comb=1, skip Comb loop 23 RCL 01 24 LBL 01 25 *        // calculate Comb(n, k1) iteratively 26 DSE L 27 RCL 01 28 LastX 29 - 30 ST/ Y 31 X<> L 32 DSE Z 33 GTO 01 34 RDN 35 LBL 03 .. ...``` That's three more steps, on the other hand the RCL 01 command in line 14 now can be omitted. So all in all it's two steps more. But accuracy improves – try your example once again: Code: ```120 [ENTER] 6 [1/x] 40 [ENTER] 120 XEQ"BINCDF"   =>   6,419629908 E–6``` That's the exact result for a ten-digit value of 1/6. :-) Dieter