Post Reply 
(11C) Probability of No Repetitions
01-10-2020, 12:42 PM (This post was last modified: 01-11-2020 01:02 AM by Gamo.)
Post: #1
(11C) Probability of No Repetitions
This program was adapted from HP-55 Statistic Book (Page 12)

Reference:
E. Parzen, Modern Probability Theory and its Applications,
John Wiley and Sons, 1960 (CH. 2 Page 46)

As stated in the book:

Probability of No Repetitions in a Sample

Suppose a sample of size n is drawn with replacement from population
containing m different objects. Let P be the probability that there are no
repetitions in the sample, then

P = [1- (1/m)][1- (2/m)]....[1- (n-1/m)]

Given integer m, n such that m ≥ n ≥ 1 this program finds the probability P.

Remark:
The execution time of the program depends on n; the larger n is, the longer it takes.
------------------------------------------------------------------------------
Example: HP-55 Statistic Book page 13

In a room containing n persons, what is the probability that no two or more
persons have the same birthday for n = 4, 23, 48?

Note: m = 365 // 365 is the days amount of birthday

[USER] [FIX] 2

365 [A] display 365.00 // Enter m

4 [B] display brieftly 0.98 then 2.00 // Enter n

23 [B] display briefly 0.49 the 51.00

48 [B] display briefly 0.04 then 96.00

Answer:
In a room for the probability that at lease two of them will
have the same birthday

4 people in a room P = 0.98 or only 2%
23 people in a room P = 0.49 or 51%
48 people in a room P = 0.04 or 96%
--------------------------------------------------------------
Program:
Code:

LBL A  // m
STO 1
RTN
LBL B  // n
STO 2
1
STO 0
LBL 0
RCL 1
RCL 2
1
-
STO 2
0
X=Y
GTO 1
Rv
X<>Y
÷
1
X<>Y
-
STOx0
GTO 0
LBL 1
RCL 0
PSE
1
RCL 0
RND
-
EEX
2
x
RTN

Gamo 1/2020
Find all posts by this user
Quote this message in a reply
01-11-2020, 01:46 AM (This post was last modified: 01-11-2020 09:02 PM by Albert Chan.)
Post: #2
RE: (11C) Probability of No Repetitions
(01-10-2020 12:42 PM)Gamo Wrote:  Probability of No Repetitions in a Sample

Suppose a sample of size n is drawn with replacement from population
containing m different objects. Let P be the probability that there are no
repetitions in the sample, then

P = [1- (1/m)][1- (2/m)]....[1- (n-1)/m]

This is similar to thread (42S) Probability of Same Birthday Day

We can do approximation the same way: products of terms = exp of sum of ln(term)
Sum is approximated by integral with correction (bolded expression below)

If we let u = 1-x/m, we have x = m-m*u

XCas> f := ln(1-x/m)
XCas> expand(subst(int(f) - f/2 + diff(f)/12, x = m-m*u))   → m*u - 1/(12*m*u) - ln(u)/2 - m*u*ln(u)
XCas> g(m,u) := -1/(12*m*u) - ln(u)*(0.5 + m*u) + m*u       // ≈ sum formula
XCas> p(m,n) := exp(g(m, 1.-n/m) - g(m, 1.-1/m))                 // ≈ product(1-x/m, x=1 .. n-1)

XCas> p(365, 4)       → 0.9836440875
XCas> p(365,23)      → 0.4927027657
XCas> p(365,48)      → 0.03940202712
Find all posts by this user
Quote this message in a reply
01-11-2020, 02:15 PM (This post was last modified: 04-28-2020 11:50 PM by Albert Chan.)
Post: #3
RE: (11C) Probability of No Repetitions
(01-10-2020 12:42 PM)Gamo Wrote:  P = [1- (1/m)][1- (2/m)]....[1- (n-1)/m]

Perhaps it is more useful by adding a (1-0/m) term, we have P = perm(m,n) / mn

Using the same idea to approximate ln(perm(m,n)):

XCas> f := ln(m-x)
XCas> expand(subst(int(f) - f/2 + diff(f)/12, x = m-u))     → -1/(12*u) - ln(u)/2 - u*ln(u) + u
XCaS> g(u) := -1/(12*u) - ln(u)/2 - u*ln(u) + u
XCas> ln_perm(m,n) := g(m-n) - g(m)                                 // integral limit x = 0 to n, u=m-x

ln_perm() can be simplified and more accurate by putting tiny terms up front, and use logp1()

\(\large \ln(perm(m,n)) ≈ {-n\over 12m(m-n)} - [n + (m-n+½)\ln(1 - n/m)]+ n \ln(m)
\)

For ln(P), above last term cancelled out.

XCas> ln_P(m,n) := -n/(12*m*(m-n)) - (n + (m-n+0.5)*ln(1-n/m))
XCas> ln_perm(m,n) := ln_P(m,n) + n*ln(m)

XCas> e^ln_P(365, 4)       → 0.9836440875
XCas> e^ln_P(365,23)      → 0.4927027657
XCas> e^ln_P(365,48)      → 0.03940202712

Trivia: ln_perm(m,n) = ln_perm(m,a) + ln_perm(m-a, n-a), same as real permutation function.

if m ≈ n, ln_perm(m,n) is not very accurate.
With above trivia, splitting ln_perm(m,n) does not help.
However, we can replace the 2nd term with real ln(perm(m-a,n-a))

Example: ln(perm(69,68)) = ln(69!) ≈ 226.1905483
XCas> ln_perm(69,68)                   → 226.1882765 // slightly worse than log of Stirling's formula.
XCas> ln_perm(68,67) + ln(2)        → 226.1902224 // perm(69-67, 68-67) = perm(2,1) = 2
XCas> ln_perm(68,66) + ln(6)        → 226.1904485 // perm(69-66, 68-66) = perm(3,2) = 6

Edit: above ln_P formula is numerically very bad, use this instead
Find all posts by this user
Quote this message in a reply
Post Reply 




User(s) browsing this thread: 1 Guest(s)