Post Reply 
(32S, 32SII) Birthday paradox, hash collision probability
05-03-2020, 01:00 PM (This post was last modified: 05-09-2020 12:44 PM by Albert Chan.)
Post: #9
RE: (32S, 32SII) Birthday paradox, hash collision probability
(05-02-2020 11:15 PM)Albert Chan Wrote:  XCas> ratio(x,a) := (x*x + (a+1)*(x + (a+2)/6)) ^ (a/2)
XCas> ratio(69.,0.95)
→ 56.5843203845
XCas> ratio(70., -.05) * 70                            // 69.95!/69! = 69.95!/70! * 70
→ 56.5843203829

We can reverse the logic, and produce a compact, accurate ln_nr(n, s) Smile

P_nr(n, s) = perm(n, s) / n^s = ratio(n-s, s) / n^s = b^(s/2)

XCas> b := expand(ratio(n-s,s)^(2/s) / n^2)   → \(-\frac{s}{n}+\frac{s^{2}}{6\cdot n^{2}}+1+\frac{1}{n}-\frac{s}{2\cdot n^{2}}+\frac{1}{n^{2}\cdot 3}\)
XCas> factor(b-1)                                          → \(\large\frac{(-s+1) (6\cdot n-s+2)}{6\cdot n^{2}}\)

Code:
m = require 'mathx'
function ln_nr(n,s) return 0.5*s* m.log1p((s-1)*(s-2-6*n)/(6*n*n)) end
function P_collision(n,s) return -m.expm1(ln_nr(n,s)) end

lua> n, s = 365, 30
lua> x1 = ln_nr(n,s)
lua> x1, -m.expm1(x1)
-1.2252495190150705       0.706315588664643

We can improve above by halving the "interval"

lua> h = s/2
lua> x2 = ln_nr(n,h) + ln_nr(n-h,h) + h * m.log1p(-h/n)
lua> x2, -m.expm1(x2)
-1.2252516087363756       0.7063162023825731

I noticed halving the interval improved ln_nr(n,s) by about 16 times
We can extrapolate for better estimate

lua> x = x2 + (x2-x1)/15   -- (x - x1) ≈ 16*(x - x2)
lua> x, -m.expm1(x)          -- extrapolated ln_nr(n,s), collision probability
-1.2252517480511294       0.7063162432970561

Update: I found a general formula. However, convergence is not good ...

XCas> ln_nr(n,s,kmax) := ln(1+s/(n-s)) - sum(subst(sum((j/n)^k, j=1..jmax),jmax=s)/k, k=1..kmax)
XCas> map(range(1,11), k -> [k, 1 - exp(ln_nr(365.,30.,k))])

\(\left(\begin{array}{cc}
1 & 0.695232406385 \\
2 & 0.705857478651 \\
3 & 0.706293132999 \\
4 & 0.706314950579 \\
5 & 0.706316165395 \\
6 & 0.706316237864 \\
7 & 0.706316242403 \\
8 & 0.706316242698 \\
9 & 0.706316242718 \\
10 & 0.706316242719
\end{array}\right) \)
Find all posts by this user
Quote this message in a reply
Post Reply 


Messages In This Thread
RE: (32S, 32SII) Birthday paradox, hash collision probability - Albert Chan - 05-03-2020 01:00 PM



User(s) browsing this thread: 1 Guest(s)