| hi Richie,
I don't see any way to really simplify this problem, but I can rewrite
a general equation which may suggest something to somebody.
This is a special case of the multinomial distribution, with all
pi the same. The stat books I have at hand give little more than
the equation below, but maybe somebody has more information
The probability of a given distribution of coins is
P(n1,n2,...nK) = N! * K**(-N) / ( n1! n2! ... nK! )
where ni is the number of coins acquired by knight i. The probability
is zero unless all ni are non-negative and
n1 + n2 + ... nK = K
The expected value of M, the number of coins acquired by the knight(s)
who acquired the most coins, is the sum over all M distributions
of M for a given distribution times the probability of that
distribution.
E(M) = N! * K**(-N) SUM( max(n1 n2 ... nk) / ( n1! n2! ... nK! ))
The sum runs over all non-negative values of ni such that
n1 + n2 + ... nK = K
I don't see any way to simplify this for the general case.
If N and K are both reasonably small, you could just write a program
to evaluate the sum.
> I know that the law of large numbers states that if N is large enough the
> answer is "1+epsilon"; its when N is not very large that its interesting)
But if N is large relative to K, then the factorials can be
approximated by Stirling's formula, and the sum should simplify
and you will get something better than 1+epsilon.
And if K is large, then the ni will be approximately independent,
and each ni can be approximated by a normal distribution around
N/K.
|
| Hi, Wally!
Yup, I got my result for K=2 by expanding using the binomial distribution
and simplifying, then got the asymptotic expression by using Stirling's
formula. I then tried it for K=3 and got tied up in knots.
Generating the answer directly by summing terms works for very small N and K,
but the number of terms in the multinomial expansion explodes wildly for
larger N and K (I think its something like (N+K-1)!/(N!(K-1)!)) and direct
computation becomes impossible. Consider the following variation of the
problem:
"What is the expected value of the maximum number of employees in a DEC-sized
company who have the same birthday?"
In this case N=120000 (give or take) and K=365. Lotsa computing!
I did, by generating a lot of random cases and playing curve-fitting games,
come up with an elegant-looking simplified formula that seems to give
a very good asymptotic approximation to the ratio; it is:
( K ) K-1
1 + SQRT(------) * SUM (1/i) [the sum is ~LN(K-1) for large K]
( PI*N ) i=1
But I don't have a clue as to how to prove it...
(By the way, N=120000 and K=365 gives a ratio of ~1.2, so the answer to the
above problem is about 1.2*120000/365 = 394 employees)
|
| Well, after a lot of work on this I managed to come up with an algorithm
that reduces the computation of the expected value of the ratio from O(N^K)
to O(K*N^2), and used that algorithm to calibrate the asymptotic approximation
in .2; the approximation still looks real good, and converges swiftly, but
I'll never be able to prove it.
For the record, the O(K*N^2) algorithm follows the form feed. Its ugly,
but accurate...
if F(N,K) is the desired ratio, then:
N*K N! K! K-2 g(N, K-i, 0)
F(N,K) = ------- + ------- * SUM -------------
N N i=0 i!
K K
Where:
[n/k] h(n-i, k-1, i)
g(n,k,r) = SUM ------------------
i=r+1 k
(i!/r!)
And:
/ \
/ r/(k+1)! if n = k*r \ k-2 g(n-ir, k-i, r)
h(n,k,r) = < > + SUM ------------------
\ r! /k!(n-kr+r-1)! if n > k*r / i=0 (i+1)!
\ /
Since r is always <= n/k when g or h is invoked, using the recurrences to
compute all the g(n,k,r) and h(n,k,r) for n<=N and k<=K performs O(K*N^2)
computations, where a computation is a division (by a factorial or an integer
raised to an integer power) and an add.
|
| You can get good Monte-Carlo approximations with a simple program that
calls a random-number generator N times to produce N random integers in
[1,K], get the ratio, then repeat a gazillion times and average them. The
results agree with the output of the recurrence to within the expected
Monte-Carlo error.
That recurrence really looks a mess, but its quite simple (to compute from)
compared to some of the intermediate recurrences I got. The derivation
starts from Wally's multinomial formulation in .1, groups the (equivalued)
terms whose denominators are permutations of each other so that each
denominator is one of the K-way partitions of N, and then uses a recurrence
for enumerating those partitions.
|