T.R | Title | User | Personal Name | Date | Lines |
---|
1846.1 | | NOVA::FINNERTY | Sell high, buy low | Tue Mar 08 1994 13:40 | 22 |
|
Under those assumptions (which don't seem very realistic), and
assuming further that the events are independent and occur at uniform
time intervals, the expected time required to break a previous record
is the inverse of the probability of the outcome.
So, for an N(0,1) variable, P(x >= value) = 1 - (1/sqrt(2*pi)) *
int(exp(-t^2/2), t=-infinity..value), so the expected time to reach
'value' is 1/P(x >= value).
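(A minimal numerical sketch of that claim -- my own addition, assuming
SciPy's normal survival function norm.sf is available -- the expected
time is just the reciprocal of the tail probability:)

    from scipy.stats import norm

    # Expected number of attempts (one attempt per unit time) before an
    # N(0,1) outcome at least 'value' sigmas above the mean is seen.
    def expected_time(value):
        tail = norm.sf(value)        # P(X >= value) = 1 - Phi(value)
        return 1.0 / tail

    for v in (1.0, 2.0, 3.0):
        print(v, expected_time(v))   # roughly 6.3, 44, and 741 attempts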
Since the events are presumed to be independent, the 'clock'
resets after each record is set (since you are interested in
the time between successive records), and the time required
is just a function of the number of sigmas away from the
mean that the record represents.
The easiest generalization to make is that the expected time
between successive records is a strictly increasing function.
/jim
|
1846.2 | A specific result. | CADSYS::COOPER | Topher Cooper | Tue Mar 08 1994 15:09 | 54 |
| Let {X1, X2, ... , Xn} be n independent, identically distributed random
variables. Furthermore, assume that the distribution is continuous
(this means in this context, essentially, that we can neglect the
possibility of an exact tie), with
f1 being the probability density function, and
F1 being the cumulative probability function
(i.e., F1(x) = int(f1(y),y=-infinity..x)).
Now let X(n) be a random variable which is the maximum of those n
X's. It is easy to show that:
Fn(x) = the cumulative probability function for X(n) =
F1(x)^n;
fn(x) = the probability density function for X(n) =
n*f1(x)*F1(x)^(n-1).
I'm going to answer the question, "what is the probability that Xn will
exceed the maximum of the previous n-1 RVs?"
This probability is given by the "sum" over all possible values that
Xn can take of the "probability" (from the probability density
function) that Xn will have that value times the probability that
the maximum of the previous n-1 values will be less than that value,
i.e.,
int(f1(x)*F[n-1](x), x=-infinity..infinity)
which is:
int(f1(x)*F1(x)^(n-1), x=-infinity..infinity)
but the expression being integrated is:
f1(x)*F1(x)^(n-1) = fn(x)/n
so the integral is:
int(fn(x)/n, x=-infinity..infinity) =
(1/n)*int(fn(x), x=-infinity..infinity)
Since fn is a probability density function, its integral over the reals
is 1, so the probability that the n'th try will beat the previous
record is simply:
1/n
independent of the initial distribution (subject to the no-ties and
other assumptions).
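(A quick Monte Carlo sketch of that result -- my own check, not part of
the derivation; any continuous distribution should do:)

    import random

    def record_probability(n, trials=200_000):
        # Estimate P(the n'th draw exceeds the maximum of the previous n-1).
        hits = 0
        for _ in range(trials):
            draws = [random.expovariate(1.0) for _ in range(n)]
            if draws[-1] == max(draws):
                hits += 1
        return hits / trials

    for n in (2, 5, 10):
        print(n, record_probability(n), 1.0 / n)   # estimates hover near 1/n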
I think that's a pretty neat result.
Topher
|
1846.3 | interesting result | NOVA::FINNERTY | Sell high, buy low | Tue Mar 08 1994 16:35 | 12 |
|
>> It is easy to show that:
>>
>> Fn(x) = the cumulative probability function for X(n) =
>> F1(x)^n;
>> fn(x) = the probability density function for X(n) =
>> n*f1(x)*F1(x)^(n-1).
please do!
/jim
|
1846.4 | Proof | CADSYS::COOPER | Topher Cooper | Wed Mar 09 1994 09:30 | 28 |
| RE: .3 (jim)
Using the same notation as in .2.
We want to find:
Fn(x) = prob(X(n) <= x)
= prob((X1 <= x) & (X2 <= x) & (X3 <= x) & ... & (Xn <= x))
Since the Xi's are independent this equals:
= prob(X1 <= x) * prob(X2 <= x) * ... * prob(Xn <= x)
Since the Xi's are identically distributed with cumulative probability
function F1 this equals:
= F1(x) * F1(x) * ... * F1(x)
= F1(x)^n
The probability density function fn(x) can now be found by:
fn(x) = dFn(x)/dx = d[F1(x)^n]/dx = n*F1(x)^(n-1) * dF1(x)/dx
= n*f1(x)*F1(x)^(n-1)
QED
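(An illustrative check -- mine, not part of the proof: for Uniform(0,1)
variables F1(x) = x, so Fn(x) should equal x^n:)

    import random

    def empirical_max_cdf(n, x, trials=100_000):
        # Estimate P(max of n independent Uniform(0,1) draws <= x).
        hits = sum(max(random.random() for _ in range(n)) <= x
                   for _ in range(trials))
        return hits / trials

    n, x = 5, 0.8
    print(empirical_max_cdf(n, x), x**n)   # both close to 0.328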
Topher
|
1846.5 | Paradox? | VMSDEV::HALLYB | Fish have no concept of fire | Wed Mar 09 1994 11:56 | 62 |
| The next question would likely be along the lines of "How long will it
take before the world record is broken for the Nth time?" Measuring
"how long" by number of attempts.
For N=1 we use the "Cooper formula" to note that the record is broken
for the 1st time with probability 1, i.e., the first attempt will
obviously set the world record.
Now how about N=2? I.e., how many tries will it take, on average, before the
original record is broken for the first time? You might want to stop
and think about that for a second or two. Half the time the second
attempt will succeed, etc.
[Pause here and guess the average number of tries before record is broken]
This looks to be a standard series of the form
       oo
      ----
      \    i * Pr(BREAK(i))
      /
      ----
      i=2
This is just your expected value formula. To calculate the probability
the record is broken on the i-th try, let's look at the first few terms:
For i=2 the probability the i=1 record is broken is 1/2. Write this as:
(1/1) * (1/2) = 1/2.
For i=3 the probability the i=1 record is broken is:
The probability the record was not broken at i=2 TIMES 1/3rd.
I.e., (1/2) * (1/3) = 1/6.
For i=4 the probability the i=1 record is broken is:
The probability the record was not broken at (i=2 or i=3) TIMES 1/4th.
I.e., (1-1/2-1/6) * (1/4) = (1/3) * (1/4)
For i=5 we get (1-1/2-1/6-1/12) * (1/5) = (1/4) * (1/5). And so on.
I think induction can be used to show that at step K the probability
the record is NOT broken at any prior step is 1/(K-1), so the
probability of breaking the record at step K is 1/(K-1) * (1/K).
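(A small sketch of that claim -- my own check, just turning the
recursion above into code:)

    def prob_first_break(K):
        # P(the step-1 record is first broken at step K): survive steps
        # 2..K-1 with probability (1 - 1/i) each, then break with prob 1/K.
        survive = 1.0
        for i in range(2, K):
            survive *= 1.0 - 1.0 / i     # equals 1/(K-1) after the loop
        return survive / K

    for K in (2, 3, 4, 5, 10):
        print(K, prob_first_break(K), 1.0 / ((K - 1) * K))   # the two agree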
Well, then, the expected number of attempts is calculated as:
       oo
      ----
      \        i
       >   --------
      /    (i-1)(i)
      ----
      i=2
... which converges to, well....ahhh.....well.... it doesn't converge.
It's the Harmonic series.
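(As a numeric aside -- my own sketch -- the partial sums of that series
grow like ln(N) and never settle down:)

    import math

    def partial_sum(N):
        # sum_{i=2}^{N} i/((i-1)*i) == sum_{i=2}^{N} 1/(i-1): harmonic.
        return sum(i / ((i - 1) * i) for i in range(2, N + 1))

    for N in (10, 100, 1000, 10_000):
        print(N, partial_sum(N), math.log(N))   # grows roughly like ln(N)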
Which is to say that most world records are never broken.
Can anybody spot any flaws in the above?
John
|
1846.6 | Resolution. | CADSYS::COOPER | Topher Cooper | Wed Mar 09 1994 14:55 | 34 |
| It does not follow from the fact that there is no mean length of time
until the initial record is broken that "most world records are never
broken." For example, if there were 1 chance in 10^1000 that the
initial record were never broken then there would be no mean, even if
excluding those cases where it runs on forever leaves a small mean.
In this case, though, the lack of a mean does not even require a
tiny non-zero chance of the record being sustained forever.
Note, for example, that 2/3rds of initial records are broken on or
before the third event.
Here is another example of this phenomenon which may make things
clearer. Imagine some kind of server which accepts "jobs" and
processes them at a steady rate of 1 each second, and queues them
(first come first serve) if it is not ready to process them. How long
will an incoming job have to wait to be processed if jobs arrive at a
steady rate of 2 per second?
Well, the i'th arriving job will arrive at time i/2. During that time,
i/2 (truncated, but ignore that for simplicity) jobs will have been
processed, leaving i/2 jobs unprocessed, so the i'th job will have to
wait i/2 seconds.
What is the mean waiting time?
limit  sum(i/(2n), i=1..n)
n->oo
which does not converge. The mean waiting time is infinite.
Yet each job will only wait a finite amount of time. Paradoxical only
in the sense that it is an *apparent* contradiction.
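(A tiny sketch of that queue example -- my own illustration, using the
arrival and service rates from the text:)

    def waiting_time(i):
        # The i'th job arrives at time i/2; about i/2 jobs have been served
        # by then (1 per second), so about i/2 jobs are still queued ahead.
        return i / 2.0

    n = 1000
    print(sum(waiting_time(i) for i in range(1, n + 1)) / n)  # ~ n/4, unbounded in n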
Topher
|
1846.7 | Alternate derivation. | CADSYS::COOPER | Topher Cooper | Wed Mar 09 1994 15:19 | 16 |
| Sometimes we do things the hard way first. Here is a derivation of the
result in .2 which is simpler and more intuitive.
Imagine that there have been "n" events. We can assign to each event a
rank -- 1 for the lowest result, 2 for the next, etc. The independence
and identical-distribution assumptions mean that every permutation of
the ranks 1..n is equally likely. The n'th event will be a record if
and only if it has rank n. Since each of the n ranks is equally likely
for this position, there is one chance in n that it will have rank n. QED
Also note that "no record breaker after the first of those n events"
occurs iff the first element has rank n (i.e., is the largest of
the n values), which also has probability 1/n. This is the result
that John got.
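(A brute-force sketch of both claims -- my own check, enumerating rank
permutations for small n:)

    from itertools import permutations
    from fractions import Fraction

    def check(n):
        perms = list(permutations(range(1, n + 1)))
        last_is_top  = sum(p[-1] == n for p in perms)  # n'th event has rank n
        first_is_top = sum(p[0] == n for p in perms)   # no record after event 1
        total = len(perms)
        return Fraction(last_is_top, total), Fraction(first_is_top, total)

    for n in (3, 4, 5):
        print(n, check(n))   # both fractions come out to 1/n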
Topher
|
1846.8 | independent events | NOVA::FINNERTY | Sell high, buy low | Thu Mar 10 1994 11:14 | 22 |
|
>> For i=2 the probability the i=1 record is broken is 1/2. Write this
>> as:
>> (1/1) * (1/2) = 1/2.
>>
>> For i=3 the probability the i=1 record is broken is:
>> The probability the record was not broken at i=2 TIMES 1/3rd.
>> I.e., (1/2) * (1/3) = 1/6.
I interpreted Topher's result differently. Since the events are
independent,
For i=3 the probability the i=1 record is broken is:
The probability the record was not broken at i=2 times 1/2
I.e., (1/2) * (1/2) = 1/4.
So that after the record has been broken n times, we expect to break
it after n+1 more tries. In other words, if the 1st record is not
broken in the 2nd event, this shouldn't reduce the probability
of breaking the 1st record in the 3rd event.
/jim
|
1846.9 | Not independent. | CADSYS::COOPER | Topher Cooper | Thu Mar 10 1994 12:20 | 39 |
| RE: .8 (jim)
There is a subtlety here which took me some time to resolve in my own
head. The events are not truly independent.
If you know precisely what the previous record is, then intervening
failures do not, as you say, change the probability of a success. But,
as stated, you do not know what the previous record is, and knowing
that there have been x failures shifts the probabilities that you
would assign to various possible values of that previous record. A
lot of failures makes it distinctly more likely that the previous
record was a high value. This results in the 1/n probability.
It is easy to show this for the example in question, by using the
rank permutations view of the situation (in fact, it was in looking at
this issue that I came up with the rank permutation based proof).
Looking at the rank among the first three events:
Perm.   R1  R2  R3
------------------
 123    T   T   T
 132    T   T   -
 213    T   -   T
 231    T   T   -
 312    T   -   -
 321    T   -   -
The Rx column indicates whether a new record was set on event x.
As expected, all 6 of the permutations produce a record on event 1,
three of the six (1/2) produce a record on event 2, and two of the
six (1/3) produce a record on event 3. Three of the permutations
(213, 312 and 321) failed to produce a record on event two, and of
those three, one (213) does produce a record on the third event;
therefore the probability of producing a record on the third event,
given that no record was set on the second, is 1/3.
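(A sketch extending that enumeration to larger n -- my own check: even
conditioned on no record at event n-1, the chance of a record at event
n stays 1/n:)

    from itertools import permutations
    from fractions import Fraction

    def conditional_record_prob(n):
        # P(record at event n | no record at event n-1), over rank orders.
        no_prev = both = 0
        for p in permutations(range(1, n + 1)):
            prev_record = p[n - 2] == max(p[:n - 1])  # event n-1 set a record
            if not prev_record:
                no_prev += 1
                if p[n - 1] == n:                     # event n has the top rank
                    both += 1
        return Fraction(both, no_prev)

    for n in (3, 4, 5, 6):
        print(n, conditional_record_prob(n))   # 1/3, 1/4, 1/5, 1/6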
Topher
|
1846.10 | Median. | CADSYS::COOPER | Topher Cooper | Thu Mar 10 1994 12:34 | 25 |
| Since the mean cannot be applied, it is reasonable to look to the
median, which is a more robust measure of "central tendency".
I'll look at a slightly different question. You are about to
attend a series of events, starting with event "n". You want to
see the old record broken (though, as discussed, you don't know what
that record is). How many events must you attend to have a 50%
chance of seeing the record broken?
A bit of playing with Maple gave me the answer:
floor[M(n)] + ceiling[M(n)]
-----------------------------
2
where M(n) = n^2/(n-2).
In general, the P'th quantile is, ignoring that it must be an integer
value:
(P*n^2)/(Q*n - 1)
where Q = 1 - P.
Topher
|
1846.11 | Independent | CADSYS::COOPER | Topher Cooper | Fri Mar 11 1994 10:10 | 31 |
| RE: .9 (me)
It seems, on the basis of some EMail, that some sloppy language on my
part has sown some confusion.
In the formal statistical sense of the word, the events *are*
independent, since the probability of event n being a record breaker
is unchanged given information about whether or not previous events
were record breakers.
Our intuition about "independence" is based on situations like coin
flips where the events in question are (to use a different, if somewhat
clumsy word) "non-influencing", and common -- as opposed to technical
-- usage of "independence" has implications of "non-influencing".
In this example there are some subtle forms of "influence" going on --
at least if viewed from some natural perspectives -- but these
influences cancel out in such a way to produce *statistical*
independence. Jim was arguing that since the trials are
non-influencing -- memoryless in a restricted sense -- that a failure
does not *modify* the probability of a record, which compensates
for the decline implied by the 1/n rule. That means that the lack of
influence would create a *statistical* dependence. I was arguing that
there *is* an influence (which I called a dependence) and that
therefore there was *no* statistical dependence -- i.e., the
probability was still 1/n.
That may have confused you even more -- but at least you know now that
you are confused ;-)
Topher
|
1846.12 | Some data to speculate upon | VMSDEV::HALLYB | Fish have no concept of fire | Fri Mar 11 1994 15:09 | 40 |
| Here are some empirical results.
Using the "Cooper Principle" I counted the number of attempts before
a record would be broken for the 8th time. I ran 10,000 independent
experiments.
Below is a histogram of the iteration number at which the "record" was
broken for the 8th time. Note the bins are not linear, but logarithmic.
Looks like the raw data (iteration number for Ri=8) is lognormal (?).
Good evidence that experimental data are well-behaved even though the
theoretical mean is undefined. I.e., it just "looks" like a paradox.
John
Average for 10000 iterations: 110457.92 tries
Minimum was 16, maximum was 99270410
LOG(-)   Count   Histo
   0         0
   1         0
   2         8
   3       327   ****
   4       817   **********
   5       638   *******
   6       940   ***********
   7      1311   ****************
   8      1327   ****************
   9      1342   ****************
  10      1030   ************
  11       836   **********
  12       566   *******
  13       357   ****
  14       203   **
  15       130   *
  16        81   *
  17        34
  18        13
  19         0
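(For anyone wanting to repeat the experiment, a rough sketch -- my own,
using a Uniform(0,1) source; per .2, any continuous distribution gives
the same record statistics. Occasional runs are very slow, since the
waiting time has no finite mean:)

    import math, random

    def attempts_until_kth_record(k):
        # Attempt index at which the k'th record occurs (the first attempt
        # counts as record number 1, as in .5).
        best, records, n = -math.inf, 0, 0
        while records < k:
            n += 1
            x = random.random()
            if x > best:
                best, records = x, records + 1
        return n

    samples = [attempts_until_kth_record(8) for _ in range(200)]
    logs = [math.log(s) for s in samples]
    print(min(samples), max(samples), sum(samples) / len(samples))
    print(sum(logs) / len(logs))   # the log values cluster, as in the histogram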
|
1846.13 | lognormal! | CADSYS::COOPER | Topher Cooper | Mon Mar 14 1994 13:00 | 77 |
| RE: .12
> Looks like the raw data (iteration number for Ri=8) is lognormal (?).
Great intuition!
My first reaction was "It takes a lot more than a roughly symmetric,
humpbacked distribution to imply lognormality." But then I thought
about it some, and realized that:
The distribution of the time of the k'th record asymptotically
approaches a lognormal distribution for sufficiently large k.
For those without much statistics background let me say that a random
variable X is lognormally distributed with parameters a, m and s if
Y = ln(X-a) is normally distributed with mean m and standard deviation
s. The parameter "a" is frequently 0.
Anyway, imagine that instead of a discrete set of events, with
probability 1/n of a record at the n'th event, we have a continuous set
of events -- i.e., a record can be broken at any time. The probability
density of a record is 1/x at time x.
We are interested in the distribution of the time at which the k'th
record is broken, call that quantity E. Instead of looking at E
directly, though, let us look, as suggested by John, at
T = ln(E)
First a basic question. What is the probability that a record will
occur between log-transformed times "A" and "A+L", for "L" short enough
that we can ignore the possibility of two events occurring? The
probability that a record will occur at those transformed times is
the probability that a record will occur between time exp(A) and
exp(A+L) in the untransformed time. That probability is:
int(1/x, x=exp(A)..exp(A+L))
which simply equals "L". The probability is proportional to the length
of the interval, independent of its placement.
This is the condition known classically as a Poisson process, and is
illustrated by radioactive decay. In a Poisson process the time till
the first event (decay or record) is distributed according to an
exponential distribution, and the time till the k'th event is
distributed as the sum of k independent exponentially distributed
variables (which is a Gamma distribution with positive integral
parameter k). The sum of sufficiently well-behaved random variables
approaches a normal distribution, and so it is here: for large k the
Gamma distribution is approximately normal.
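(A small check of the constant-rate claim above -- my own sketch, using
the discrete 1/n rule from .2: the expected number of records among
attempts whose log falls in an interval of length L is about L,
wherever the interval sits:)

    import math

    def expected_records(A, L):
        # Expected records among attempts n with A <= ln(n) < A+L, i.e.
        # the discrete analogue of int(1/x, x=exp(A)..exp(A+L)) = L.
        lo, hi = math.ceil(math.exp(A)), math.ceil(math.exp(A + L))
        return sum(1.0 / n for n in range(lo, hi))

    for A in (3.0, 6.0, 9.0):
        print(A, expected_records(A, 0.5))   # each close to L = 0.5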
If the substitution of a continuous RV (Random Variable) for a
discrete one can be justified, that proves the theorem.
We can replace (or merely view) the discrete RV with a step-function
distribution (since we do not cumulate to 1, I am using "distribution"
here loosely). The continuous 1/x distribution quickly (specifically,
quadratically) becomes flat and horizontal for increasing x, and so
it is a good approximation to the discrete case for large enough "n".
When k becomes large most of the records are taking place at large n,
where the approximation is good. With large enough k the relative
contribution of exactly when the first few records are set becomes
negligible to the overall time, so convergence will occur. QED.
I considered using the "a" parameter for the lognormal distribution to
add in an offset for the error introduced by the continuity
approximation. This would be in the form of an a(k) function
correcting for an error offset for each k. If this could be made to
work (I don't know if it could), it is unneeded for the basic,
asymptotic result. As the later records, occurring at later times where
the approximation is more accurate, become more important to the
overall time, the correction a(k) will converge to a constant value.
The m and s parameters will, however, continue to grow (linearly for
m and by the square root for s) until the effect of a(k) is
negligible. Use of a(k) might cause slightly faster convergence, but
it is not necessary for convergence.
Topher
|