
Conference rusure::math

Title:Mathematics at DEC
Moderator:RUSURE::EDP
Created:Mon Feb 03 1986
Last Modified:Fri Jun 06 1997
Last Successful Update:Fri Jun 06 1997
Number of topics:2083
Total number of notes:14613

1846.0. "The breaking of world records" by BAYES::HEIMANN (Heisenberg may have been here) Fri Feb 18 1994 14:05

With the Olympics going on and world records being broken, I have a
question, for which someone can either answer directly or cite 
a reference:

Let X(n) be the n-th draw from a standard normal distribution (mean 0, 
variance 1), and let Y(n) = max (X(1), ... , X(n)).  Let n0=1, let n1 be the 
first occurrence of n > n0 for which X(n) is in fact the maximum value (i.e., 
Y(n)=X(n)), n2 be the second such occurrence, etc.  What can be said about 
the growth of E(Y(n)) for n=1,2,3,...?  What can be said about the values 
E(n1-n0), E(n2-n1), E(n3-n2), ... ?

The above represents the level and progression of world records being 
broken when there is actually no fundamental improvement in the basic 
performance.  I'm wondering how this compares to the actual breaking 
of Olympic records.

						Thanks,
						David


P.S. -- A related question:  How strongly does the result depend upon the 
specific distribution that X(n) assumes (what if it were not N(0,1))?
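
A quick way to get a feel for the setup is to simulate it.  The sketch below
(a Python sketch, standard library only; the draw count and seed are
arbitrary) draws X(n) from N(0,1), tracks the running maximum Y(n), and
records the indices n1, n2, ... at which new records occur:

        import random

        def simulate_records(num_draws, seed=0):
            # Draw X(1), ..., X(num_draws) from N(0,1); track the running
            # max Y(n) and the indices n at which Y(n) = X(n) (new records).
            rng = random.Random(seed)
            record = float("-inf")
            record_indices = []
            for n in range(1, num_draws + 1):
                x = rng.gauss(0.0, 1.0)
                if x > record:
                    record = x
                    record_indices.append(n)
            return record, record_indices

        final_max, times = simulate_records(100000)
        gaps = [b - a for a, b in zip(times, times[1:])]
        print("final Y(n):", final_max)
        print("record indices:", times)
        print("gaps between successive records:", gaps)
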
1846.1. by NOVA::FINNERTY (Sell high, buy low) Tue Mar 08 1994 13:40
    
    Under those assumptions (which don't seem very realistic), and
    assuming further that the events are independent and occur at
    uniform time intervals, the expected time required to break a
    previous record is the inverse of the probability of the outcome.
    
    So, for an N(0,1) variable, P(X >= value) = 1 - (1/sqrt(2pi)) *
    int(exp(-t^2/2), t=-infinity..value), so the expected time to reach
    'value' is 1/P(X >= value).
    
    Since the events are presumed to be independent, the 'clock'
    resets after each record is set (since you are interested in
    the time between successive records), and the time required
    is just a function of the number of sigmas away from the
    mean that the record represents.
    
    The easiest generalization to make is that the expected time
    between successive records is a strictly increasing function of
    the number of records already set.
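
    As a rough sketch of that calculation (Python; math.erfc supplies the
    normal tail, and the example thresholds are arbitrary):

        import math

        def expected_tries_to_exceed(value):
            # P(X >= value) for X ~ N(0,1), via the complementary error function.
            tail = 0.5 * math.erfc(value / math.sqrt(2.0))
            return 1.0 / tail          # mean of a geometric waiting time

        for v in (1.0, 2.0, 3.0, 4.0):
            print("%.0f-sigma record: ~%.0f attempts on average"
                  % (v, expected_tries_to_exceed(v)))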
    
    /jim
    
1846.2. "A specific result." by CADSYS::COOPER (Topher Cooper) Tue Mar 08 1994 15:09
    Let {X1, X2, ... , Xn} be n independent, identically distributed random
    variables.  Furthermore, assume that the distribution is continuous
    (this means in this context, essentially, that we can neglect the
    possibility of an exact tie), with

	f1 being the probability density function, and
	F1 being the cumulative probability function

    (i.e., F1(x) = int(f1(y),y=-infinity..x)).

    Now let X(n) be a random variable which is the maximum of those n
    X's.  It is easy to show that:

        Fn(x) = the cumulative probability function for X(n) =
	    F1(x)^n;
	fn(x) = the probability density function for X(n) =
	    n*f1(x)*F1(x)^(n-1).

    I'm going to answer the question, "what is the probability that Xn will
    exceed the maximum of the previous n-1 RVs?"

    This probability is given by the "sum" over all possible values that
    Xn can take of the "probability" (from the probability density
    function) that Xn will have that value times the probability that
    the maximum of the previous n-1 values will be less than that value,
    i.e.,

	int(f1(x)*F[n-1](x), x=-infinity..infinity)

    which is:

	int(f1(x)*F1(x)^(n-1), x=-infinity..infinity)

    but the expression being integrated is:

	f1(x)*F1(x)^(n-1) = fn(x)/n

    so the integral is:

	int(fn(x)/n, x=-infinity..infinity) =
	(1/n)*int(fn(x), x=-infinity..infinity)

    Since fn is a probability density function, its integral over the reals
    is 1, so the probability that the n'th try will beat the standing record
    is simply:

	1/n

    independent of the initial distribution (subject to the no-ties and
    other assumptions).

    I think that's a pretty neat result.
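
    A quick Monte Carlo check of this (a Python sketch, standard library
    only; the distributions, sample sizes, and seed are arbitrary choices)
    estimates the chance that the n'th draw beats all earlier draws, for
    two quite different continuous distributions:

        import random

        def prob_nth_is_record(n, draw, trials=100000, seed=1):
            # Estimate P(the n'th draw is the largest of n i.i.d. draws).
            rng = random.Random(seed)
            hits = 0
            for _ in range(trials):
                xs = [draw(rng) for _ in range(n)]
                if xs[-1] == max(xs):
                    hits += 1
            return hits / trials

        for n in (2, 5, 10):
            print(n,
                  "normal: %.3f" % prob_nth_is_record(n, lambda r: r.gauss(0, 1)),
                  "exponential: %.3f" % prob_nth_is_record(n, lambda r: r.expovariate(1.0)),
                  "1/n = %.3f" % (1.0 / n))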

				Topher
1846.3. "interesting result" by NOVA::FINNERTY (Sell high, buy low) Tue Mar 08 1994 16:35
    
    >> It is easy to show that:
    >>
    >>        Fn(x) = the cumulative probabilty function for X(n) =
    >>            F1(x)^n;
    >>        fn(x) = the probability density function for X(n) =
    >>            n*f1(x)*F1(x)^(n-1).
    
    please do!
    
    /jim
    
1846.4. "Proof" by CADSYS::COOPER (Topher Cooper) Wed Mar 09 1994 09:30
RE: .3 (jim)

    Using the same notation as in .2.

    We want to find:

	Fn(x) = prob(X(n) <= x)

	    = prob((X1 <= x) & (X2 <= x) & (X3 <= x) & ... & (Xn <= x))

    Since the Xi's are independent this equals:

	    = prob(X1 <= x) * prob(X2 <= x) * ... * prob(Xn <= x)

    Since the Xi's are identically distributed with cumulative probability
    function F1 this equals:

	    = F1(x) * F1(x) * ... * F1(x)
	    = F1(x)^n

    The probability density function fn(x) can now be found by:

        fn(x) = dFn(x)/dx = d[F1(x)^n]/dx = n*F1(x)^(n-1)*dF1(x)/dx
            = n*f1(x)*F1(x)^(n-1)

		QED
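
    A small numerical check of Fn(x) = F1(x)^n (a Python sketch; uniform
    draws are used so that F1(x) = x, and n, x, and the trial count are
    arbitrary):

        import random

        rng = random.Random(2)
        n, x, trials = 5, 0.7, 200000
        # Empirical P(max of n uniforms <= x) vs. the closed form x^n.
        hits = sum(1 for _ in range(trials)
                   if max(rng.random() for _ in range(n)) <= x)
        print("empirical:", hits / trials, " F1(x)^n:", x ** n)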

					Topher
1846.5. "Paradox?" by VMSDEV::HALLYB (Fish have no concept of fire) Wed Mar 09 1994 11:56
    The next question would likely be along the lines of "How long will it
    take before the world record is broken for the Nth time?" Measuring
    "how long" by number of attempts.
    
    For N=1 we use the "Cooper formula" to note that the record is broken
    for the 1st time with probability 1, i.e., the first attempt will
    obviously set the world record.
    
    Now what about N=2? I.e., how many tries will it take, on average, before the 
    original record is broken for the first time? You might want to stop
    and think about that for a second or two. Half the time the second
    attempt will succeed, etc.
    
    [Pause here and guess the average number of tries before record is broken]
    
    This looks to be a standard series of the form
    
         oo
    	----
    	\    i * Pr(BREAK(i))
    	/     
    	----
         i=2
    
    This is just your expected value formula. To calculate the probability
    the record is broken on the i-th try, let's look at the first few terms:
    
    For i=2 the probability the i=1 record is broken is 1/2. Write this as:
          (1/1) * (1/2) = 1/2.
    
    For i=3 the probability the i=1 record is broken is:
    	The probability the record was not broken at i=2 TIMES 1/3rd.
    I.e., (1/2) * (1/3) = 1/6.
    
    For i=4 the probability the i=1 record is broken is:
    	The probability the record was not broken at (i=2 or i=3) TIMES 1/4th.
    I.e., (1-1/2-1/6) * (1/4) = (1/3) * (1/4)
    
    For i=5 we get (1-1/2-1/6-1/12) * (1/5) = (1/4) * (1/5). And so on.
    
    I think induction can be used to show that at step K the probability
    the record is NOT broken at any prior step is 1/(K-1), so the
    probability of breaking the record at step K is 1/(K-1) * (1/K).
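    
    (As a quick check of that claim, here is a Python sketch -- uniform
    draws, arbitrary trial count -- that estimates the probability that the
    first record break happens on attempt K:)

        import random

        def first_break_step(rng):
            # Attempt 1 sets the record; return the attempt that first beats it.
            record = rng.random()
            k = 1
            while True:
                k += 1
                if rng.random() > record:
                    return k

        rng = random.Random(3)
        trials = 50000
        counts = {}
        for _ in range(trials):
            k = first_break_step(rng)
            counts[k] = counts.get(k, 0) + 1
        for k in (2, 3, 4, 5, 6):
            print(k, "empirical %.4f" % (counts.get(k, 0) / trials),
                  " 1/((K-1)K) = %.4f" % (1.0 / ((k - 1) * k)))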
    
    Well, then, the expected number of attempts is calculated as:
    
         oo
    	----
    	\    	  i
    	 >    --------
    	/     (i-1)(i)
    	----
         i=2
    
    ... which converges to, well....ahhh.....well.... it doesn't converge.
    Each term i/((i-1)(i)) is just 1/(i-1), so it's the Harmonic series.
    
    Which is to say that most world records are never broken.
    
    Can anybody spot any flaws in the above?
    
      John
1846.6. "Resolution." by CADSYS::COOPER (Topher Cooper) Wed Mar 09 1994 14:55
    It does not follow from the fact that there is no mean length of time
    until the initial record is broken that "most world records are never
    broken."  For example, if there were 1 chance in 10^1000 that the
    initial record were never broken then there would be no mean, even if
    excluding those cases where it runs on forever leaves a small mean.
    In this case, though, the lack of a mean does not even require that there
    be a tiny non-zero chance of the record being sustained forever.

    Note, for example, that 2/3'ds of initial records are broken on or
    before the third event.

    Here is another example of this phenomenon which may make things
    clearer.  Imagine some kind of server which accepts "jobs" and
    processes them at a steady rate of 1 each second, and queues them
    (first come first serve) if it is not ready to process them.  How long
    will an incoming job have to wait to be processed if jobs arrive at a
    steady rate of 2 per second?

    Well, the i'th arriving job will arrive at time i/2.  During that time,
    i/2 (truncated, but ignore that for simplicity) jobs will have been
    processed, leaving i/2 jobs unprocessed, so the i'th job will have to
    wait i/2 seconds.

    What is the mean waiting time?

        limit sum(i/(2n), i=1..n)
	n->oo

    which does not converge.  The mean waiting time is infinite.

    Yet each job will only wait a finite amount of time.  Paradoxical only
    in the sense that it is an *apparent* contradiction.
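
    A tiny Python sketch of the queue example (the job counts are
    arbitrary): each job's wait of i/2 seconds is finite, yet the running
    average grows without bound.

        def wait_time(i):
            # The i'th arrival comes at time i/2; about i/2 jobs are still queued.
            return i / 2.0

        for n in (10, 1000, 100000):
            avg = sum(wait_time(i) for i in range(1, n + 1)) / n
            print("first %6d jobs: average wait %.1f seconds" % (n, avg))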

				Topher
1846.7. "Alternate derivation." by CADSYS::COOPER (Topher Cooper) Wed Mar 09 1994 15:19
    Sometimes we do things the hard way first.  Here is a derivation of the
    result in .2 which is simpler and more intuitive.

    Imagine that there have been "n" events.  We can assign to each event a
    rank -- 1 for the lowest result, 2 for the next, etc.  The independence
    assumption means that each permutation of the ranks from 1..n is
    equally likely.  The n'th event will be a record if and only if it has
    rank n.  Since each of the n ranks is equally likely for this position,
    there is one chance in n that it will be rank n. QED

    Also note that the event that no record is set after the first of those
    n events occurs iff the first element has rank n (i.e., is the largest
    of the n values), which also has probability 1/n.  This is the result
    that John got.
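
    For small n the rank argument can be checked by brute force -- a Python
    sketch (the choice of n is arbitrary) enumerating all n! equally likely
    rank orderings:

        from itertools import permutations
        from math import factorial

        n = 5
        perms = list(permutations(range(1, n + 1)))
        last_is_record = sum(1 for p in perms if p[-1] == n)
        first_never_beaten = sum(1 for p in perms if p[0] == n)
        print(last_is_record / factorial(n),
              first_never_beaten / factorial(n),
              "both should equal", 1.0 / n)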

					Topher
1846.8. "independent events" by NOVA::FINNERTY (Sell high, buy low) Thu Mar 10 1994 11:14
    
    >> For i=2 the probability the i=1 record is broken is 1/2. Write this
    >> as:
    >>          (1/1) * (1/2) = 1/2.
    >>
    >> For i=3 the probability the i=1 record is broken is:
    >>      The probability the record was not broken at i=2 TIMES 1/3rd.
    >>   I.e., (1/2) * (1/3) = 1/6.
    
    I interpreted Topher's result differently.  Since the events are
    independent, 
    
    For i=3 the probability the i=1 record is broken is:
    	The probability the record was not broken at i=2 times 1/2
    	I.e., (1/2) * (1/2) = 1/4.
    
    So that after the record has been broken n times, we expect to break
    it after n+1 more tries.  In other words, if the 1st record is not
    broken in the 2nd event, this shouldn't reduce the probability
    of breaking the 1st record in the 3rd event.
    
    /jim
1846.9. "Not independent." by CADSYS::COOPER (Topher Cooper) Thu Mar 10 1994 12:20
RE: .8 (jim)

    There is a subtlety here which took me some time to resolve in my own
    head.  The events are not truly independent.

    If you know precisely what the previous record is, then intervening
    failures do not, as you say, change the probability of a success.  But,
    as stated, you do not know what the previous record is, and knowing
    that there have been x failures shifts the probabilities that you
    would assign to various possible values of that previous record.  A
    lot of failures makes it distinctly more likely that the previous
    record was a high value.  This results in the 1/n probability.

    It is easy to show this for the example in question, by using the
    rank permutations view of the situation (in fact, it was in looking at
    this issue that I came up with the rank permutation based proof).

    Looking at the rank among the first three events:

	Perm.	R1 R2 R3
	----------------
	123	T  T  T
	132	T  T  -
	213	T  -  T
	231	T  T  -
	312	T  -  -
	321	T  -  -

    The Rx column indicates whether a new record was set on event x.

    As expected, all 6 of the permutations produce a record on event 1,
    three of the six (1/2) produce a record on event 2, and two of the six
    (1/3) produce a record on event 3.  Three of the permutations (213, 312
    and 321) fail to produce a record on event two, and of those three, one
    (213) does produce a record on the third event; therefore the probability
    of producing a record on the third event, given that no record was set
    on the second, is 1/3.
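
    The same tabulation can be done mechanically -- a Python sketch over the
    3! rank orderings, checking that the conditional probability matches the
    unconditional 1/3:

        from itertools import permutations

        perms = list(permutations((1, 2, 3)))
        rec_at_3 = [p for p in perms if p[2] == 3]
        no_rec_at_2 = [p for p in perms if p[1] < p[0]]
        both = [p for p in no_rec_at_2 if p[2] == 3]
        print("P(record at 3)             =", len(rec_at_3) / len(perms))
        print("P(record at 3 | none at 2) =", len(both) / len(no_rec_at_2))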

					Topher
1846.10. "Median." by CADSYS::COOPER (Topher Cooper) Thu Mar 10 1994 12:34
    Since the mean cannot be applied, it is reasonable to look to the
    median, which is a more robust measure of "central tendency".

    I'll look at a slightly different question.  You are about to
    attend a series of events, starting with event "n".  You want to
    see the old record broken (though, as discussed, you don't know what
    that record is).  How many events must you attend to have a 50%
    chance of seeing the record broken.

    A bit of playing with Maple gave me the answer:

	    floor[M(n)] + ceiling[M(n)]
	   -----------------------------
			2

    where M(n) = n^2/(n-2).

    In general, the P'th quantile is, ignoring that it must be an integer
    value:

	   ( P*n^2)/(Q*n - 1)

   where Q = 1 - P.
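
    One way to sanity-check this is by direct simulation.  The Python
    sketch below treats the standing record as the best of the first n-1
    events and counts how many further events it takes to beat it, taking
    the median over many trials (this is only one reading of the question
    above, and the n values and trial counts are arbitrary):

        import random, statistics

        def events_until_break(n, rng):
            # Standing record = max of the first n-1 events (uniform draws);
            # count events starting at event n until the record falls.
            record = max(rng.random() for _ in range(n - 1))
            k = 0
            while True:
                k += 1
                if rng.random() > record:
                    return k

        rng = random.Random(4)
        for n in (5, 10, 20):
            waits = [events_until_break(n, rng) for _ in range(5001)]
            print(n, "median number of events attended:", statistics.median(waits))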

				Topher
1846.11. "Independent" by CADSYS::COOPER (Topher Cooper) Fri Mar 11 1994 10:10
RE: .9 (me)

    It seems, on the basis of some EMail, that some sloppy language on my
    part has sown some confusion.

    In the formal statistical sense of the word, the events *are*
    independent, since the probability of event n being a record breaker
    is unchanged given information about whether or not previous events
    were record breakers.

    Our intuition about "independence" is based on situations like coin
    flips where the events in question are (to use a different, if somewhat
    clumsy word) "non-influencing", and common -- as opposed to technical
    -- usage of "independence" has implications of "non-influencing".

    In this example there are some subtle forms of "influence" going on --
    at least if viewed from some natural perspectives -- but these
    influences cancel out in such a way to produce *statistical*
    independence.  Jim was arguing that since the trials are
    non-influencing -- memoryless in a restricted sense -- a failure
    does not *modify* the probability of a record, which compensates
    for the decline implied by the 1/n rule.  That means that the lack of
    influence would create a *statistical* dependence.  I was arguing that
    there *is* an influence (which I called a dependence) and that
    therefore there was *no* statistical dependence -- i.e., the
    probability was still 1/n.

    That may have confused you even more -- but at least you know now that
    you are confused ;-)

					Topher
1846.12. "Some data to speculate upon" by VMSDEV::HALLYB (Fish have no concept of fire) Fri Mar 11 1994 15:09
    Here are some empirical results.
    
    Using the "Cooper Principle" I counted the number of attempts before
    a record would be broken for the 8th time. I ran 10,000 independent
    experiments.
    
    Below is a histogram of the iteration number at which the "record" was
    broken for the 8th time. Note the bins are not linear, but logarithmic.
    Looks like the raw data (iteration number for Ri=8) is lognormal (?).
    
    Good evidence that experimental data are well-behaved even though the
    theoretical mean is undefined. I.e., it just "looks" like a paradox.
    
      John
    
    Average for 10000 iterations:  110457.92 tries
    Minimum was 16, maximum was 99270410
    LOG(-)  Count  Histo
    
        0      0
        1      0
        2      8
        3    327  ****
        4    817  **********
        5    638  *******
        6    940  ***********
        7   1311  ****************
        8   1327  ****************
        9   1342  ****************
       10   1030  ************
       11    836  **********
       12    566  *******
       13    357  ****
       14    203  **
       15    130  *
       16     81  *
       17     34
       18     13
       19      0
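
    For anyone who wants to reproduce the experiment, here is a Python
    sketch (not the original program).  Instead of testing every attempt
    against the 1/n rule, it samples the gap to the next record directly:
    by the rank argument in .7, a record set on attempt n survives through
    attempt m with probability n/m, so the next record attempt can be drawn
    as floor(n/U)+1 for uniform U.  Counts are binned by natural log, as in
    the histogram above.

        import math, random

        def attempt_of_kth_record(k, rng):
            # Attempt 1 always sets a record; sample each later record
            # attempt directly from its n/m survival probability.
            n = 1
            for _ in range(k - 1):
                n = int(n / (1.0 - rng.random())) + 1
            return n

        rng = random.Random(5)
        samples = [attempt_of_kth_record(8, rng) for _ in range(10000)]
        print("average: %.2f  min: %d  max: %d"
              % (sum(samples) / len(samples), min(samples), max(samples)))
        bins = {}
        for s in samples:
            b = int(math.log(s))
            bins[b] = bins.get(b, 0) + 1
        for b in sorted(bins):
            print("%5d %6d  %s" % (b, bins[b], "*" * (bins[b] // 80)))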
    
1846.13. "lognormal!" by CADSYS::COOPER (Topher Cooper) Mon Mar 14 1994 13:00
RE: .12

>    Looks like the raw data (interation number for Ri=8) is lognormal (?).

    Great intuition!

    My first reaction was "It takes a lot more than a roughly symmetric,
    humpbacked distribution to imply lognormality."  But then I thought
    about it some, and realized that:

	The time for the k'th record asymptotically approaches a lognormal
	distribution for sufficiently large k.

    For those without much statistics background let me say that a random
    variable X is lognormally distributed with parameters a, m and s if
    Y = ln(X-a) is normally distributed with mean m and standard deviation
    s.  The parameter "a" is frequently 0.

    Anyway, imagine that instead of a discrete set of events, with
    probability 1/n of for the n'th event we have a continuous set of
    events -- i.e. a record can be broken at any time.  The probability
    density is 1/x for time x.

    We are interested in the distribution of the time at which the k'th
    record is broken, call that quantity E.  Instead of looking at E
    directly, though, let us look, as suggested by John, at

	    T = ln(E)

    First a basic question.  What is the probability that a record will
    occur between log-transformed times "A" and "A+L", for "L" short enough
    that we can ignore the possibility of two events occurring?  The
    probability that a record will occur at those transformed times is
    the probability that a record will occur between time exp(A) and
    exp(A+L) in the untransformed time.  That probability is:

	    int(1/x, x=exp(A)..exp(A+L))

    which simply equals "L".  The probability is proportional to the length
    of the interval independent of the placement.

    This is the condition known classically as a Poisson process, and is
    illustrated by radioactive decay.  In a Poisson process the time till
    the first event (decay or record) is distributed according to an
    exponential distribution, and the time till the k'th event is
    distributed according to the sum of exponential distributions (which
    is a Gamma distribution with positive integral parameter k).  The
    sum of sufficiently well behaved random variables approaches a normal
    distribution, and so it is for the Gamma distribution.

    If the substitution of a continuous RV (Random Variable) for a discrete
    one can be justified that proves the theorem.

    We can replace (or merely view) the discrete RV with a step-function
    distribution (since we do not cumulate to 1, I am using "distribution"
    here loosely).  The continuous, 1/x, distribution quickly (specifically
    quadratically) becomes flat and horizontal for increasing x, and so
    it is a good approximation to the discrete case for large enough "n".
    When k becomes large most of the records are taking place at large n,
    where the approximation is good.  With large enough k the relative
    contribution of exactly when the first few records are set will become
    negligible to the overall time, so convergence will occur.  QED.

    I considered using the "a" parameter for the lognormal distribution to
    add in an offset for the error introduced by the continuity
    approximation.  This would be in the form of an a(k) function
    correcting for an error offset for each k.  If this could be made to
    work, though, (I don't know if it could) it is unneeded for the basic,
    asymptotic result.  As the later records, occurring at later times where
    the approximation is more accurate, become more important to the
    overall time, the correction, a(k) will converge to a constant value.
    The m and s parameters will, however, continue to grow (linearly for
    m and by the square root for s) until the effect of the a(k) will be
    negligible.  Use of a(k) might cause slightly faster convergence, but
    it is not necessary for convergence.
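
    As a rough numerical check (a Python sketch; the record times are
    sampled via the n/m survival probability from .7, and the k values and
    trial counts are arbitrary): if the argument is right, ln(time of the
    k'th record) should look roughly like a Gamma with shape about k-1 --
    approximately normal for large k, with mean near k-1 plus a small
    constant and standard deviation near sqrt(k-1).

        import math, random, statistics

        def attempt_of_kth_record(k, rng):
            # Attempt 1 sets the first record; each later record attempt is
            # sampled from its n/m survival probability (see .7 and .12).
            n = 1
            for _ in range(k - 1):
                n = int(n / (1.0 - rng.random())) + 1
            return n

        rng = random.Random(6)
        for k in (8, 32):
            logs = [math.log(attempt_of_kth_record(k, rng)) for _ in range(20000)]
            print("k=%2d  mean(ln T)=%.2f  sd(ln T)=%.2f  sqrt(k-1)=%.2f"
                  % (k, statistics.mean(logs), statistics.stdev(logs),
                     math.sqrt(k - 1)))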

				    Topher