
Conference rusure::math

Title:Mathematics at DEC
Moderator:RUSURE::EDP
Created:Mon Feb 03 1986
Last Modified:Fri Jun 06 1997
Last Successful Update:Fri Jun 06 1997
Number of topics:2083
Total number of notes:14613

1846.0. "The breaking of world records" by BAYES::HEIMANN (Heisenberg may have been here) Fri Feb 18 1994 14:05

With the Olympics going on and world records being broken, I have a
question, for which someone can either answer directly or cite 
a reference:

Let X(n) be the n-th draw from a standard normal distribution (mean 0, 
variance 1), and let Y(n) = max (X(1), ... , X(n)).  Let n0=1, let n1 be the 
first occurrence of n > n0 for which X(n) is in fact the maximum value (i.e., 
Y(n)=X(n)), n2 be the second such occurrence, etc.  What can be said about 
the growth of E(Y(n)) for n=1,2,3,...?  What can be said about the values 
E(n1-n0), E(n2-n1), E(n3-n2), ... ?

The above represents the level and progression of world records being 
broken when there is actually no fundamental improvement in the basic 
performance.  I'm wondering how this compares to the actual breaking 
of Olympic records.

						Thanks,
						David


P.S. -- A related question:  How strongly does the result depend upon the 
specific distribution that X(n) assumes (what if it were not N(0,1))?
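
A quick way to get a feel for the setup is to simulate it.  The sketch below
(a Python sketch, standard library only; the draw count and seed are
arbitrary) draws X(n) from N(0,1), tracks the running maximum Y(n), and
records the indices n1, n2, ... at which new records occur:

        import random

        def simulate_records(num_draws, seed=0):
            # Draw X(1), ..., X(num_draws) from N(0,1); track the running
            # max Y(n) and the indices n at which Y(n) = X(n) (new records).
            rng = random.Random(seed)
            record = float("-inf")
            record_indices = []
            for n in range(1, num_draws + 1):
                x = rng.gauss(0.0, 1.0)
                if x > record:
                    record = x
                    record_indices.append(n)
            return record, record_indices

        final_max, times = simulate_records(100000)
        gaps = [b - a for a, b in zip(times, times[1:])]
        print("final Y(n):", final_max)
        print("record indices:", times)
        print("gaps between successive records:", gaps)
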
1846.1. by NOVA::FINNERTY (Sell high, buy low) Tue Mar 08 1994 13:40
    
    Under those assumptions (which don't seem very realistic), and
    assuming further that the events are independent and occur at
    uniform time intervals, the expected time required to break a
    previous record is the inverse of the probability of the outcome.
    
    So, for an N(0,1) variable, P(X >= value) = 1 - (1/sqrt(2pi)) *
    int(exp(-t^2/2), t=-infinity..value), so the expected time to reach
    'value' is 1/P(X >= value).
    
    Since the events are presumed to be independent, the 'clock'
    resets after each record is set (since you are interested in
    the time between successive records), and the time required
    is just a function of the number of sigmas away from the
    mean that the record represents.
    
    The easiest generalization to make is that the expected time
    between successive records is a strictly increasing function of
    the number of records already set.
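
    As a rough sketch of that calculation (Python; math.erfc supplies the
    normal tail, and the example thresholds are arbitrary):

        import math

        def expected_tries_to_exceed(value):
            # P(X >= value) for X ~ N(0,1), via the complementary error function.
            tail = 0.5 * math.erfc(value / math.sqrt(2.0))
            return 1.0 / tail          # mean of a geometric waiting time

        for v in (1.0, 2.0, 3.0, 4.0):
            print("%.0f-sigma record: ~%.0f attempts on average"
                  % (v, expected_tries_to_exceed(v)))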
    
    /jim
    
1846.2. "A specific result." by CADSYS::COOPER (Topher Cooper) Tue Mar 08 1994 15:09
    Let {X1, X2, ... , Xn} be n independent, identically distributed random
    variables.  Furthermore, assume that the distribution is continuous
    (this means in this context, essentially, that we can neglect the
    possibility of an exact tie), with

	f1 being the probability density function, and
	F1 being the cumulative probability function

    (i.e., F1(x) = int(f1(y),y=-infinity..x)).

    Now let X(n) be a random variable which is the maximum of those n
    X's.  It is easy to show that:

        Fn(x) = the cumulative probability function for X(n) =
	    F1(x)^n;
	fn(x) = the probability density function for X(n) =
	    n*f1(x)*F1(x)^(n-1).

    I'm going to answer the question, "what is the probability that Xn will
    exceed the maximum of the previous n-1 RVs?"

    This probability is given by the "sum" over all possible values that
    Xn can take of the "probability" (from the probability density
    function) that Xn will have that value times the probability that
    the maximum of the previous n-1 values will be less than that value,
    i.e.,

	int(f1(x)*F[n-1](x), x=-infinity..infinity)

    which is:

	int(f1(x)*F1(x)^(n-1), x=-infinity..infinity)

    but the expression being integrated is:

	f1(x)*F1(x)^(n-1) = fn(x)/n

    so the integral is:

	int(fn(x)/n, x=-infinity..infinity) =
	(1/n)*int(fn(x), x=-infinity..infinity)

    Since fn is a probability density function, its integral over the reals
    is 1, so the probability that the n'th try will beat the standing record
    is simply:

	1/n

    independent of the initial distribution (subject to the no-ties and
    other assumptions).

    I think that's a pretty neat result.
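
    A quick Monte Carlo check of this (a Python sketch, standard library
    only; the distributions, sample sizes, and seed are arbitrary choices)
    estimates the chance that the n'th draw beats all earlier draws, for
    two quite different continuous distributions:

        import random

        def prob_nth_is_record(n, draw, trials=100000, seed=1):
            # Estimate P(the n'th draw is the largest of n i.i.d. draws).
            rng = random.Random(seed)
            hits = 0
            for _ in range(trials):
                xs = [draw(rng) for _ in range(n)]
                if xs[-1] == max(xs):
                    hits += 1
            return hits / trials

        for n in (2, 5, 10):
            print(n,
                  "normal: %.3f" % prob_nth_is_record(n, lambda r: r.gauss(0, 1)),
                  "exponential: %.3f" % prob_nth_is_record(n, lambda r: r.expovariate(1.0)),
                  "1/n = %.3f" % (1.0 / n))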

				Topher
1846.3. "interesting result" by NOVA::FINNERTY (Sell high, buy low) Tue Mar 08 1994 16:35
    
    >> It is easy to show that:
    >>
    >>        Fn(x) = the cumulative probabilty function for X(n) =
    >>            F1(x)^n;
    >>        fn(x) = the probability density function for X(n) =
    >>            n*f1(x)*F1(x)^(n-1).
    
    please do!
    
    /jim
    
1846.4. "Proof" by CADSYS::COOPER (Topher Cooper) Wed Mar 09 1994 09:30
RE: .3 (jim)

    Using the same notation as in .2.

    We want to find:

	Fn(x) = prob(X(n) <= x)

	    = prob((X1 <= x) & (X2 <= x) & (X3 <= x) & ... & (Xn <= x))

    Since the Xi's are independent this equals:

	    = prob(X1 <= x) * prob(X2 <= x) * ... * prob(Xn <= x)

    Since the Xi's are identically distributed with cumulative probability
    function F1 this equals:

	    = F1(x) * F1(x) * ... * F1(x)
	    = F1(x)^n

    The probability density function fn(x) can now be found by:

        fn(x) = dFn(x)/dx = d[F1(x)^n]/dx = n*F1(x)^(n-1)*dF1(x)/dx
            = n*f1(x)*F1(x)^(n-1)

		QED
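
    A small numerical check of Fn(x) = F1(x)^n (a Python sketch; uniform
    draws are used so that F1(x) = x, and n, x, and the trial count are
    arbitrary):

        import random

        rng = random.Random(2)
        n, x, trials = 5, 0.7, 200000
        # Empirical P(max of n uniforms <= x) vs. the closed form x^n.
        hits = sum(1 for _ in range(trials)
                   if max(rng.random() for _ in range(n)) <= x)
        print("empirical:", hits / trials, " F1(x)^n:", x ** n)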

					Topher
1846.5. "Paradox?" by VMSDEV::HALLYB (Fish have no concept of fire) Wed Mar 09 1994 11:56
    The next question would likely be along the lines of "How long will it
    take before the world record is broken for the Nth time?" Measuring
    "how long" by number of attempts.
    
    For N=1 we use the "Cooper formula" to note that the record is broken
    for the 1st time with probability 1, i.e., the first attempt will
    obviously set the world record.
    
    Now what about N=2? I.e., how many tries will it take, on average, before the 
    original record is broken for the first time? You might want to stop
    and think about that for a second or two. Half the time the second
    attempt will succeed, etc.
    
    [Pause here and guess the average number of tries before record is broken]
    
    This looks to be a standard series of the form
    
         oo
    	----
    	\    i * Pr(BREAK(i))
    	/     
    	----
         i=2
    
    This is just your expected value formula. To calculate the probability
    the record is broken on the i-th try, let's look at the first few terms:
    
    For i=2 the probability the i=1 record is broken is 1/2. Write this as:
          (1/1) * (1/2) = 1/2.
    
    For i=3 the probability the i=1 record is broken is:
    	The probability the record was not broken at i=2 TIMES 1/3rd.
    I.e., (1/2) * (1/3) = 1/6.
    
    For i=4 the probability the i=1 record is broken is:
    	The probability the record was not broken at (i=2 or i=3) TIMES 1/4th.
    I.e., (1-1/2-1/6) * (1/4) = (1/3) * (1/4)
    
    For i=5 we get (1-1/2-1/6-1/12) * (1/5) = (1/4) * (1/5). And so on.
    
    I think induction can be used to show that at step K the probability
    the record is NOT broken at any prior step is 1/(K-1), so the
    probability of breaking the record at step K is 1/(K-1) * (1/K).
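    
    (As a quick check of that claim, here is a Python sketch -- uniform
    draws, arbitrary trial count -- that estimates the probability that the
    first record break happens on attempt K:)

        import random

        def first_break_step(rng):
            # Attempt 1 sets the record; return the attempt that first beats it.
            record = rng.random()
            k = 1
            while True:
                k += 1
                if rng.random() > record:
                    return k

        rng = random.Random(3)
        trials = 50000
        counts = {}
        for _ in range(trials):
            k = first_break_step(rng)
            counts[k] = counts.get(k, 0) + 1
        for k in (2, 3, 4, 5, 6):
            print(k, "empirical %.4f" % (counts.get(k, 0) / trials),
                  " 1/((K-1)K) = %.4f" % (1.0 / ((k - 1) * k)))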
    
    Well, then, the expected number of attempts is calculated as:
    
         oo
    	----
    	\    	  i
    	 >    --------
    	/     (i-1)(i)
    	----
         i=2
    
    ... which converges to, well....ahhh.....well.... it doesn't converge.
    Each term i/((i-1)(i)) is just 1/(i-1), so it's the Harmonic series.
    
    Which is to say that most world records are never broken.
    
    Can anybody spot any flaws in the above?
    
      John
1846.6. "Resolution." by CADSYS::COOPER (Topher Cooper) Wed Mar 09 1994 14:55
    It does not follow from the fact that there is no mean length of time
    until the initial record is broken that "most world records are never
    broken."  For example, if there were 1 chance in 10^1000 that the
    initial record were never broken then there would be no mean, even if
    excluding those cases where it runs on forever leaves a small mean.
    In this case, though, the lack of a mean does not even require that there
    be a tiny non-zero chance of the record being sustained forever.

    Note, for example, that 2/3'ds of initial records are broken on or
    before the third event.

    Here is another example of this phenomenon which may make things
    clearer.  Imagine some kind of server which accepts "jobs" and
    processes them at a steady rate of 1 each second, and queues them
    (first come first serve) if it is not ready to process them.  How long
    will an incoming job have to wait to be processed if jobs arrive at a
    steady rate of 2 per second?

    Well, the i'th arriving job will arrive at time i/2.  During that time,
    i/2 (truncated, but ignore that for simplicity) jobs will have been
    processed, leaving i/2 jobs unprocessed, so the i'th job will have to
    wait i/2 seconds.

    What is the mean waiting time?

        limit sum(i/(2n), i=1..n)
	n->oo

    which does not converge.  The mean waiting time is infinite.

    Yet each job will only wait a finite amount of time.  Paradoxical only
    in the sense that it is an *apparent* contradiction.
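
    A tiny Python sketch of the queue example (the job counts are
    arbitrary): each job's wait of i/2 seconds is finite, yet the running
    average grows without bound.

        def wait_time(i):
            # The i'th arrival comes at time i/2; about i/2 jobs are still queued.
            return i / 2.0

        for n in (10, 1000, 100000):
            avg = sum(wait_time(i) for i in range(1, n + 1)) / n
            print("first %6d jobs: average wait %.1f seconds" % (n, avg))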

				Topher
1846.7. "Alternate derivation." by CADSYS::COOPER (Topher Cooper) Wed Mar 09 1994 15:19
    Sometimes we do things the hard way first.  Here is a derivation of the
    result in .2 which is simpler and more intuitive.

    Imagine that there have been "n" events.  We can assign to each event a
    rank -- 1 for the lowest result, 2 for the next, etc.  The independence
    assumption means that each permutation of the ranks from 1..n is
    equally likely.  The n'th event will be a record if and only if it has
    rank n.  Since each of the n ranks is equally likely for this position,
    there is one chance in n that it will be rank n. QED

    Also note that the event that no record is set after the first of those
    n events occurs iff the first element has rank n (i.e., is the largest
    of the n values), which also has probability 1/n.  This is the result
    that John got.
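
    For small n the rank argument can be checked by brute force -- a Python
    sketch (the choice of n is arbitrary) enumerating all n! equally likely
    rank orderings:

        from itertools import permutations
        from math import factorial

        n = 5
        perms = list(permutations(range(1, n + 1)))
        last_is_record = sum(1 for p in perms if p[-1] == n)
        first_never_beaten = sum(1 for p in perms if p[0] == n)
        print(last_is_record / factorial(n),
              first_never_beaten / factorial(n),
              "both should equal", 1.0 / n)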

					Topher
1846.8. "independent events" by NOVA::FINNERTY (Sell high, buy low) Thu Mar 10 1994 11:14
    
    >> For i=2 the probability the i=1 record is broken is 1/2. Write this
    >> as:
    >>          (1/1) * (1/2) = 1/2.
    >>
    >> For i=3 the probability the i=1 record is broken is:
    >>      The probability the record was not broken at i=2 TIMES 1/3rd.
    >>   I.e., (1/2) * (1/3) = 1/6.
    
    I interpreted Topher's result differently.  Since the events are
    independent, 
    
    For i=3 the probability the i=1 record is broken is:
    	The probability the record was not broken at i=2 times 1/2
    	I.e., (1/2) * (1/2) = 1/4.
    
    So that after the record has been broken n times, we expect to break
    it after n+1 more tries.  In other words, if the 1st record is not
    broken in the 2nd event, this shouldn't reduce the probability
    of breaking the 1st record in the 3rd event.
    
    /jim
1846.9. "Not independent." by CADSYS::COOPER (Topher Cooper) Thu Mar 10 1994 12:20
RE: .8 (jim)

    There is a subtlety here which took me some time to resolve in my own
    head.  The events are not truly independent.

    If you know precisely what the previous record is, then intervening
    failures do not, as you say, change the probability of a success.  But,
    as stated, you do not know what the previous record is, and knowing
    that there have been x failures shifts the probabilities that you
    would assign to various possible values of that previous record.  A
    lot of failures makes it distinctly more likely that the previous
    record was a high value.  This results in the 1/n probability.

    It is easy to show this for the example in question, by using the
    rank permutations view of the situation (in fact, it was in looking at
    this issue that I came up with the rank permutation based proof).

    Looking at the rank among the first three events:

	Perm.	R1 R2 R3
	----------------
	123	T  T  T
	132	T  T  -
	213	T  -  T
	231	T  T  -
	312	T  -  -
	321	T  -  -

    The Rx column indicates whether a new record was set on event x.

    As expected, all 6 of the permutations produce a record on event 1,
    three of the six (1/2) produce a record on event 2, and two of the six
    (1/3) produce a record on event 3.  Three of the permutations (213, 312
    and 321) fail to produce a record on event two, and of those three, one
    (213) does produce a record on the third event; therefore the probability
    of producing a record on the third event, given that no record was set
    on the second, is 1/3.
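
    The same tabulation can be done mechanically -- a Python sketch over the
    3! rank orderings, checking that the conditional probability matches the
    unconditional 1/3:

        from itertools import permutations

        perms = list(permutations((1, 2, 3)))
        rec_at_3 = [p for p in perms if p[2] == 3]
        no_rec_at_2 = [p for p in perms if p[1] < p[0]]
        both = [p for p in no_rec_at_2 if p[2] == 3]
        print("P(record at 3)             =", len(rec_at_3) / len(perms))
        print("P(record at 3 | none at 2) =", len(both) / len(no_rec_at_2))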

					Topher
1846.10. "Median." by CADSYS::COOPER (Topher Cooper) Thu Mar 10 1994 12:34
    Since the mean cannot be applied, it is reasonable to look to the
    median, which is a more robust measure of "central tendency".

    I'll look at a slightly different question.  You are about to
    attend a series of events, starting with event "n".  You want to
    see the old record broken (though, as discussed, you don't know what
    that record is).  How many events must you attend to have a 50%
    chance of seeing the record broken.

    A bit of playing with Maple gave me the answer:

	    floor[M(n)] + ceiling[M(n)]
	   -----------------------------
			2

    where M(n) = n^2/(n-2).

    In general, the P'th quantile is, ignoring that it must be an integer
    value:

	   ( P*n^2)/(Q*n - 1)

   where Q = 1 - P.
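
    One way to sanity-check this is by direct simulation.  The Python
    sketch below treats the standing record as the best of the first n-1
    events and counts how many further events it takes to beat it, taking
    the median over many trials (this is only one reading of the question
    above, and the n values and trial counts are arbitrary):

        import random, statistics

        def events_until_break(n, rng):
            # Standing record = max of the first n-1 events (uniform draws);
            # count events starting at event n until the record falls.
            record = max(rng.random() for _ in range(n - 1))
            k = 0
            while True:
                k += 1
                if rng.random() > record:
                    return k

        rng = random.Random(4)
        for n in (5, 10, 20):
            waits = [events_until_break(n, rng) for _ in range(5001)]
            print(n, "median number of events attended:", statistics.median(waits))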

				Topher
1846.11. "Independent" by CADSYS::COOPER (Topher Cooper) Fri Mar 11 1994 10:10
RE: .9 (me)

    It seems, on the basis of some EMail, that some sloppy language on my
    part has sown some confusion.

    In the formal statistical sense of the word, the events *are*
    independent, since the probability of event n being a record breaker
    is unchanged given information about whether or not previous events
    were record breakers.

    Our intuition about "independence" is based on situations like coin
    flips where the events in question are (to use a different, if somewhat
    clumsy word) "non-influencing", and common -- as opposed to technical
    -- usage of "independence" has implications of "non-influencing".

    In this example there are some subtle forms of "influence" going on --
    at least if viewed from some natural perspectives -- but these
    influences cancel out in such a way to produce *statistical*
    independence.  Jim was arguing that since the trials are
    non-influencing -- memoryless in a restricted sense -- a failure
    does not *modify* the probability of a record, which compensates
    for the decline implied by the 1/n rule.  That means that the lack of
    influence would create a *statistical* dependence.  I was arguing that
    there *is* an influence (which I called a dependence) and that
    therefore there was *no* statistical dependence -- i.e., the
    probability was still 1/n.

    That may have confused you even more -- but at least you know now that
    you are confused ;-)

					Topher
1846.12. "Some data to speculate upon" by VMSDEV::HALLYB (Fish have no concept of fire) Fri Mar 11 1994 15:09
    Here are some empirical results.
    
    Using the "Cooper Principle" I counted the number of attempts before
    a record would be broken for the 8th time. I ran 10,000 independent
    experiments.
    
    Below is a histogram of the iteration number at which the "record" was
    broken for the 8th time. Note the bins are not linear, but logarithmic.
    Looks like the raw data (iteration number for Ri=8) is lognormal (?).
    
    Good evidence that experimental data are well-behaved even though the
    theoretical mean is undefined. I.e., it just "looks" like a paradox.
    
      John
    
    Average for 10000 iterations:  110457.92 tries
    Minimum was 16, maximum was 99270410
    LOG(-)  Count  Histo
    
        0      0
        1      0
        2      8
        3    327  ****
        4    817  **********
        5    638  *******
        6    940  ***********
        7   1311  ****************
        8   1327  ****************
        9   1342  ****************
       10   1030  ************
       11    836  **********
       12    566  *******
       13    357  ****
       14    203  **
       15    130  *
       16     81  *
       17     34
       18     13
       19      0
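
    For anyone who wants to reproduce the experiment, here is a Python
    sketch (not the original program).  Instead of testing every attempt
    against the 1/n rule, it samples the gap to the next record directly:
    by the rank argument in .7, a record set on attempt n survives through
    attempt m with probability n/m, so the next record attempt can be drawn
    as floor(n/U)+1 for uniform U.  Counts are binned by natural log, as in
    the histogram above.

        import math, random

        def attempt_of_kth_record(k, rng):
            # Attempt 1 always sets a record; sample each later record
            # attempt directly from its n/m survival probability.
            n = 1
            for _ in range(k - 1):
                n = int(n / (1.0 - rng.random())) + 1
            return n

        rng = random.Random(5)
        samples = [attempt_of_kth_record(8, rng) for _ in range(10000)]
        print("average: %.2f  min: %d  max: %d"
              % (sum(samples) / len(samples), min(samples), max(samples)))
        bins = {}
        for s in samples:
            b = int(math.log(s))
            bins[b] = bins.get(b, 0) + 1
        for b in sorted(bins):
            print("%5d %6d  %s" % (b, bins[b], "*" * (bins[b] // 80)))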
    
1846.13. "lognormal!" by CADSYS::COOPER (Topher Cooper) Mon Mar 14 1994 13:00
RE: .12

>    Looks like the raw data (interation number for Ri=8) is lognormal (?).

    Great intuition!

    My first reaction was "It takes a lot more than a roughly symmetric,
    humpbacked distribution to imply lognormality."  But then I thought
    about it some, and realized that:

	The time for the k'th record asymptotically approaches a lognormal
	distribution for sufficiently large k.

    For those without much statistics background let me say that a random
    variable X is lognormally distributed with parameters a, m and s if
    Y = ln(X-a) is normally distributed with mean m and standard deviation
    s.  The parameter "a" is frequently 0.

    Anyway, imagine that instead of a discrete set of events, with
    probability 1/n of for the n'th event we have a continuous set of
    events -- i.e. a record can be broken at any time.  The probability
    density is 1/x for time x.

    We are interested in the distribution of the time at which the k'th
    record is broken, call that quantity E.  Instead of looking at E
    directly, though, let us look, as suggested by John, at

	    T = ln(E)

    First a basic question.  What is the probability that a record will
    occur between log-transformed times "A" and "A+L", for "L" short enough
    that we can ignore the possibility of two events occurring?  The
    probability that a record will occur at those transformed times is
    the probability that a record will occur between time exp(A) and
    exp(A+L) in the untransformed time.  That probability is:

	    int(1/x, x=exp(A)..exp(A+L))

    which simply equals "L".  The probability is proportional to the length
    of the interval independent of the placement.

    This is the condition known classically as a Poisson process, and is
    illustrated by radioactive decay.  In a Poisson process the time till
    the first event (decay or record) is distributed according to an
    exponential distribution, and the time till the k'th event is
    distributed according to the sum of exponential distributions (which
    is a Gamma distribution with positive integral parameter k).  The
    sum of sufficiently well behaved random variables approaches a normal
    distribution, and so it is for the Gamma distribution.

    If the substitution of a continuous RV (Random Variable) for a discrete
    one can be justified that proves the theorem.

    We can replace (or merely view) the discrete RV with a step-function
    distribution (since we do not cumulate to 1, I am using "distribution"
    here loosely).  The continuous, 1/x, distribution quickly (specifically
    quadratically) becomes flat and horizontal for increasing x, and so
    it is a good approximation to the discrete case for large enough "n".
    When k becomes large most of the records are taking place at large n,
    where the approximation is good.  With large enough k the relative
    contribution of exactly when the first few records are set will become
    negligible to the overall time, so convergence will occur.  QED.

    I considered using the "a" parameter for the lognormal distribution to
    add in an offset for the error introduced by the continuity
    approximation.  This would be in the form of an a(k) function
    correcting for an error offset for each k.  If this could be made to
    work, though, (I don't know if it could) it is unneeded for the basic,
    asymptotic result.  As the later records, occurring at later times where
    the approximation is more accurate, become more important to the
    overall time, the correction, a(k) will converge to a constant value.
    The m and s parameters will, however, continue to grow (linearly for
    m and by the square root for s) until the effect of the a(k) will be
    negligible.  Use of a(k) might cause slightly faster convergence, but
    it is not necessary for convergence.
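
    As a rough numerical check (a Python sketch; the record times are
    sampled via the n/m survival probability from .7, and the k values and
    trial counts are arbitrary): if the argument is right, ln(time of the
    k'th record) should look roughly like a Gamma with shape about k-1 --
    approximately normal for large k, with mean near k-1 plus a small
    constant and standard deviation near sqrt(k-1).

        import math, random, statistics

        def attempt_of_kth_record(k, rng):
            # Attempt 1 sets the first record; each later record attempt is
            # sampled from its n/m survival probability (see .7 and .12).
            n = 1
            for _ in range(k - 1):
                n = int(n / (1.0 - rng.random())) + 1
            return n

        rng = random.Random(6)
        for k in (8, 32):
            logs = [math.log(attempt_of_kth_record(k, rng)) for _ in range(20000)]
            print("k=%2d  mean(ln T)=%.2f  sd(ln T)=%.2f  sqrt(k-1)=%.2f"
                  % (k, statistics.mean(logs), statistics.stdev(logs),
                     math.sqrt(k - 1)))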

				    Topher