T.R | Title | User | Personal Name | Date | Lines |
---|---|---|---|---|---|
1180.1 | Sorry, no such thing ... | COOKIE::PBERGH | Peter Bergh, DTN 523-3007 | Wed Jan 10 1990 10:32 | 2 |
| To the best of my knowledge, there is no closed form for the integral
of the normal probability distribution function.
|
1180.2 | No can do, can come close | VMSDEV::HALLYB | The Smart Money was on Goliath | Wed Jan 10 1990 11:23 | 5 |
| It's a theorem that there is no closed form for the integral in
terms of elementary functions.
However I recall there's a 5th degree polynomial (or so) that is
quite an accurate approximation, if that will suffice.
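One candidate for the polynomial John recalls (a guess on my part, not
confirmed by the reply) is the fifth-degree fit to erf from Abramowitz &
Stegun, formula 7.1.26, accurate to about 1.5e-7 in absolute error:

```python
import math

# Fifth-degree polynomial approximation to erf(x) for x >= 0
# (Abramowitz & Stegun 7.1.26); absolute error below 1.5e-7.
P = 0.3275911
A = (0.254829592, -0.284496736, 1.421413741, -1.453152027, 1.061405429)

def erf_approx(x):
    t = 1.0 / (1.0 + P * x)
    poly = sum(a * t ** (i + 1) for i, a in enumerate(A))
    return 1.0 - poly * math.exp(-x * x)

def normal_cdf_approx(x):
    # Phi(x) = (1 + erf(x/sqrt(2))) / 2; use symmetry for x < 0.
    if x < 0:
        return 1.0 - normal_cdf_approx(-x)
    return 0.5 * (1.0 + erf_approx(x / math.sqrt(2.0)))

for x in (0.0, 0.5, 1.0, 2.0):
    print(x, normal_cdf_approx(x))
```

The function names are mine; only the constants and the form of the fit
are standard.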
John
|
1180.3 | | ALLVAX::ROTH | It's a bush recording... | Wed Jan 10 1990 13:04 | 4 |
| See note 1136 and some of the replies - there are some routines
that can be adapted nicely to your problem...
- Jim
|
1180.4 | | REGENT::POWERS | | Tue Jan 16 1990 09:15 | 16 |
| > However I recall there's a 5th degree polynomial (or so) that is
> quite an accurate approximation, if that will suffice.
That would be handy....
...as would be some partly tongue-in-cheek background:
1) If we don't have a closed form for the integral, how do we
know the total area under the curve is, in fact, 1.00000......?
2) Presuming that the answer to 1) is based on connections with
binomial distribution and sum of the negative powers of 2,
what is the derivation of the form of the curve as an exponential
of a function of x**2?
- tom powers]
|
1180.5 | A partial answer ... | COOKIE::PBERGH | Peter Bergh, DTN 523-3007 | Tue Jan 16 1990 11:47 | 54 |
| >> 1) If we don't have a closed form for the integral, how do we
>> know the total area under the curve is, in fact, 1.00000......?
The easiest way that I know of to evaluate I(-infinity, +infinity,
e**(-x*x), dx) goes roughly as follows (for infinity, I use the symbol
oo):
Consider I(-oo, +oo, e**(-x*x), dx) * I(-oo, +oo, e**(-y*y), dy) = Z.
Notice that this product is the same as the double integral over the
whole (x,y) plane: II(-oo, +oo, -oo, +oo, e**(-x*x)*e**(-y*y), dx*dy)
which in turn equals II(-oo, +oo, -oo, +oo, e**(-x*x-y*y), dx*dy).
Transforming to polar coordinates, we get that
Z = II(0, 2*PI, 0, +oo, r*e**(-r*r), dtheta*dr)
Here, we can separate the two variables of integration, so
Z = I(0, 2*PI, 1, dtheta) * I(0, +oo, r*e**(-r*r), dr).
These two integrals can easily be evaluated and we get that Z = PI.
Thus, we have proved that I(-oo, +oo, e**(-x*x), dx) = sqrt(PI).
(Note that I haven't bothered to quote chapter and verse of the
appropriate theorems; the integrands are extremely well behaved, so
ordinary Riemann-integration theorems ought to suffice to justify these
calculations.)
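As a quick numerical check of the result (the grid size and the cutoff
at |x| = 10 are arbitrary choices; the tails beyond it contribute less
than e**(-100)):

```python
import math

# Midpoint-rule approximation to I(-oo, +oo, e**(-x*x), dx).
N, L = 200000, 10.0   # grid size and cutoff are arbitrary choices
h = 2 * L / N
total = h * sum(math.exp(-(-L + (k + 0.5) * h) ** 2) for k in range(N))
err = abs(total - math.sqrt(math.pi))
print(total, err)
```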
>> 2) Presuming that the answer to 1) is based on connections with
>> binomial distribution and sum of the negative powers of 2,
>> what is the derivation of the form of the curve as an exponential
>> of a function of x**2?
As you notice, the binomial distribution does not enter into the proof
at all, neither do negative powers of two. I don't know what your
question here is aiming at, but I can tell you of a theorem in
statistics (the central limit theorem) which I think may answer at least
part of your question. The theorem goes roughly as follows:
Given a set of independent random variables with the same distribution
(note that there is no requirement for them to have a binomial
distribution; the law of large numbers doesn't "care" what the
distribution of a single random variable is), the sum of N of these
random variables, once centered and scaled by sqrt(N), will have a
distribution that converges to a normal distribution. This has often
been used to get a quick
approximation to a normally-distributed random variable (one simply
adds enough uniformly-distributed random variables and, presto, the sum
is approximately normally distributed).
(Convergence here means roughly "the distribution of the sum differs
from the normal distribution by an amount that goes to zero as the
number of terms in the sum increases"; the technical name is
convergence in distribution.)
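A sketch of the trick just described (the sample count and seed are my
own choices): twelve uniform(0,1) variables summed, minus 6, have mean 0
and variance 12 * (1/12) = 1:

```python
import random
import statistics

# Sum of 12 uniforms minus 6: approximately N(0,1), since each
# uniform(0,1) variable has mean 1/2 and variance 1/12.
random.seed(1)
samples = [sum(random.random() for _ in range(12)) - 6.0
           for _ in range(50000)]
m = statistics.mean(samples)
s = statistics.pstdev(samples)
print(m, s)   # both should be close to 0 and 1
```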
|
1180.6 | | AITG::DERAMO | Daniel V. {AITG,ZFC}:: D'Eramo | Tue Jan 16 1990 22:19 | 8 |
| re .5,
I believe that the theorem at the end of reply .5 needs
the added condition that the random variables'
distribution have a well defined and finite mean and
variance.
Dan
|
1180.7 | | ALLVAX::ROTH | It's a bush recording... | Wed Jan 17 1990 02:40 | 21 |
| Re .-1
Yes, that's clearly correct on intuitive grounds; there's no way
a PDF that's a set of impulses will converge to a proper Gaussian.
An easy way to see that sums of "nicely distributed" random variables
converge to a normal distribution is that the distribution of their
sum is the convolution of the individual distributions. Convolution
causes smoothing and spreading out; try the simplest case of convolving
rectangular pulses - very quickly a bell-shaped curve results. In
fact, n-fold convolution of a rectangular pulse gives the uniform
B-splines.
          +--+                  +
          |  |                 / \
          |  |       ->       /   \       ->  etc.
       ---+  +---          ---+   +---
    one constant piece    2 linear pieces    3 parabolic pieces...
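A small sketch of the convolution argument (the pulse width and number
of passes are arbitrary): repeatedly convolving a rectangular pulse with
itself gives the triangle, then piecewise-parabolic, then
piecewise-cubic B-spline shapes, each more bell-like:

```python
# Discrete convolution of a rectangular pulse with itself, repeated.
def convolve(a, b):
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

box = [1.0] * 20
cur = box[:]
for _ in range(3):        # box -> triangle -> parabolic -> cubic pieces
    cur = convolve(cur, box)
print(len(cur), max(cur))   # symmetric, with a single central peak
```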
- Jim
|
1180.8 | | REGENT::POWERS | | Wed Jan 17 1990 09:51 | 11 |
| My reference to the binomial theorem was in regard to the physical
demonstration of the normal distribution by dumping the balls over
the pyramid of pegs and seeing how the normal curve appears as a
histogram underneath.
The reference to the negative powers of two comes from this demonstration
(1/2 probability of left or right for each ball at every peg)
and the fact that the sum of 2**(-i) for i=1 to infinity is 1.
Admittedly naive....
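The peg-board demonstration is easy to simulate (the row count, ball
count, and seed here are arbitrary choices):

```python
import random
from collections import Counter

# Each ball goes right with probability 1/2 at each of ROWS pegs,
# so its final bin is a binomial(ROWS, 1/2) variable.
random.seed(2)
ROWS, BALLS = 10, 20000
bins = Counter(sum(random.randint(0, 1) for _ in range(ROWS))
               for _ in range(BALLS))
for k in range(ROWS + 1):
    print('%2d %s' % (k, '#' * (bins[k] // 100)))
```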
- tom]
|
1180.9 | A confirmation and a refutation | COOKIE::PBERGH | Peter Bergh, DTN 523-3007 | Wed Jan 17 1990 10:20 | 19 |
| Re .6: the requirement for a finite variance and a finite expected
value is correct.
<<< Note 1180.8 by REGENT::POWERS >>>
>> My reference to the binomial theorem was in regard to the physical
>> demonstration of the normal distribution by dumping the balls over
>> the pyramid of pegs and seeing how the normal curve appears as a
>> histogram underneath.
According to a book that I read some twenty years ago ("Theory of
probability" by Gnedenko), the fact that the binomial distribution
converges to the normal distribution as the number of trials grows is
due to De Moivre and Laplace, so that demonstration is probably a very
early example of the occurrence (admittedly, only in the limit) of the
normal distribution in nature.
Thus, this is not naive; it is an excellent example of the use of the
theorem in .5.
|
1180.10 | counter-example and two incomplete derivations | PULSAR::WALLY | Wally Neilsen-Steinhardt | Thu Jan 18 1990 13:06 | 57 |
| re: <<< Note 1180.7 by ALLVAX::ROTH "It's a bush recording..." >>>
> Yes, that's clearly correct on intuitive grounds; there's no way
> a PDF that's a set of impulses will converge to a proper Gaussian.
Consider a PDF which is zero everywhere but x=-1 and x=1, and its
values there are such that the integral over the whole line is 1.
Obviously we need integrals something like Stieltjes (and I cannot even
remember how to spell it!) This has zero mean and finite variance,
but sums of these random variables have binomial-shaped distributions,
and they converge to the normal distribution in the very qualified sense
mentioned earlier.
To fail to converge to a normal distribution, the starting distribution
has to lack a finite mean or variance, as previously stated.
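The two-point example is easy to check numerically (n, the sample count,
and the seed are my own choices): the variable taking -1 and +1 with
probability 1/2 each has mean 0 and variance 1, so sums of n of them
scaled by 1/sqrt(n) should look normal:

```python
import random
import statistics

random.seed(3)
n = 400
samples = [sum(random.choice((-1, 1)) for _ in range(n)) / n ** 0.5
           for _ in range(20000)]
m = statistics.mean(samples)
s = statistics.pstdev(samples)
# fraction within one standard deviation of 0; about 0.68 for a Gaussian
within = sum(abs(x) <= 1.0 for x in samples) / len(samples)
print(m, s, within)
```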
I have seen two other derivations for the form of the bell shaped
curve. I could not reproduce either when I tried, but maybe if I put
down what I remember, somebody else will fill in the gaps.
A: Start with any well-known PDF, like the binomial distribution. Let
the parameters in the distribution become very large. Take logs of
both sides, and apply Stirling's Approximation
log n! = n log n - n   (approximately)
to all the factorials. After a bit of algebra (which is what I have
forgotten) you end up with something like
log P = something - (r - n/2)**2 / something
Raise e to the power of both sides and you get the Gaussian. The first
term just becomes the normalization constant. Obviously, this only
proves that a particular PDF converges to the Gaussian, but it is still
interesting.
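A numerical version of derivation A (the values of n and r below are
arbitrary): for the binomial distribution with p = 1/2, the probability
C(n,r)/2**n is already close to the Gaussian with mean n/2 and variance
n/4:

```python
import math

n = 200
mu, var = n / 2.0, n / 4.0
for r in (90, 100, 110):
    binom = math.comb(n, r) / 2 ** n
    gauss = math.exp(-(r - mu) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)
    print(r, binom, gauss)
```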
B: Start with the fact that if you take n samples from any PDF, the
mean of the sampling distribution is the mean of the PDF, and the
variance of the sampling distribution is the variance of the PDF
divided by n. Consider the logarithm of the sampling distribution, and
expand it around its mean:
log P(x-xm) = A0 + A1*(x-xm) + A2*(x-xm)^2 + ...
One part I forgot is how you show that xm is also a maximum and
therefore A1=0. Another part is how A2 remains constant while you
increase n so that the variance decreases to zero, so for all the x
where P is significantly far from zero, higher terms may be ignored.
The result is the limit
log P(x-xm) = A0 + A2 * (x-xm)^2
and you raise e to both sides as above.
Neither of these is a proof of the central limit theorem, but they may
give you a better feeling for where the Gaussian came from.
|
1180.11 | | EVMS::HALLYB | Fish have no concept of fire | Wed Jul 17 1996 13:04 | 13 |
| Here's a problem I've come across that seems intuitively obvious
but no proof comes to mind, other than "visualize it and it's obvious".
Suppose N(I) is the area of the interval I under the standard
normal curve.
Let I be an interval of length dx containing 0 as an interior point.
Let I' be an interval of length dx not containing 0, interior or end.
Claim N(I) > N(I') is obviously true. Is there any rigorous way to
prove this?
John
|
1180.12 | | AUSS::GARSON | DECcharity Program Office | Wed Jul 17 1996 19:56 | 46 |
| re .11
How rigorous does it need to be?
At a quick look the following seems to apply...
Let f be symmetric about x=M and let f be strictly decreasing on [M,oo)
and ignoring the exact constraints on f so that it is sufficiently
integrable ('coz I didn't ever study that stuff) then you need the
following "obvious" results.
int(f,M+a,M+a+d) > d * f(M+a+d) where a >= 0 and d > 0.
[ i.e. if f is monotonic on an interval then the integral is greater than
the width times the value of f at the endpoint where f has the smaller
value ]
and
int(f,M-a,M-b) = int(f,M+a,M+b) where a>b>=0
[ i.e. if f is symmetric about x=M then you can "reflect" about the
line x=M without changing the value of the integral ]
with the more generally applicable
int(f,a,b) + int(f,b,c) = int(f,a,c)
This is where my initial question is relevant. Some would demand that
these three results in turn are proved from first principles and the
definition of the integral.
It is then just a case of splitting the intervals I & I' into the right
pieces so that you can assert that each piece of I has an integral that
is strictly greater than the corresponding piece of I'.
At an even quicker look the correct splitting would seem to be...
split I' and I into two intervals where the width of the part of I'
that is closer to M equals the width of the part of I that is on
the same side of M as I'.
In comparing N() for a piece of I and a piece of I' it may be possible
that these pieces overlap but in that case just eliminate the overlap
(the integrals will be equal) and you are then left comparing integrals
of non-overlapping intervals.
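A numerical spot check of the claim in .11, with an arbitrarily chosen
interval width dx = 0.5 (no substitute for the splitting argument, just
a sanity check):

```python
import math

def N(a, b, steps=100000):
    # midpoint-rule area under the standard normal curve on [a, b]
    h = (b - a) / steps
    s = sum(math.exp(-(a + (k + 0.5) * h) ** 2 / 2.0) for k in range(steps))
    return s * h / math.sqrt(2 * math.pi)

exact = 0.5 * math.erf(1.0 / math.sqrt(2.0))   # N(0, 1) via erf, for checking
mid = N(0.0, 1.0)
print(mid, exact)

dx = 0.5
print(N(-0.2, -0.2 + dx))   # I: contains 0 as an interior point
print(N(1.0, 1.0 + dx))     # I': does not touch 0; smaller area
```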
|