[Search for users] [Overall Top Noters] [List of all Conferences] [Download this site]

Conference rusure::math

Title:	Mathematics at DEC

Moderator:	RUSURE::EDP

Created:	Mon Feb 03 1986
Last Modified:	Fri Jun 06 1997
Last Successful Update:	Fri Jun 06 1997
Number of topics:	2083
Total number of notes:	14613

609.0. "Urn with 36 numbered balls" by ANT::JANZEN (The casual observor) Mon Nov 10 1986 08:55

How do you evaluate the statistics of a program like this?  
Intent: emulate the drawing of 6 numbered balls from an urn with 36 balls.
THe balls are numbered 1 to 36.

100 RANDOMIZE
110 balls = 36
120 ballsdrawn = 6
130 DIM urn(37),result(6)
140 REM PLAY 5 GAMES
150 for game = 1 to 5
160     REM Put 36 balls, numbered 1 to 36 in the urn
170 	for index = 0 to balls-1 step 1
180 		urn(index) = index+1
190 	next index
200     REM mark the position after the last ball in the urn
210 	urn(balls)=99
220     REM Start a game in which you draw 6 balls
230 	for index = 1 to ballsdrawn
240             REM initialize the acculator
250 		number=0
260             REM build the accumulator bit by bit starting with 2**0...2**5
270 		for power = 0 to 5
280                     REM Bat is a bit found by inspecting random number 
290                     REM for being above .5 or below .5
300 			bat = fix(rnd+.5)
310                     REM accumulate the bit into number, in its bit position
320 			number = number + bat*(2**power)
330 		next power
340             REM 0<number<63.  Scale to the number of balls left in the urn
350 		ball = fix(number*((balls-index)/64)+.5)
360             REM Result holds the number of the 
370             REM ball drawn from the random position in the urn
380 		result(index)=urn(ball)
390             REM Subtract one ball from urn, reduce number of balls left
400 		gosub shrink
410 	next index
420 	print
430     REM Sort the 6 numbers in the game
440	gosub sortnum
450	REM Print out the balls
460     for prtnum = 1 to 6
470		print result(prtnum);
480	next prtnum
490 next game
500 goto 680
510 shrink:
520 For pointer=ball to balls-1
530 	urn(pointer)=urn(pointer+1)
540 next pointer
550 return
560 sortnum:
570	for sortpass=1 to 5
580		for sorts=(sortpass) to 6
590			if result(sortpass) > result(sorts) then gosub swap	
600		next sorts
610	next sortpass
620 return
630 swap:
640	tempnum=result(sortpass) 
650     result(sortpass)=result(sorts)
660     result(sorts)=tempnum
670 return
680 end

T.R	Title	User	Personal Name	Date	Lines
609.1	Megabucks	MODEL::YARBROUGH		`Mon Nov 10 1986 12:20`	7
	>How do you evaluate the statistics of a program like this? >Intent: emulate the drawing of 6 numbered balls from an urn with 36 balls. >THe balls are numbered 1 to 36. Evidently you're trying to figure out something about the Mass. Lottery. All that one can say is that any six numbers are as likely to appear as any other six numbers. What kind of evaluation are you seeking?
609.2	Mass has a lottery?	ANT::JANZEN	The casual observor	`Mon Nov 10 1986 14:46`	31
	I was trying to figure out if the program acted like an urn with numbered balls. I think it does on the basis of the procedure it goes through. Here it is: Build a 6-bit binary number from bits found the following way: 1. Get a random number 0<n<1 2. Bit = 1 if n>=.5; Bit = 0 if n< .5 3. Concatenate six of Bit into a 6-bit number. Scale the 6-bit number down to the number of balls currently left in the urn. Use the result to point into an array which had only those balls left listed in it, in any order. Take the number of the ball as the final result. Remove the taken ball from the array urn. Decrement the number_of_balls_left_in_urn. Go to step one until you have 6 numbers. Sort the numbers. ---------- A standard deviation on the numbers doesn't seem pertinent. Perhaps a standard deviation on the first ball is good enough, but it wouldn't check the mechanisms for choosing successive balls. Perhaps Standard Deviation, mean, and mode, for each position separately would be the correct way to check that the program worked correctly. Tom
609.3		CLT::GILBERT	eager like a child	`Mon Nov 10 1986 19:52`	58
	No, it doesn't model it very well. You generate a random integer in the range of 0 thru 63, and want to convert it to a random integer in the range of 0 thru 35 (or so). First, that'll cause some values to be more likely than others. Second, the line that does this: ball = fix(number((balls-index)/64)+.5) only assigns values in the range of 0 thru 34. How about the following, instead? 100 RANDOMIZE 110 balls = 36 120 ballsdrawn = 6 130 DIM urn(36),result(6) 140 REM PLAY 5 GAMES 150 for game = 1 to 5 160 REM Put 36 balls, numbered 1 to 36 in the urn 170 for index = 1 to balls step 1 180 urn(index) = index 190 next index 220 REM Start a game in which you draw 6 balls 225 currballs = balls 230 for index = 1 to ballsdrawn 235 REM Random number in the range of 0..curballs-1 240 ball = fix (rndcurrballs) 250 REM Put that ball into our result array 380 result(index)=urn(ball+1) 390 REM Close ranks, as it were 400 urn(ball+1) = urn(currballs) 401 REM One less ball from which to choose 402 currballs = currballs - 1 410 next index 420 print 430 REM Sort the numbers in the game 440 gosub sortnum 450 REM Print out the balls 460 for prtnum = 1 to ballsdrawn 470 print result(prtnum); 480 next prtnum 490 next game 500 goto 680 501 560 sortnum: 570 for sortpass = 1 to ballsdrawn-1 580 for sorts = sortpass+1 to ballsdrawn 590 if result(sortpass) > result(sorts) then gosub swap 600 next sorts 610 next sortpass 620 return 630 swap: 640 tempnum=result(sortpass) 650 result(sortpass)=result(sorts) 660 result(sorts)=tempnum 670 return 680 end
609.4	How do we test the programs?	ANT::JANZEN	The casual observor	`Tue Nov 11 1986 08:56`	9
	Thanks for finding the error. Here's a fix: 350 ball = fix(number*((balls-index+1)/64)+.5) Now, the quantity balls-index+1 represent the number of balls left in the urn, which it should before being used to scale down number. So, how can I test the goodness of my model or your model? Tom
609.5	Volume II	TURRIS::AMARTIN	Alan H. Martin	`Tue Nov 11 1986 09:58`	7
	Re .4: >So, how can I test the goodness of my model or your model? Read the half of Knuth Volume 2 which is devoted to random numbers. /AHM P. S. I haven't read it, so I can't tell you myself.
609.6	Try the Collision Test	SMURF::JMARTIN	Kill a Type A today: drive 55.	`Tue Nov 11 1986 10:42`	11
	Donald Knuth, "The Art of Computer Programming, Vol.2 / Seminumerical Algorithms (2nd edition, Addison-Wesley, 1981), p. 81. Knuth muddies the waters (of this discussion) by explaining the test in terms of urns and balls. His urns are the number of possible outcomes (about 2 million in this case) and his balls are the number of samples you take of your program. Also see Communications of the ACM, Vol. 27, No. 12, p. 1179 (Dec. 1984) This "Programming Pearls" article discusses efficient methods of picking a sample of N items from a list of M possibilities. --Joe
609.7	Sorry for any inconvenience	SMURF::JMARTIN	Kill a Type A today: drive 55.	`Tue Nov 11 1986 14:46`	1
	That's page 68 of Knuth.
609.8	Reaciton	ANT::JANZEN	Tom LMO2/O23 296-5421	`Tue Nov 11 1986 15:59`	38
	>< Note 609.3 by CLT::GILBERT "eager like a child" > > -< >- > >No, it doesn't model it very well. You generate a random integer in the >range of 0 thru 63, and want to convert it to a random integer in the range >of 0 thru 35 (or so). First, that'll cause some values to be more likely >than others. Second, the line that does this: Which numbers will be more likely? > > ball = fix(number((balls-index)/64)+.5) > >only assigns values in the range of 0 thru 34. > > >How about the following, instead? > >240 ball = fix (rndcurrballs) >380 result(index)=urn(ball+1) >390 REM Close ranks, as it were >400 urn(ball+1) = urn(currballs) >401 REM One less ball from which to choose >402 currballs = currballs - 1 > OK. First of all, the reason I didn't directly use BASIC RND numbers, which you scale to the number of balls left, is that BASIC RND numbers might have an uneven distribution. THe numbers have to be white noise numbers. 0<=number<=63 has to be completely equal probability of an number showing up. BASIC RND numbers might not be like that. However, I do think, having taken a few measurments, that BASIC RND number usually break with a median (50 percentile) at about .5, so I take that point to pick a one or zero. If BASIC RND numbers are at all bell-shaped, then numbers will tend to be mostly around .5, and less near 1 and 0. Measurements do not verify this, but of course it could change with a software update that shouldn't affect my technique. I like the way you efficiently close ranks. TOm
609.9		CLT::GILBERT	eager like a child	`Tue Nov 11 1986 16:05`	19
	re .4 >Thanks for finding the error. Here's a fix: >350 ball = fix(number((balls-index+1)/64)+.5) --- Well, yes, that's a 'fix', but it doesn't fix the problem. The first time that statement is executed, 'number' is an integer uniformly distributed in the range of 0 thru 63. It's a simple matter to determine that 'ball' takes one of the values {0,4,9,13,18,22,27,31} with probability of 1/64 (each), and one of the other values (in the range 0 thru 35) with probability 1/32 (each). I thought you wanted each probability to be 1/36. One way to 'test' your* 'fix' is by tallying the number of times that each ball is 'drawn'. Over several hundred trials, I'd expect that some would be drawn signific�ntly more frequently than others.
609.10		CLT::GILBERT	eager like a child	`Tue Nov 11 1986 16:15`	6
	re .8 BASIC's RND function is supposed to return a uniformly distri�uted pseudo-random number in the range 0 <= rnd < 1. It does this fairly well -- well enough for your use, anyway. Th�re is nothing to suggest that combining multiple RNDs (as you had in .0) will improve the randomness; indeed, it's likely to hurt.
609.11	doesn't LOOK bell shaped	CACHE::MARSHALL	hunting the snark	`Tue Nov 11 1986 16:18`	21
	re .8: > OK. First of all, the reason I didn't directly use BASIC RND numbers, which > you scale to the number of balls left, is that BASIC RND numbers might have > an uneven distribution. THe numbers have to be white noise numbers. > 0<=number<=63 has to be completely equal probability of an number showing > up. BASIC RND numbers might not be like that. To check this out I wrote a program that plotted a histogram of the occurence of numbers output by RND. I scaled the RND by 800 (the width of a 240 screen) then plotted a new point at that "x" coordinate, the "y" coordinate being the number of times I got that number. You would be amazed at how flat that graph looked. It also looked pretty flat throught the entire run, not just after thousands of iterations. / ( ___ ) /// /
609.12	How flat is flat?	ANT::JANZEN	Tom LMO2/O23 296-5421	`Tue Nov 11 1986 17:11`	24
	You call this flat? +/-14% around 100 5 dim histogram(10) 8 for n = 1 to 1000 10 x=fix((rnd*10)) 15 histogram(x)=1+histogram(x) 20 next n 30 for n = 0 to 9 40 print histogram(n) 50 next n $ run rnd 87 98 97 124 87 105 87 93 108 114 Tom
609.13		CLT::GILBERT	eager like a child	`Tue Nov 11 1986 17:17`	6
	re .12 Is the difference between the observed data and a f�at distribution statistically significant? I doubt it. If you refer to Knuth (the Chi-squared test, I believe), you'll be able to determine the probability that a true random number generator would yield such a 'poor' distribution. Please post your results, too. Thanks.
609.14	0<=Number<=63 histogram	ANT::JANZEN	Tom LMO2/O23 296-5421	`Tue Nov 11 1986 17:18`	89
	It gets worse when I histogram my "number." 100 RANDOMIZE 110 balls = 36 120 ballsdrawn = 6 130 DIM histogram(64) 135 for x=0 to 63 136 histogram(x) = 0 137 next x 230 for index = 1 to 500 250 number=0 270 for power = 0 to 5 300 bat = fix(rnd+.5) 320 number = number + bat(2*power) 330 next power 345 histogram(number)=histogram(number)+1 410 next index 500 for x = 0 to 63 510 print histogram(x) 520 next x 680 end $ run 64 5 9 10 8 6 4 11 12 14 11 12 9 10 7 10 8 5 7 8 5 8 8 10 11 6 6 9 7 9 6 13 8 8 4 7 6 4 9 11 8 7 7 8 4 8 7 4 5 7 8 11 10 6 6 10 4 7 3 8 8 10 7 5 11 Tom
609.15	scaled results	ANT::JANZEN	Tom LMO2/O23 296-5421	`Tue Nov 11 1986 17:26`	62
	Here's a histogram done after scaling 100 RANDOMIZE 110 balls = 36 120 ballsdrawn = 6 130 DIM urn(37),result(6),histogram(36) 135 for x = 1 to 36 136 histogram(x)=0 137 next x 150 for game = 1 to 1000 250 number=0 270 for power = 0 to 5 300 bat = fix(rnd+.5) 320 number = number + bat(2power) 330 next power 350 ball = fix(number(balls-index+1)/64) 360 histogram(ball)=histogram(ball)+1 370 next game 400 for x = 1 to 36 410 print histogram(x) 420 next x 680 end $ run scaled 29 30 16 32 36 27 10 43 27 33 15 21 40 18 28 20 33 26 41 46 37 13 29 30 16 36 30 27 15 21 32 31 9 18 36 13 Tom
609.16	answer	ANT::JANZEN	Tom LMO2/O23 296-5421	`Tue Nov 11 1986 17:27`	5
	>< Note 609.13 by CLT::GILBERT "eager like a child" > >re .12 >Please post your results, too. Thanks. I did. In .12 Tom
609.17	my program	CACHE::MARSHALL	hunting the snark	`Tue Nov 11 1986 18:19`	59
	The following is the program I used. You will notice that there are three random-number generators, two of which are rem'ed out. One is the function suggested by HP for the 41C calculator, the other is one a friend suggested. I hope you have a VT240 or 125. / ( ___ ) /// / 1 rem input "k = ";k 2 randomize 10 rem enter regis 20 print chr$(27);"[2J" 30 print chr$(27);"P1p W(V,I3,F3,M1,N0,P1,P(M2),S0)" 40 print "P[0,0]V[][0,400][799,400][799,0][0,0]" 45 rem PRINT "P[10,10]T(A0)'r(n) = frc { e^[";k;"+ r(n-1)]}'" 46 PRINT "P[10,10]T(A0)'r(n) = rnd '" 47 rem PRINT "P[10,10]T(A0)'r(n) = frc { 9821 * r(n-1) + .211327}'" 50 rem ---------------- initialize --------------- 60 dim P(800) 70 r0 = rnd 75 rem ---------- <cntrl>C handler enable -------- 80 e%=ctrlc 90 on error goto 300 100 rem --------------- loop ---------------------- 110 n=n+1 119 rem q = exp(k+r0) 120 rem r1 = q - int(q) 121 rem q = 9821 * r0 + .211327 122 rem r1 = q - int(q) 125 r1 = rnd 130 r0 = r1 140 gosub 200 150 goto 100 160 goto 500 200 rem---------------- plot --------------------- 210 r2 = int(800*r0) 220 P(r2)=P(r2)+1 230 x=r2 240 y=400-P(r2) 250 if y <= 0 then goto 270 255 rem print x,y 260 print "P[";x;",";y;"]V[]" 261 nz=n/1000 262 IF nz <> INT(nz) THEN GOTO 270 263 PRINT "P[10,30]W(R)T'iter =";nz;"k'W(V)" 270 return 300 rem ------------ <cntrl>C handler -------------- 310 resume 320 320 gosub 400 330 goto 500 400 rem ------------- exit regis ------------------- 410 rem exit regis 420 print chr$(27);"\" 430 return 440 rem --------------------------------------------- 500 end
609.18	n+1st terminal characteristic is wrong, no doubt	NOBUGS::AMARTIN	Alan H. Martin	`Tue Nov 11 1986 18:58`	34
	Re .17: I have a Vaxstation-II/GPX. What to I have to do to get bloody Basic programs to run on it besides extracting the note into a file, removing the header, and typing: $ BASIC OLD FOOBAR RUN Except for the screen initially clearing, nothing appears on the screen, unless I toggle the display control mode to "display control characters", at which point it becomes obvious that the program is doing output of REGIS. These are the terminal characteristics: Terminal: _WTA1: Device_Type: VT200_Series Owner: AMARTIN Input: 9600 LFfill: 0 Width: 80 Parity: None Output: 9600 CRfill: 0 Page: 55 Terminal Characteristics: Interactive Echo Type_ahead No Escape Hostsync TTsync Lowercase Tab Wrap Scope No Remote Eightbit Broadcast No Readsync No Form Fulldup No Modem No Local_echo Autobaud No Hangup No Brdcstmbx No DMA No Altypeahd Set_speed Line Editing Insert editing No Fallback No Dialup No Secure server No Disconnect No Pasthru No Syspassword SIXEL Graphics No Soft Characters No Printer Port Numeric Keypad ANSI_CRT Regis No Block_mode Advanced_video Edit_mode DEC_CRT DEC_CRT2 /AHM/THX
609.19		BEING::POSTPISCHIL	Always mount a scratch monkey.	`Tue Nov 11 1986 20:40`	10
	Re .16: >> Please post your results, too. Thanks. > I did. In .12 I believe the results Peter was referring to would be the results of the Chi-square test. -- edp
609.20	more info	EAGLE1::DANTOWITZ	David .. DTN: 226-6957	`Wed Nov 12 1986 08:59`	4
	See note 531 for more on random numbers. David
609.21	Let's do it the hard way	ANT::JANZEN	Tom LMO2/O23 296-5421	`Wed Nov 12 1986 12:32`	9
	re: .17 I ran the program. It shows that there is no hump in the middle, though I had expected one. I also see clearly now why I can't map 64 discrete objects into 36 discrete slots with an equal distribution. 36 doesn't go into 64 without a remainder. Thanks Tom
609.22	Uniform vs normal	MODEL::YARBROUGH		`Wed Nov 12 1986 16:31`	10
	>I ran the program. It shows that there is no hump in the middle, >though I had expected one. I think you are confusing the uniform distribution, which is completly flat, with the normal distribution, which is bell-shaped. The useful property of uniform random numbers, which are (approximatley) produced by random number programs, is that their distribution is NOT normal but is uniform. Lynn
609.23	That's why it got so complicated	ANT::JANZEN	Tom LMO2/O23 296-5421	`Thu Nov 13 1986 11:49`	16
	>I think you are confusing the uniform distribution, which is completly >flat, with the normal distribution, which is bell-shaped. The useful > >Lynn No. I know the difference. I expect the VAX BASIC RND numbers to be normal; that's why I couldn't use them directly distributed over a range of 36 slots. That's why I went to all the rigamarole with building a binary number and scaling it, and then pointing into an array instead of just throwing out duplicates. I got killed in the scaling part. Although, if I had used a number such as 65535 instead of 63, the errors in distribution would have been tiny. Now that I know BASIC RND numbers in VMS are uniform, I could just use them directly, as shown in the programming example earlier. TOm
609.24	Setting expectations	NOBUGS::AMARTIN	Alan H. Martin	`Thu Nov 13 1986 13:27`	12
	Re .23: >I expect[ed?] the VAX BASIC RND numbers to be normal; . . . Why? I've never seen a language (or an implementation of one) which had only one random number generator which intentionally used any distribution except for a uniform distribution. Languages with more than one generator might supply both uniform and normal, but languages with only one invariably use uniform. /AHM/THX
609.25	root of randomness	CACHE::MARSHALL	hunting the snark	`Thu Nov 13 1986 15:29`	11
	re .24: additionally, isn't the uniform distribution really the "fundamental" distribution? That is, once you have a good uniform distribution, it is relatively trivial to generate any other type of distribution? / ( ___ ) /// /
609.26		CLT::GILBERT	eager like a child	`Thu Nov 13 1986 16:08`	17
	re .25: > additionally, isn't the uniform distribution really the "fundamental" > distribution? That is, once you have a good uniform distribution, > it is relatively trivial to generate any other type of distribution? A uniform distribution is the easiest to generate, and is "fundamental". Other distributions are generated by starting with a uniform distribution, but it is by no means trivial -- consider the normal (bell) curve. re .nn: The on-line help for VAX BASIC doesn't mention that RND generates uniformly distributed random numbers, nor does the Language Reference Manual. A BASIC developer down the hall also thought it was "more of a bell-shaped curve". This lapse in BASIC's documentation will be corrected.
609.27	Now we know the answer	NOBUGS::AMARTIN	Alan H. Martin	`Thu Nov 13 1986 17:52`	5
	Re .26: I guess Tom must have eaten lunch with the Basic developer sometime in the past. /AHM/THX