T.R | Title | User | Personal Name | Date | Lines |
------|-------|------|---------------|------|-------|
531.1 | | CLT::GILBERT | Juggler of Noterdom | Mon Jul 07 1986 13:37 | 5 |
| Try the one used by LISP. It combines two (tested) linear congruential
random number sequences, using the MacLaren-Marsaglia method (which you
can find outlined in Vol 2 of The Art of Computer Programming).
BTW, I suspect this is the same Don MacLaren who is now at DECwest.
|
531.2 | The Spectral test | EAGLEA::DANTOWITZ | David .. DTN: 226-6957 -- LTN2-1/H07 | Mon Jul 07 1986 15:37 | 5 |
|
Does anyone have the code to perform the Spectral test mentioned
in "The Art of Computer Programming"?
David
|
531.3 | A Valid method? | EAGLEA::DANTOWITZ | David .. DTN: 226-6957 -- LTN2-1/H07 | Tue Jul 08 1986 14:32 | 18 |
|
In looking over a current implementation of a random number
generator I found the following algorithm.
Generate a new RN, where 0<=RN<X, given that SEED is the
current random longword (assume a valid algorithm exists
for generating random longwords). RN is derived from the
longword by assigning RN = SEED mod X.
---------------
Is this a valid method if SEED is a random longword?
[ SEED is produced as follows: SEED = SEED*314159269+453806245 ]
|
531.4 | | BEING::POSTPISCHIL | Always mount a scratch monkey. | Tue Jul 08 1986 15:10 | 17 |
| Re .3:
That is not a good method. If X can be divided by two one or more
times, there will be patterns in the numbers generated. For example,
if X is two, you will get 0, 1, 0, 1, 0, 1, . . ., but you would
prefer something like 0, 1, 1, 0, 0, 0, 1, 0, . . .
I will assume that the method of generating SEED is good. What you
should then do is let RN = SEED*X / (magnitude of longword). (If n is
the number of bits in a longword, then the above "magnitude of
longword" should be 2^n.) I am not familiar with the VAX instruction
set, but if it supports multiplying one unsigned longword by another to
get a two longword product, you should multiply SEED and X and take
the more significant longword of the product.
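In C the same scaling is a one-liner, given a double-length multiply (a
sketch only -- on a VAX this would be the EMUL sequence just described,
and the 64-bit type here is an assumption of the sketch):

    #include <stdint.h>

    /* Map a full-range 32-bit seed onto 0..x-1 by taking the more
       significant longword of the product, i.e. floor(seed*x / 2^32). */
    uint32_t scale_high(uint32_t seed, uint32_t x)
    {
        return (uint32_t)(((uint64_t)seed * x) >> 32);
    }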
-- edp
|
531.5 | use product (.4) | LATOUR::JMUNZER | | Tue Jul 08 1986 15:22 | 20 |
| .3 will not be very good for large X's that are not powers of two:
         RN
          |
        X |      /|      /|      /|      /|
          |     / |     / |     / |     / |
          |    /  |    /  |    /  |    /  |
          |   /   |   /   |   /   |   /   |
          |  /    |  /    |  /    |  /    |   /
          | /     | /     | /     | /     |  /
          |/______|/______|/______|/______|_/____|____________ SEED
          0       X       2X      3X      4X   2^32
If 2^32 = A * X + B, then the values 0 <= RN < B each occur A+1 times
while the values B <= RN < X occur only A times -- the first group
happens too often, the second too seldom.  (For example, with X = 10,
2^32 = 429496729 * 10 + 6, so 0..5 each turn up one extra time.)
John
|
531.6 | | JON::MORONEY | Madman | Tue Jul 08 1986 15:31 | 14 |
| I know that the method
RANDOM=SEED
SEED=(SEED*A+B) mod 2^N
where N is the wordsize of the computer in question, and A and B are specially
chosen constants, is widely used in computer "random" number generation. I
don't have the method used to determine the "best" A and B except that they
depend on N and I seem to remember that A is chosen so its lower 3 digits
are z21, where z is one of {0,2,4,6,8}. Will try to see if I can find more
info tonight.
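In C, one step of that method looks like this (a sketch; the mod 2^N is
free in unsigned arithmetic, and the 69069/1 constants are the ones
MTH$RANDOM uses, mentioned later in this topic -- not a claim about the
"best" A and B):

    #include <stdint.h>

    static uint32_t seed = 1;

    /* seed = (seed*A + B) mod 2^N, with N = 32: the wrap-around of
       unsigned arithmetic supplies the modulus for free. */
    uint32_t lcg_next(void)
    {
        seed = seed * 69069u + 1u;
        return seed;
    }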
-Mike
|
531.7 | | CLT::GILBERT | Juggler of Noterdom | Tue Jul 08 1986 18:15 | 2 |
| re .3,.4
For the given generator, X = 17 yields the least random sequence.
|
531.8 | | EAGLEA::DANTOWITZ | David .. DTN: 226-6957 -- LTN2-1/H07 | Tue Jul 08 1986 18:20 | 19 |
|
Yes, I agree the proven method is to think of the seed as a fraction
between 0 and 1 (divide the seed by 2^32).  The simplest way to
get a new seed on a VAX is to use EMUL, which gives you (SEED*A+B).
Then you take the upper longword as the new seed, which is equivalent
to dividing by 2^32. The same method can be used to obtain a RN:
RN=upper longword(seed*X+0).
Generating a new seed this way can only be done if you are using
2^32 as the divisor. If you use another value you must also perform
a division (EDIV on the VAX).
The reason I asked about .3 is because it is used presently -- a
perversion of something from Knuth, by someone long ago -- and I
wouldn't change it unless it proved to be poor.
Using linear congruence is a simple and proven method; what is slightly
tricky is the selection of a and m (seed = (a*seed + b) mod m).
|
531.9 | | BEING::POSTPISCHIL | Always mount a scratch monkey. | Wed Jul 09 1986 10:03 | 25 |
| Re .6:
> I don't have the method used to determine the "best" A and B except
> that they depend on N and I seem to remember that A is chosen so its
> lower 3 digits are z21, where z is one of {0,2,4,6,8}.
That may have some validity when the modulus is a multiple of a
power of ten, but I don't think it holds for powers of two. Here
is what Knuth has to say in Volume Two of _The Art of Computer
Programming: Seminumerical Algorithms_, sections 3.2.1.2 and 3.2.1.3:
Theorem A. The linear congruential sequence defined by m, a, c,
and X0 [Xnext = a*Xcurrent + c mod m] has period length m if and
only if
i) c is relatively prime to m;
ii) b = a-1 is a multiple of p, for every prime p dividing m;
iii) b is a multiple of 4, if m is a multiple of 4.
The _potency_ of a linear congruential sequence [which you want
to be as large as possible] is defined to be the least integer s
such that b^s is congruent to zero modulo m.
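Theorem A is mechanical enough to check by brute force; a C sketch
(trial division, purely illustrative):

    #include <stdint.h>

    static uint64_t gcd(uint64_t a, uint64_t b)
    {
        while (b != 0) { uint64_t t = a % b; a = b; b = t; }
        return a;
    }

    /* Nonzero iff x' = (a*x + c) mod m has full period m, per Theorem A. */
    int full_period(uint64_t m, uint64_t a, uint64_t c)
    {
        uint64_t b = a - 1, n = m, p;

        if (gcd(c, m) != 1) return 0;               /* condition (i)   */
        for (p = 2; p <= n; p++)                    /* condition (ii)  */
            if (n % p == 0) {                       /* p is prime here */
                if (b % p != 0) return 0;
                while (n % p == 0) n /= p;
            }
        if (m % 4 == 0 && b % 4 != 0) return 0;     /* condition (iii) */
        return 1;
    }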
-- edp
|
531.10 | | CLT::GILBERT | Juggler of Noterdom | Wed Jul 09 1986 14:08 | 32 |
| re .8
> The simplest way to
> get a new seed on a VAX is to use EMUL, which gives you (SEED*A+B).
> Then you take the upper longword as the new seed, which is equivalent
> to dividing by 2^32.
Yuck!  If we let w = A/(2^32) be a real between 0 and 1, this always
gives new_seed <= old_seed*w + 1.  Thus, the seed gets smaller and
smaller, eventually getting stuck at zero.
A plausible approach is to take the *middle* 32 bits (instead of the
top-most). This is much better, but still not to be recommended,
especially since there are even better methods.
> The reason I asked about .3 is because it is used presently -- a
> perversion of something from Knuth, by someone long ago -- and I
> wouldn't change it unless it proved to be poor.
It's poor. Consider generating random numbers between 0 and 9; it will
generate one of the following five sequences, depending on the initial
value of the seed: 0,5,0,5,..., 1,4,1,4,..., 2,3,2,3,..., 6,9,6,9,...,
or 7,8,7,8,.... Or consider numbers from 0 thru 99; the period of the
sequence (i.e., before it repeats) is either 5 or 20 -- pretty short
(in general, the period could never exceed X). Note that *correct* use
of the seed would give a period of 2^32.
Now I suspect the next seed is *really* generated by: (A*SEED+B) MOD 2^32
(that is, high order bits are ignored). Although this is better, taking
this mod X to produce a random number in the range of 0..(X-1) is still
a bad idea -- if X is 16, for example, it is unaffected by this, and
still has an extremely short period.
|
531.11 | Not to be redundant, | AURORA::HALLYB | Free the quarks! | Wed Jul 09 1986 14:28 | 9 |
| Instead of just doing an EDIV to get a number mod X, it is often
suggested that you treat the rightmost 'b' bits as the fractional
part of a floating point number, multiply by X, and then CVTFL the
result to obtain an integer from {0,...,X-1}.  On a VAX this
could probably be done by ANDing out bits 23:31 and ORing in the
right bit pattern in the exponent field.  That way you tend to ignore
the periodic pattern of the low order bits.
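A portable sketch of the multiply-then-truncate idea (it sidesteps the
VAX bit-twiddling; the 32-bit seed width is an assumption):

    /* Treat the seed as a fraction in [0,1) and scale by X; the result
       depends mostly on the high-order bits, so the periodic low-order
       bits are largely ignored. */
    unsigned scale_by_float(unsigned long seed, unsigned x)
    {
        return (unsigned)((seed / 4294967296.0) * x);
    }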
John
|
531.12 | | EAGLEA::DANTOWITZ | David .. DTN: 226-6957 -- LTN2-1/H07 | Wed Jul 09 1986 16:44 | 4 |
| Re .10
Using EMUL you would wish to take the low 32 bits, for this is the
value (seed*A+B) mod 2^32.
|
531.13 | LISP does it with ... | EAGLEA::DANTOWITZ | David .. DTN: 226-6957 -- LTN2-1/H07 | Wed Jul 09 1986 17:14 | 25 |
|
The following algorithm is a bit better.  Something like it is used
in LISP to generate a random number {RN: 0 <= RN < X}, when X is an
integer and X < 2^22.  Another method is used for generating numbers
in a range greater than 2^22-1.
each new SEED is chosen by SEED = (SEED*314159269+1) mod (2^31-1)
TABLE is an array of 64 consecutive seeds [0..63]
Y is initialized to the 65th seed.
For each random number (0 <= RN < X) do the following:
j = Y*64/2^31 (treat Y as a 31 bit number and take the top 6 bits)
Y = TABLE[j]
TABLE[j] = new seed (see above)
Output (Y mod X)
This seems to work well (LISP users haven't complained).  I did
a quick test on consecutive byte pairs and found that it works.
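For concreteness, here is the same scheme sketched in C (the names, the
64-bit intermediate, and the initial seed of 1 are mine; the constants
are the ones given above):

    #include <stdint.h>

    #define M31 0x7FFFFFFFul                /* 2^31 - 1 */

    static uint32_t table[64], y, seed = 1;

    static uint32_t next_seed(void)  /* SEED = (SEED*314159269+1) mod (2^31-1) */
    {
        seed = (uint32_t)(((uint64_t)seed * 314159269u + 1u) % M31);
        return seed;
    }

    void rng_init(void)                     /* fill TABLE, then set Y */
    {
        int i;
        for (i = 0; i < 64; i++) table[i] = next_seed();
        y = next_seed();                    /* the 65th seed */
    }

    uint32_t rng_next(uint32_t x)           /* returns RN, 0 <= RN < x */
    {
        int j = y >> 25;                    /* top 6 bits of the 31-bit Y */
        y = table[j];
        table[j] = next_seed();
        return y % x;
    }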
|
531.14 | | CLT::GILBERT | Juggler of Noterdom | Wed Jul 09 1986 18:28 | 8 |
| Now that we've all learned a bit about pseudo-random number generation,
here's a good 'real' problem.
How can we generate a sequence of very large random floating point
numbers (i.e., such as for the 113-bit mantissa in H-floating numbers)?
The periodicity must be large too -- greater than 2^113.  Hopefully
(and here's the rub) generation should be fairly fast (i.e., multiple
precision versions of linear congruence generators are out).
|
531.15 | What's wrong with shift registers? | TAV02::NITSAN | Nitsan Duvdevani, Digital Israel | Thu Jul 10 1986 03:11 | 0 |
531.16 | Question... | GENRAL::TAVARES | | Thu Jul 10 1986 10:52 | 6 |
| re:13 I've been reading this file for quite a while now, and must
confess that I only understand about 1 in 5 words you folks write...
but - how do you initialize a table to 64 seeds, then let Y be the
65th seed?
|
531.17 | | EAGLEA::DANTOWITZ | David .. DTN: 226-6957 -- LTN2-1/H07 | Thu Jul 10 1986 14:52 | 17 |
| Good question. Hope this is better.
the method for obtaining a new seed is seed' = (seed*a+b) mod m.
Initialize (before you start the general algorithm) TABLE to the
first 64 seeds generated by the above method.
    assign  table[0] = seed'   = (seed  *a+b) mod m
            table[1] = seed''  = (seed' *a+b) mod m
            table[2] = seed''' = (seed''*a+b) mod m
            etc.
then after assigning table[63], assign Y the next new seed.
You don't really need to keep each seed in its own variable, so the
actual operation can just assign seed = (seed*a+b) mod m.
|
531.18 | real random | GALLO::JMUNZER | | Thu Jul 10 1986 16:00 | 4 |
| Re .14: how should the numbers be distributed? Uniformly distributed
between an upper and lower bound? Something else?
John
|
531.19 | .14 How Fast? | CHOVAX::YOUNG | Chi-Square | Sun Jul 13 1986 23:05 | 11 |
|
Re .14:
How fast is fast, Peter?  I have a routine that will
return about 2000-3000 results a second on an 8650, and more than
proportionately fewer on a 780 or 785 (why?).  It is written in Fortran
and seems to be very well distributed, BUT I fear that even Macro
could not make it much faster.
-- Barry
|
531.20 | Program & a Mystery | CHOVAX::YOUNG | Chi-Square | Mon Jul 14 1986 03:33 | 137 |
| OK, this is too much.
Re .19, .14
I have modified my fortran program and now get a 4 to 1 increase
on the 8650. I have benchmarked on other machines and the results
baffle me:
        CPU    FPA    RESULTS/SEC    VMS    FORTRAN
        ---    ---    -----------    ---    --------
        8650    Y        13000       4.4    4.4-177
        785     N         1000       4.4    4.4-177
        780     ?           71 (!!)  4.3    4.4-193
What's going on here?!?  My algorithm looks pretty good on an 8650;
how can it be 13 times slower on a 785 and nearly 200 times slower
on a 780?  I am including the code so that others may show me where
I have erred.
I am fairly sure that the periodicity is at LEAST 2^113.  And if
it isn't I am sure that it can be made so.
-- Barry
!********
      PROGRAM TEST_RANDOM_GENERATOR

      REAL*16 TEST_SEED
      REAL*16 TEST_RAND

      TEST_SEED = (1.0 / 3.0)

      CALL LIB$INIT_TIMER
      DO I = 1, 10000
          CALL DUMMY_SUB
      END DO
      CALL LIB$SHOW_TIMER

      CALL LIB$INIT_TIMER
      DO I = 1, 10000
          CALL RAND (TEST_SEED, TEST_RAND)
          TEST_SEED = TEST_RAND
      END DO
      CALL LIB$SHOW_TIMER

      END
!********
      SUBROUTINE DUMMY_SUB (DUMMY1, DUMMY2)
      END
!*********
      OPTIONS /CHECK=NOOVERFLOW
      SUBROUTINE RAND (SEED_IN, RAND_OUT)

      STRUCTURE /OCTA_WORD/
          UNION
              MAP
                  INTEGER*2 WORD_1
                  INTEGER*2 WORD_2
                  INTEGER*4 LWORD_2
                  INTEGER*4 LWORD_3
                  INTEGER*4 LWORD_4
              END MAP
              MAP
                  REAL*16 H_FLOAT
              END MAP
          END UNION
      END STRUCTURE

      RECORD /OCTA_WORD/ SEED_IN
      RECORD /OCTA_WORD/ RAND_OUT
      RECORD /OCTA_WORD/ CONTEXT

      STRUCTURE /LWORD/
          UNION
              MAP
                  INTEGER*4 LWORD
              END MAP
              MAP
                  INTEGER*2 WORD_1
                  INTEGER*2 WORD_2
              END MAP
          END UNION
      END STRUCTURE

      RECORD /LWORD/ TEMP
      RECORD /LWORD/ HOLD
      REAL*16 TEMP_PROD

      RAND_OUT.H_FLOAT = SEED_IN.H_FLOAT
      RAND_OUT.WORD_1 = (2**14) + 1
      RAND_OUT.H_FLOAT = RAND_OUT.H_FLOAT + CONTEXT.H_FLOAT
     +                   + (107.0/109.0)
      RAND_OUT.WORD_1 = (2**14) + 1

      TEMP_PROD = RAND_OUT.H_FLOAT * RAND_OUT.H_FLOAT
      TEMP_PROD = TEMP_PROD * TEMP_PROD
      TEMP_PROD = TEMP_PROD * TEMP_PROD
      TEMP_PROD = TEMP_PROD * TEMP_PROD
      RAND_OUT.H_FLOAT = TEMP_PROD * RAND_OUT.H_FLOAT
      TEMP_PROD = TEMP_PROD * TEMP_PROD
      RAND_OUT.H_FLOAT = TEMP_PROD * RAND_OUT.H_FLOAT
      TEMP_PROD = TEMP_PROD * TEMP_PROD
      RAND_OUT.H_FLOAT = TEMP_PROD * RAND_OUT.H_FLOAT

      TEMP.WORD_2 = RAND_OUT.WORD_1
      RAND_OUT.WORD_1 = (2**14)
      CONTEXT.H_FLOAT = RAND_OUT.H_FLOAT
      TEMP.WORD_1 = RAND_OUT.WORD_2
      HOLD.LWORD = RAND_OUT.LWORD_2
      RAND_OUT.LWORD_2 = RAND_OUT.LWORD_3
      RAND_OUT.LWORD_3 = RAND_OUT.LWORD_4
      RAND_OUT.LWORD_4 = TEMP.LWORD
      RAND_OUT.WORD_2 = HOLD.WORD_2
      END
|
531.21 | CTRL/Z | AURORA::HALLYB | Free the quarks! | Mon Jul 14 1986 13:53 | 13 |
| Well, the 8650 was optimized to handle G- and H-floating, and the
published benchmarks have some of those operations running 19-20
times as fast on an 8650 as on a 780, so the 785 numbers don't look
too far out of the park.
The 780 numbers would make sense if you ran the tests on an old
rev of the 780 microcode. Any other ideas out there?
Finally, what's wrong with the idea of just making consecutive calls
to 32-bit random number generators, and concatenating the resulting
bit patterns? It seems that the LISP algorithm could be adapted
to obtain an arbitrarily large period. (He said, frantically waving
his hands...)
|
531.22 | | CLT::GILBERT | $ no /nono vaxnotes | Mon Jul 14 1986 20:13 | 14 |
| re .20:
MTH$RANDOM produces results at the rate of 30,000/second on an 11-780.
The 'bigger' random number generator should produce results at least at
the same rate (with respect to wall time) on the newer machines.
re .21:
The problem with using a sequence of calls to a 32-bit linear congruential
random number generator is that the period is decreased, and the high-order
portion is not so random -- it repeats roughly the same way it did 2**(32-4)
32-bit numbers ago.
The idea behind the problem is that with faster machines, simulations that
use random numbers will want better random numbers, and will have enough
horsepower so that a period of 2**32 will simply be too small.
|
531.23 | Re-seed from the clock | ERIS::CALLAS | Jon Callas | Tue Jul 15 1986 14:33 | 8 |
| There's a way to get around repeats: every so often, re-seed the
generator from some known random variable. On a multi-tasking VAX
(especially one used for timesharing) the middle 32 bits of the system
clock are good enough. The vagaries of system usage make the clock a
random variable. If you re-seed every few thousand (tens? hundreds?) of
numbers, you won't get repeats (except at random ;-).
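A sketch of the idea in C (standard C's clock() stands in for the VAX
system clock here, and which bits count as "the middle" is an
assumption):

    #include <time.h>

    extern unsigned long seed;          /* whatever the generator uses */

    /* Every few thousand numbers, stir the seed with clock bits. */
    void reseed_from_clock(void)
    {
        seed ^= (unsigned long)clock() >> 8;
    }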
Jon
|
531.24 | how about another constraint? | CLT::GILBERT | $ no /nono vaxnotes | Tue Jul 15 1986 16:58 | 7 |
| A similar approach is used for *real* random number generation.
A counter is incremented in a tight loop at process level, and
the low-order bit is sampled by an AST routine.
Unfortunately, the 'big' random number generator should allow the
random number sequence to be replayed, so that simulations can be
rerun exactly.
|
531.25 | .re 20,21,23 | CHOVAX::YOUNG | Chi-Square | Wed Jul 16 1986 03:01 | 33 |
| Sorry for not replying sooner, but those of us in SWS in the field
are at the mercy of sometimes flaky area routers.  Sometimes very
flaky.
.Re .21, and others:
Thank you all for the help, and all of you who replied were right,
as summed up here in .21.  It still looks weird though.
.Re .22:
By "new" machines, are you referring to the 8600's and higher?
I don't have an 8200, 8300, or 8500 to test on, and I don't have
an up to date VAX performance summary to get the correct ratios
from.  I guess I'll have to leave them to others.
I have again modified my subroutine (.20) and it is now returning
about 20000 results a second on an 8650.  I have also determined
a different algorithm that should meet your requirements, and
will be MUCH faster; however, it is late and I am too tired to code
it just now.  Tomorrow night if the area router is up.
.re .23
The other problem with a clock sampler is that it is not assured
to be random if your program is the only one running. This might
be the case, for instance if you were running a large simulation
in batch overnight. What can happen is that the program and the
operating system may get into a 'resonance' where the program runs
through a loop that causes the random generator to sample the clock
at fixed time intervals, as well as the usual numeric interval.
Sounds unlikely, but I have seen it.
-- Barry
|
531.26 | | ERIS::CALLAS | Jon Callas | Wed Jul 16 1986 17:55 | 8 |
| Yup, re-seeding isn't perfect, and can be downright lousy if you're
standalone.  However, if you're sufficiently paranoid, you can find other
random spoons to stir the pot with (error count of some device, number
of times LIBRTL has been activated since system reboot, etc.). If
you're clever, you can make checks for when you're standalone and not
re-seed, or fall back to other quasi-random things.
Jon
|
531.27 | fast algorithm wanted | JON::MORONEY | Madman | Mon Jul 28 1986 12:50 | 7 |
| I need a FAST random number generator for a PDP-11 that generates integers
in the ranges 0-255 and 0-1500.  Does anyone have suggestions for the
constants to use in the seed'= A*seed+B (mod 65536) for constants A and B? Or
a better algorithm? "Randomness" is not extremely important, so I can live
with a period of 65536.
-Mike
|
531.28 | | BEING::POSTPISCHIL | Always mount a scratch monkey. | Mon Jul 28 1986 13:32 | 13 |
| Re .27:
The American National Standards Institute's draft C standard gives
the function next = next * 1103515245 + 12345 for the random number
generator.  Assuming they knew what they were doing and the generator
is good, you can use those numbers modulo 2^16, i.e., just take
the last 16 bits and use next = next * 20077 + 12345.
If you want to use your own numbers, the required criteria for A
and B are given in .9.
-- edp
|
531.29 | Try Knuth Vol 2 | MODEL::YARBROUGH | | Mon Jul 28 1986 13:38 | 3 |
| See Knuth's The Art of Computer Programming, Vol 2, p. 24, algorithm
A.  The algorithm requires neither * nor /, is very fast, and has a long
period; its randomness properties are not well known.
|
531.30 | You take the high bits and I'll take the low bits ... | EAGLE7::DANTOWITZ | David .. DTN: 226-6957 -- LTN2-1/H07 | Mon Jul 28 1986 14:06 | 11 |
| > you can use those numbers modulo 2^16, i.e., just take
> the last 16 bits and use next = next * 20077 + 12345.
This refers to the last 16 bits of the new NEXT. To
obtain a random number from next you should use
R*NEXT/65536, where your goal is to have random numbers
from 0 to R-1. (The above is easy for R=2^X, just take the
high X bits of NEXT.)
David
|
531.31 | | CLT::GILBERT | It's a Dusey | Mon Jul 28 1986 14:39 | 23 |
| Re .28:
> Assuming they knew what they were doing and the generator
> is good, you can use those numbers modulo 2^16, i.e., just take
> the last 16 bits and use next = next * 54157 + 12345.
Arggh! Not again! If 'x = Ax+B mod Z' is a good random number
generator, there's no guarantee that 'x = Ax+B mod Y' is a good random
number generator.  As it happens, in this case it fits the guidelines,
and so is probably good.
Remember that the random number you want is *not* the low-order
bits of 'x' (or 'next'). Instead, you should treat 'x' as a floating
point number in the range of zero to one, multiply and truncate. In
the case of random numbers in the range 0..255, multiply the 16-bit
number by 256, and divide (with truncation) by 2^16 -- in other words,
take the highest 8 bits of the 16-bit number.
For the range 0..1500, ... it's a nuisance. Note that you have
this problem *regardless* of the random number generator used.
Interestingly, the additive number generator mentioned by Lynn does
not have much theory behind it.
|
531.32 | | JON::MORONEY | Madman | Mon Jul 28 1986 15:58 | 13 |
| Thanks for the input so far. I was really wondering if there were any other
criteria for choosing A and B in the sequence <seed'=seed*A+B> mod 65536 other
than A should be 4N+1 and B odd, from what was mentioned in .9. For example,
A=1, B=3 is a bad choice since the "random" sequence produced is 1,4,7,10,...
Will play with the other algorithm once I find a copy of the book.
For the range 0..1500, I was planning on generating a number 0..2047 (2^11) by
taking the upper 11 bits of the new number, and if >1500, rerunning the random
# generator. Is anything theoretically wrong with this? (besides the time not
being constant)
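For what it's worth, the rejection idea in C (next16() is a hypothetical
stand-in for the 16-bit generator being discussed):

    extern unsigned next16(void);       /* hypothetical: one LCG step */

    /* Draw the top 11 bits (0..2047) and retry anything over 1500. */
    unsigned rand_1500(void)
    {
        unsigned r;
        do
            r = next16() >> 5;          /* upper 11 of 16 bits */
        while (r > 1500);
        return r;
    }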
-Mike
|
531.33 | | BEING::POSTPISCHIL | Always mount a scratch monkey. | Mon Jul 28 1986 16:41 | 37 |
| Re .31:
> Arggh! Not again! If 'x = Ax+B mod Z' is a good random number
> generator, there's no guarantee that 'x = Ax+B mod Y' is a good random
> number generator.
Now, now, you should know me better than that!  If Z and Y are both
powers of some number (and 4 divides Z iff it divides Y), then the
second will be a sequence with the longest possible period for a linear
congruential generator, and with the greatest potency, provided the
first was too.
Re .32:
> I was really wondering if there were any other critera for choosing A
> and B in the sequence <seed'=seed*A+B> mod 65536 other than A should be
> 4N+1 and B odd, from what was mentioned in .9.
.9 also mentions the potency of the sequence.  Consider (A-1)^s.  What is
the smallest positive integer s such that (A-1)^s is a multiple of the
modulus, 65536?  For A = 1, (A-1)^1 is a multiple of 65536 (it's zero,
which is a multiple of everything).  That's bad; you want the minimum s
to be as large as possible.  For the number I put forth, 20077, you
must have (20077-1)^8 before you get a multiple of 65536, and that is
not too bad for a cheap generator.
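A few lines of C make the potency computation concrete (a sketch; it
assumes, as Theorem A requires, that A-1 shares every prime factor of
the modulus, so the loop terminates):

    /* Least s such that (a-1)^s is congruent to 0 mod m. */
    int potency(unsigned long a, unsigned long m)
    {
        unsigned long long b = (a - 1) % m, x = b;
        int s = 1;

        while (x != 0) { x = (x * b) % m; s++; }
        return s;
    }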
> For the range 0..1500, I was planning on generating a number 0..2047
> (2^11) by taking the upper 11 bits of the new number, and if >1500,
> rerunning the random # generator.
That's okay, but you might get a little more speed by getting a
number from 0 to 64499 (and trying again if you get a number too
high) and dividing by 43.
-- edp
|
531.34 | | CLT::GILBERT | It's a Dusey | Mon Jul 28 1986 18:26 | 8 |
| Re .33: ("Arggh! Not again!")
Yes, but 'x = (34157*65536+5)x+12345 mod 2^32' is a plausible random
number generator, while 'x = 5x+12345 mod 2^16' is not.
I believe the criteria being quoted for A should be A mod 8 = 5,
rather than A mod 4 = 1, for what it's worth.
|
531.35 | | BEING::POSTPISCHIL | Always mount a scratch monkey. | Mon Jul 28 1986 18:43 | 15 |
| Re .34:
A = 5 mod 8 follows from A = 1 mod 4 plus the desire for the largest
potency for the generator.
> Yes, but 'x = (34157*65536+5)x+12345 mod 2^32' is a plausible random
> number generator, while 'x = 5x+12345 mod 2^16' is not.
What specific criteria are you using to say "x = 5x+12345 % 2^16" is
not a good generator? I don't think Knuth mentions anything that
covers this (unless it fails the Spectral Test). (I know 5 seems
small, but is that it?) Is "x = 52429x+63067 % 2^16" a good generator?
-- edp
|
531.36 | | ENGINE::ROTH | | Mon Jul 28 1986 22:14 | 14 |
| Just as a practical aside, how important is the speed/randomness
tradeoff? If your application is something like dithering the LSB
of digitized images, or generating fractal mountains, then simply
stepping thru a table of a few thousand (or less) numbers is fine,
and you can have whatever distribution you want (gaussian, flat, etc).
I remember seeing somewhere that Marsaglia endorsed XORing a shift-
register sequence with a normal multiplicative congruential generator
as a very good random number generator. Is there any theory behind
this? It seems like it'd be a little slow. The VAX 69069 + 1 generator
works pretty well for its simplicity (I think it took two multiplies
on the -11, by splitting 69069 into 65536 + 3533).
- Jim
|
531.37 | SPEED | JON::MORONEY | Madman | Wed Jul 30 1986 10:22 | 8 |
| re .36: Speed is more important. I am generating random data for use in a
diagnostic, and it currently can't generate it fast enough. (I have to
generate as many as 1500 random bytes at once at times) I used the x'=Ax+B
(mod 65536) method and it is SLOWER than the 31-bit pseudorandom shift register
sequence I was using before.  Looks like it's lookup tables for the data;
I'll still use the random number generator elsewhere in the program.
-Mike
|
531.38 | | BEING::POSTPISCHIL | Always mount a scratch monkey. | Wed Jul 30 1986 10:30 | 10 |
| Re .37:
I'm surprised the linear congruential method is slower. You are
writing in Macro, aren't you? Computing x' = Ax+B should take exactly
two instructions, assuming x is in a register. You can get a random
byte out of that by doing a MOV to save the new x, then a SWAB and a
MOVB to put the byte where you want it.
-- edp
|
531.39 | | CLT::GILBERT | It's a Dusey | Wed Jul 30 1986 11:29 | 3 |
| For your purposes, even a normally *bad* random number generator should suffice.
How about using x'=x+B mod 2^16, where B is odd and fairly big?
|
531.40 | | JON::MORONEY | Madman | Wed Jul 30 1986 13:01 | 18 |
| re .38:  Surprised me too.  The routine I used was 4 instructions:

        SWAB    R1
        MUL     #x,R1
        ADD     #y,R1
        SWAB    R1              ; put "random" byte in lower byte for MOVB

plus    MOVB    R1,(R2)+

Doubt if a MOV/SWAB is faster than a SWAB/SWAB, but maybe.
The pseudorandom shift register sequence is 7 instructions, but with it, it
runs faster! The processor is an F11 (Micro-PDP) so maybe the MUL instruction
is s-l-o-w on it.
re .39: I am currently planning on using a random # generator to compute an
address, and using the program code as "random" data.
-Mike
|
531.41 | Another idea | AURORA::HALLYB | God throws loaded dice | Sun Aug 03 1986 13:38 | 41 |
| Mike, I don't know what you're up to, but it looks to me like maybe
you're developing an Ethernet diagnostic -- what else needs 1500 bytes?
Not that it matters much, just wondering.
You could consider simply "assembling in" about 8K worth of pre-generated
random numbers and just picking them up one at a time. Example 1:
        MOV     (R2)+,R1        ; Get next number from list
        BNE     OK              ; And proceed if nonzero (normal case)
; Use 0 to flag end of list, if R1=0 we reset the pointer into the buffer
        MOV     #FIRST,R2       ; Where FIRST is the address of an 8K
                                ; table of pregenerated random numbers
OK:     <proceed>
If 8K isn't enough then either enlarge the list, or save a constant
in a register and add it to the value obtained -- example 2:
        MOV     (R2)+,R1        ; Get next number from list
        BNE     OK              ; And proceed if nonzero (normal case)
; Use 0 to flag end of list, if R1=0 we reset the pointer into the buffer
        MOV     #FIRST,R2
; Having cycled back to the front of the list, let's change the
; offset we add to the number
        SOB     R3,OK           ; Start with 8, then 7, 6, 5, 4, 3, 2, 1
        MOV     #8,R3
OK:     ADD     R3,R2
        <proceed>
This approach takes slightly over 2 (or 3) very simple instructions
to execute. You can hand-craft the 8K table at FIRST so that the
generator has a period of 65K, if that is important.
John
|
531.42 | | JON::MORONEY | Madman | Sun Aug 03 1986 14:59 | 14 |
| re .41: You win a cookie! It is an Ethernet diagnostic. I have optimized the
code in .40 by moving all the constants into registers before starting the loop
that generates the bytes, and making it inline rather than being called by a
JSR call (yeah, I know, dumb). Nearly doubled the speed. I applied the same
optimizations on the 7-instruction pseudorandom shift-register method I
mentioned before and got a similar increase in speed. (It is _still_ faster!)
In fact it is now fast enough for me to use. I may be using a modified version
of the code in .41 if I have to squeeze any more microseconds from it.
Currently it isn't too random, since zero can only appear once in it. (and the
forward-jumping SOB is illegal) An address range check to see if we've gone
past the end of the table would be better.
-Mike
|
531.43 | Another method | ENGINE::ROTH | | Tue Aug 05 1986 14:16 | 55 |
| There are so-called Fibonacci random number generators that operate
using shift registers.
For example, you could generate numbers in the range [0..1499] by
using a 17 entry shift register of integers.
Take the difference between the 5th and 17th entries, add 1500 if
this difference is negative, shift the entries back (dropping the first
entry) and store the result as a new 17'th entry.
I tried a version of this, and even with the uninspired initial seeds
of all 1's, the distribution looked good when plotted.
I'm not certain it can be shown that the generator never falls into
an all-zeros state though... this certainly seems possible.
There is an article in the latest IEEE micro on random number
generators that you may find useful.
- Jim
Here's an untested macro-11 routine that would work. It could be
tightened up a bit.
X:
        .REPT   16
        .WORD   1               ; Nonzero seed numbers
        .WORD   .+2
        .ENDR
        .WORD   1
        .WORD   X

X5:     .WORD   X+<4*4>         ; Points to X[5]
X17:    .WORD   X+<16*4>        ; Points to X[17]
;
; Setup registers for next batch of numbers
;
        MOV     X5,R1
        MOV     X17,R2
        MOV     (R2)+,R0        ; X[17]
        MOV     (R2)+,R2
;
; Gen next random number
;
        SUB     (R1)+,R0        ; X[17]-X[5]
        BGE     10$
        ADD     #1500.,R0       ; Put in range [0..1499]
10$:    MOV     (R1)+,R1        ; Bump pointer to new X[5]
        MOV     R0,(R2)+        ; Store new entry in shift register
        MOV     (R2)+,R2        ; Bump pointer to new X[17]
;
; Save pointers when done
;
        MOV     R1,X5
        MOV     R2,X17
|
531.44 | A random number generator (the source code) | EAGLE1::DANTOWITZ | nothing personal ... | Mon Jul 04 1988 13:32 | 604 |
531.45 | | CLT::GILBERT | | Tue Jul 05 1988 14:43 | 44 |
| MACRO
    Umul64 (A, B, Product) =
    !
    ! FUNCTION:
    !
    !   Multiply two 32-bit unsigned numbers producing a 64-bit result.
    !
    ! ARGUMENTS:
    !
    !   A        (Passed by Value)      Multiplicand
    !   B        (Passed by Value)      Multiplier
    !   Product  (Passed by Reference)  The 64-bit product
    !
    ! NOTES:
    !
    !   The EMUL instruction multiplies *signed* twos-complement numbers.
    !
    !   Let sA and uA be the values of A interpreted as signed and unsigned
    !   numbers, respectively.  Similarly for sB and uB.  We have:
    !
    !       uA = (if sA>=0 then sA else sA+2^32)
    !
    !       sA>=0   sB>=0   uA*uB
    !       -----   -----   -----
    !       yes     yes     uA*uB = sA*sB
    !       yes     no      uA*uB = sA*(sB+2^32) = sA*sB + 2^32*sA
    !       no      yes     uA*uB = (sA+2^32)*sB = sA*sB + 2^32*sB
    !       no      no      (sA+2^32)*(sB+2^32)  = sA*sB + 2^32*sA + 2^32*sB
    !
    !   Thus, we've converted the operations into signed multiplies,
    !   and some additions.  Note that since the result is a quadword,
    !   we need not propagate carries (from the additions) beyond the
    !   second longword.
    !
    BEGIN
    BUILTIN
        EMUL;           ! This multiplies *signed* twos-complement numbers.
    EMUL (%ref(A), %ref(B), %ref(0), Product);
    IF (A) LSS 0 THEN (Product+%UPVAL) = .(Product+%UPVAL) + (B);
    IF (B) LSS 0 THEN (Product+%UPVAL) = .(Product+%UPVAL) + (A);
    END%;
|
531.46 | | CLT::GILBERT | | Tue Jul 05 1988 15:12 | 16 |
| Note 531.44 uses a randomization technique proposed by Don MacLaren
(it appears in Knuth). For its simplicity, it's pretty good.
The routine Random$number(R) returns a random integer X in the range
0 <= X < R, but not all these X values are equally probable. By way
of example, consider 4-bit integers (instead of 32-bits), and R = 5:
         0  1  2  3    =>  X = 0
         4  5  6       =>  X = 1
         7  8  9       =>  X = 2
        10 11 12       =>  X = 3
        13 14 15       =>  X = 4
We see that P(X=0) is 4/16, while P(X=1) is 3/16.  This may not be much
of a problem in practice, but the user of the routine should be aware of it.
|
531.47 | | EAGLE1::DANTOWITZ | nothing personal ... | Mon Jul 11 1988 12:02 | 19 |
| re .-1:
With 4-bit integers the bias is noticeable, but with 32-bits the
difference is very small. Using Random$number (5) to generate 10
million numbers I got the following results:
        random
        number      occurrences

          0           2001906
          1           1999140
          2           2000111
          3           2001285
          4           1997559
This is close enough for simulating horse-shoes!
David
|
531.48 | New and improved ... | EAGLE1::DANTOWITZ | nothing personal ... | Thu Jul 14 1988 18:35 | 466 |
531.49 | This is from memory, but | CHALK::HALLYB | The smart money was on Goliath | Fri Jul 15 1988 10:00 | 7 |
| re: .48
Doesn't Knuth recommend a B_increment near (.211 * M_modulus)?
The choice of 1 seems a bit off...
John
|
531.50 | Nope | PBSVAX::COOPER | Topher Cooper | Fri Jul 15 1988 12:33 | 12 |
| RE: .49 (John)
From Knuth, Vol. II, 2nd edition page 171:
v) The value of c [the increment, Knuth uses b=a-1] is immaterial
when a is a good multiplier, except that c must have no factor
in common with m. Thus we may choose c=1 or c=a.
One is sort of traditional since it is cheap on many machines
to add 1 (including the VAX).
Topher
|
531.51 | Oh, no, I need a new edition! | POOL::HALLYB | The smart money was on Goliath | Wed Feb 22 1989 18:24 | 14 |
| RE: .50 (Topher)
From Knuth, Vol. II, 1st edition page 156:
v) The constant c should be an odd number (when M is a power of 2)
and also not a multiple of 5 (when M is a power of 10). It is
preferable to choose C so that C/M has approximately the value
given in Section 3.3.3, Eq. (41).
This value is roughly .211324
I wonder if some work was done between the two editions to discredit this.
John
|
531.52 | Simple shift register? | TROA09::PIERCE | awk!! I've been greped! | Wed Aug 29 1990 14:16 | 26 |
| I need to generate a very simple pseudo-random integer (8-16 bits, low
"quality", it's to generate an "interesting" display) and I seem to recall a
very simple shift register way of doing this (in hardware), but can't recall
all details:
          +---------------xor---------------+
          |                |                |
          | bit x          | bit y          |
        +-+----------------+--------------+ |
        | | | | | | | | | | | | | | | | | |<+
        +---------------------------------+
             n bit shift register (shifts left)
The key information I am missing is:
How many bits in the register?
Which bits get fed back?
Thanks
|
531.54 | Correction to my reply | ALLVAX::JROTH | It's a bush recording... | Wed Aug 29 1990 20:14 | 41 |
| <<< Note 531.53 by ALLVAX::JROTH "It's a bush recording..." >>>
Oops - I made a mistake in the shift register diagram below!
- Jim
>    <<< Note 531.52 by TROA09::PIERCE "awk!! I've been greped!" >>>
>                    -< Simple shift register? >-
You can base it on a primitive polynomial such as x^17 + x^5 + 1
or x^31 + x^13 + 1.  There are tables of primitive polynomials
in books on coding theory such as Peterson's book.
Think of the polynomial as the characteristic polynomial for a
linear recurrence.
For example, using
x^17 + x^5 + 1
we would have the recurrence
x[n+17] + x[n+5] + x[n] = 0
which leads to
x[n+17] = x[n+5] + x[n] (all modulo 2 of course)
So
    +-----------------------[xor]<-------------------------+
    |                         |                             |
    |                         |                             |
    +->[16]--[15]-- .. --[5]--+--[4]--[3]--[2]--[1]--[0]----+----> out
        ^^^^
    (I incorrectly labeled this as [17] - it should be 1 less than the
    polynomial degree)
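For what it's worth, the recurrence codes up in a few lines of C (a
sketch, shifting right rather than left; bit k of the state holds
x[n+k], and the seed must be a nonzero 17-bit value):

    /* One step of x[n+17] = x[n+5] + x[n] (mod 2), i.e. x^17 + x^5 + 1. */
    unsigned lfsr17_step(unsigned long *state)
    {
        unsigned long out = *state & 1ul;                   /* x[n]    */
        unsigned long nb  = (out ^ (*state >> 5)) & 1ul;    /* x[n+17] */

        *state = (*state >> 1) | (nb << 16);
        return (unsigned)out;
    }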
- Jim
|
531.55 | OK... | TROA01::PIERCE | awk!! I've been greped! | Mon Sep 03 1990 13:08 | 10 |
| Thanks, the math terminology is over my head, but let me see if I understand
the "hardware" correctly:
it is a 17 bit shift register
it shifts from left to right
the MSB is bit 0
the LSB is bit 16
the input to the LSB is the MSB.xor.Bit5
Eric
|
531.56 | | RICKS::AKHIANI | | Thu Nov 08 1990 09:25 | 45 |
| The following routine generates 5-bit sequences using a primitive polynomial.
Sources:
Numerical Recipes in C
The Art of Computer Programming: Seminumerical Algorithms D.E. Knuth
The first book contains 2 C procedures; the second gets to it mathematically
and has a table of coefficients of primitive polynomials.
/homayoon
P.S.  I am still amazed by it.  I use it as an EXHAUSTIVE sort of RANDOM generator.
================================================================================
#include <stdio.h>

int irbit2();

main()
{
    unsigned long x;
    int i;

    printf("Enter initial: ");
    scanf("%lu", &x);
    for (i = 0; i <= 30; i++)
    {
        irbit2(&x);
        x &= 0x1F;
        printf("%2lu ", x);
    }
}

#define IB1  1
#define IB2  2
#define IB5  16
#define IB18 131072
#define LAST IB5
#define MASK IB2

int irbit2(iseed)
unsigned long *iseed;
{
    if (*iseed & LAST)
    {
        *iseed = ((*iseed ^ MASK) << 1) | IB1;
        return 1;
    }
    else
    {
        *iseed <<= 1;
        return 0;
    }
}
|
531.57 | Fast RNG with period 2^249 | VMSDEV::HALLYB | Fish have no concept of fire | Wed Jan 15 1992 15:22 | 89 |
| I have included below a random number generator along the lines of
those discussed in .54ff. This generator, 10 years old now, is called
"r250". It has a period of 2^249 (or so) and is, if anything, faster
than the linear congruential method. (Thanks to EDP for pointing out a
stupid-but-nonfatal error in the previous posting of this routine.)
Conceptually it uses 32 (BITS) parallel 250-bit shift registers that each
XORs the 103rd-back bit and low bit. Or something like that. You see,
my knowledge of shift registers and binary polynomials evaporated shortly
after the day I took my final in Algebraic Coding Theory (in 1971).
Good luck,
John
/****************************************************************************
 * Module:      r250.c
 * Description: Implements "R250" random number generator
 * Reference:   Dr. Dobb's Journal, May, 1991 p. 152
 * *Reference:  S. Kirkpatrick and E. Stoll, Journal of Computational Physics
 *              40, p. 517 (1981)
 * Written-by:  W. L. Maier
 * Hacked-by:   John C. Hallyburton, Jr.
 ****************************************************************************/
#include <stdio.h>
#include <stdlib.h>

/* Should also work with BITS 16, but who knows */
#define BITS 32

static unsigned int r250_buffer[250], r250_index;

void r250_init(int seed);
unsigned int r250(void);

/******* r250_init: initializes random number generator to argument *******/
/******* side effect: changes rand() sequence                       *******/
void r250_init(int seed)
{
    int j;
    unsigned int mask, bit = 1 << (BITS-1);

    srand(seed);
    r250_index = 0;

    /* Initialize array.  Half the time turn on the high bit.  (Make UNSIGNED) */
    for (j = 0; j < 250; j++)
    {
        r250_buffer[j] = rand();
        if (rand() & 0x4000) r250_buffer[j] |= bit;
    }

    /* Ensure the array has a basis.  This assures maximum period, at the   */
    /* cost of setting a few bits to known values.  You could avoid that by */
    /* verifying there are BITS linearly-independent elements out of 250,   */
    /* something very likely, but the bonehead approach below seems to work.*/
    for (j = 0, mask = -1; j < BITS; j++, bit >>= 1, mask >>= 1)
        r250_buffer[j] = (r250_buffer[j] & mask) | bit;
}

/************* r250() -- returns random unsigned integer ***************/
unsigned int r250(void)
{
    int j;
    unsigned int new_rand;

    if (r250_index >= 147) j = r250_index - 147; else j = r250_index + 103;
    new_rand = r250_buffer[r250_index] ^ r250_buffer[j];
    r250_buffer[r250_index] = new_rand;
    if (r250_index >= 249) r250_index = 0; else r250_index++;
    return new_rand;
}

int main(void)
{
    int j;

    r250_init(42);  /* BE SURE TO CALL THIS OR YOU WILL GET ALL ZEROES */
    for (j = 0; j < 33; j++) printf("\n%12u %12u %12u %12u %12u %12u",
        r250(), r250(), r250(), r250(), r250(), r250());
    return 0;
}
|
531.58 | period is 2^250-1 | ALLVAX::JROTH | I know he moves along the piers | Wed Jan 15 1992 21:17 | 33 |
| Actually it appears that the program in .57 is based on the primitive
polynomial
x^250 + x^103 + 1 (or its mirror image, x^250 + x^147 + 1).
These are the only primitive trinomials of degree 250, and would
lead to a period of 2^250-1.  (I have an inefficient program and
checked.)
However, a better way of making random numbers would be to do
addition modulo a large number (such as 2^32) instead of an XOR.
You could even do addition mod 1 in floating point, and make H format
random numbers!
This would lead to a considerably longer period (as if that matters!)
and at least according to Knuth such generators have good empirical
properties though the theory is not as well understood as the classical
multiplicative congruential generators we all know.
If you choose a Mersenne exponent it simplifies checking for primitivity,
and a few such trinomials that would also give maximal period would be
x^521 + x^k + 1, k = 32, 48, 158, 168, 353, 363, 473, and 489
x^607 + x^k + 1, k = 105, 147, 273, 334, 460, and 502.
There's a list of these in some paper to absolutely absurdly high
powers. I don't know how they test for these, since the only way
I know of doing it has cubic behaviour. Is there a fast (FFT style)
polynomial multiplication for binary polynomials like there is for
integers?
- Jim
|
531.59 | SET VOICE = Andy_Rooney | VMSDEV::HALLYB | Fish have no concept of fire | Fri Jan 17 1992 12:45 | 11 |
| > Actually it appears that the program in .57 is based on the primitive
> polynomial
>
> x^250 + x^103 + 1 (or its mirror image, x^250 + x^147 + 1).
One thing that always bothered me is the way the new random number is
constructed using only 2 history terms, while the above polynomials have 3.
Why is that?
John
|
531.60 | I'm no expert, but this is how it seems to work | ALLVAX::JROTH | I know he moves along the piers | Fri Jan 17 1992 17:37 | 105 |
| >> Actually it appears that the program in .57 is based on the primitive
>> polynomial
>>
>> x^250 + x^103 + 1 (or its mirror image, x^250 + x^147 + 1).
> One thing that always bothered me is the way the new random number is
> constructed using only 2 history terms, while the above polynomials have 3.
> Why is that?
Actually I believe it's better to look at a linear relation satisfied by
the terms - then there are the same number of terms as there are
coefficients in the polynomial.
Assume that
x[i] + x[i-k] + x[i-n] = 0
and assume further that an exponential eigenfunction satisfies the
recurrence,
x[i] = r^i for some nonzero root, r
Then
r^i + r^(i-k) + r^(i-n) = 0
and
r^n + r^(n-k) + 1 = 0
Another way to see this is to write the matrix for the linear
transformation of the components of the shift register from one step
to the next. The characteristic polynomial for the matrix comes out
the same - essentially because the algebraic structures are isomorphic
in the end.
Here's a simple C routine that'll do an additive generator of this
type, for some (absurdly?) high degree Mersenne power polynomials.
You have to run it millions of times for the initial transient to die
away leaving pseudorandom behaviour for the "bad" initial value
I set the generator to, but it does work, returning 32 bit unsigned
random numbers.
I'm surprised that no one has devised a "spectral" test for such a
generator but even Knuth doesn't have anything definitive.
If you had to make bufferfuls of randoms, you could dovetail the
pointer wrapping and the buffer fenceposts to only have to test one
limit at a time in some inline code, so this could be perhaps the
fastest way yet; your buffer could contain the shift register...
- Jim
/*
* Portable random number generator
*
* Generate a random integer in the range [1..2^32-1] using additive
* generator based on prime trinomials x^n + x^k + 1 mod 2
*
* Examples based on Mersenne prime exponents:
*
* x^521 + x^k + 1, k = 32, 48, 158, 168, 353, 363, 473, 489
 * x^607 + x^k + 1, k = 105, 147, 273, 334, 460, 502
* x^1279 + x^k + 1, k = 216, 418, 861, 1063
*/
#define N 1279
#define K 216

static unsigned long seed[N];
static int index_1, index_2;

/*
 * Return the next random number
 */
unsigned long randsr()
{
    index_1--;
    if (index_1 < 0) index_1 += N;
    index_2 = index_1 - K;
    if (index_2 < 0) index_2 += N;
    seed[index_1] += seed[index_2];
    return seed[index_1];
}

/*
 * Reinitialize the shift register
 */
void init_rand(new_seed)
unsigned long new_seed;
{
    int i;

    for (i = 0; i < N; i++) seed[i] = 0;
    seed[N-K] = new_seed | 1;
    index_1 = 1;
}
|
531.61 | | RUSURE::EDP | Always mount a scratch monkey. | Tue Dec 29 1992 10:55 | 8 |
| Notes .44 and .48, which contain Digital-confidential information, have
been hidden just so that they are not accidentally included in mass
extracts of the conference. They contain a pseudo-random-number
generator written in Bliss and are available to any Digital employee;
just ask the moderator.
-- edp
|
531.62 | Nice generator | VMSDEV::HALLYB | Fish have no concept of fire | Tue Jan 25 1994 14:28 | 16 |
| .60> You have to run it millions of times for the initial transient to die
> away leaving pseudorandom behaviour for the "bad" initial value
> I set the generator to, but it does work, returning 32 bit unsigned
> random numbers.
Hmmm.
For the N and K used I found it needed about 2 million iterations to
get rid of the high density of zeroes. Even then it turns out that 95%
of the numbers are, ummm, even. After 4 million iterations still 90% of
the "random" numbers are even.
Is there a better way to initialize the array so fewer throwaways are
needed? Like maybe alternating 0s and 1s instead of almost all 0s?
John
|
531.63 | | 3D::ROTH | Geometry is the real life! | Wed Jan 26 1994 09:22 | 24 |
| Which N and K did you use?
I do recall running it for a long time to get good random behaviour,
partly to convince myself that it would eventually get random...
There is a paper by Marsaglia on these generators that also
predicts the exact cycle length based on the polynomial and the
word length.
I'm sure that initializing the array with "random" values will
be better, but how long it takes for the pathologically bad
starting conditions I used (all zero, except for a one) is a good
test of how long it takes for the mixing to be thorough enough
to erase all starting state.
I'll try and find the paper since I recall asking the library for
it once.
Another thought is that you can view the iteration as a matrix
multiply - hence to get very high order iteration counts, one
could multiply matrices by successive squaring faster than running
the iteration itself. Never bothered myself though.
- Jim
|
531.64 | Ah, I see | VMSDEV::HALLYB | Fish have no concept of fire | Wed Jan 26 1994 13:37 | 20 |
| I used the N and K that were defined in the example: 1279 and 216.
I ran some tests to see how long it would take to go from the
pathological starting state to one where there were about as many
even numbers generated as odd numbers. In order to be reasonably
confident, it took 500 million iterations or just over 2½ hours
of CPU time on my 6400 class CPU. That makes it tough to use the
generator for quick simulations.
If it's OK to initialize the array with "random" numbers then I'll make
that change. I was afraid the "pathological" case was a necessity of
the algorithm, or perhaps the simplest initialization technique that
satisfied some unspecified constraints.
Perhaps a truly paranoid approach would be to modify randsr() to return
the builtin rand() for the first 500 million requests while continuing
to generate (but not return) the array elements. After that then stop
calling rand() and return seed[index_1] as is done currently.
John
|
531.65 | How safe is random seeding. | CADSYS::COOPER | Topher Cooper | Thu Jan 27 1994 11:56 | 82 |
| First off, a couple of modifications, without functional significance,
to make the next step a bit easier.
First, view the numbers in the array as fixed-point values between 0
and 1.
Second, generate the numbers in "batches." Take the numbers in the
array and do all N iterations -- replacing the values in the vector
as before, but not "emitting" any.  Then at each request, emit a number
from the vector.  At the N+1st request, repeat the process.
Now we replace this generator with a similar one. This one is
functionally quite different, but behaves somewhat similarly and is
easier to analyze in some respects. (But we do have to watch out for
the "gotcha's" at extremes -- a back of the envelope calculation based
on this model predicts that the extreme case of an initial single 1
will "regularize" after only about 53,000 generations when N=1279).
The new generator uses two vectors -- one of which is active. At a
given time, the active vector contains some set of values which follow
some distribution (limited to the range 0..1). When it is time to
produce a new batch of values, we select two elements from the active
vector at random (I didn't say this was a useful method), add them, take
the fractional part of the sum, and place the result in the first
element of the non-active vector. Repeat for each of the elements of
the non-active vector. Upon completion "swap" the active and inactive
vectors.
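Here is a quick C simulation of that two-vector model, watching the mean
recover from the "single 1" seeding (rand() is only a stand-in for the
random element selection):

    #include <stdio.h>
    #include <stdlib.h>
    #include <math.h>

    #define N 1279

    int main(void)
    {
        static double v[2][N];          /* zero-initialized */
        int act = 0, g, i;

        v[act][0] = pow(2.0, -32.0);    /* the "single 1" seeding */
        for (g = 1; g <= 50; g++) {
            double mean = 0.0;
            for (i = 0; i < N; i++) {   /* fill the inactive vector */
                double s = v[act][rand() % N] + v[act][rand() % N];
                v[1 - act][i] = s - floor(s);   /* mod 1 */
            }
            act = 1 - act;              /* swap active and inactive */
            for (i = 0; i < N; i++) mean += v[act][i];
            printf("generation %2d: mean %.6f\n", g, mean / N);
        }
        return 0;
    }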
If you play with this generator for a while, you will find that if the
values in the vector are uniformly distributed in one generation, the
expectation is that they will be uniform in the next. Furthermore,
if the vector is non-uniform and not completely degenerate (i.e., all
the same value) the next generation will tend to be closer to uniform.
Obviously, since the numbers generated are the numbers in the vector,
uniform numbers in the vector means uniform numbers generated. The
"bad" behavior of the extreme seeding can be seen to be the consequence
of its extreme non-uniformity, such that not only is that generation
bad, but the next can only go a little way towards being uniform.
(Keep track of the mean, to see this).
So how likely is a "randomly" chosen initial set of seed values to be
too far from "uniform"?  As an indicator let's look at the mean.
If the mean of the values in the vector are 1/4 (instead of the
"proper" 1/2), then, ignoring the wraparound, the mean in the next
generation (i.e., N values later) will be 1/2. Playing around shows
a similar result for a mean of 3/4. So it will take a generation or
two to make an otherwise reasonable distribution in the vector, with
a mean 1/4 off the ideal, to start acting uniformly. We can therefore
reasonably set this as a ball-park limit for "good enough".
How likely is a randomly selected initial vector to deviate by more than
1/4 from the ideal mean of 1/2?
The mean of N values drawn independently from a distribution with mean
m and variance v, is approximately normally distributed with a mean
also of m, and a variance of v/N. By randomly in this case, we
presumably mean drawn from a uniform distribution, so the variance
of the result around 1/2 will be 1/(12N). The standard deviation
is sqrt(1/(12N)).
We then need to know how many standard deviations 1/4 is from 1/2, and
what the probability of a normal value at least that many standard
deviations away from the mean is. That probability is then doubled,
to take into account deviation in either direction.
The result is that 1/4 is roughly 31 standard deviations from the
mean and the probability of that extreme a deviation, in either
direction is roughly 6E-211.
If you seed well, you are not very likely to run into this problem, I
would say.
Of course there are other deviations -- especially in the variance --
which could cause problems, but they all have similar extreme
statistics.
Reasonable random seeding is unlikely to produce a bad "tail" which
needs more than a generation or two of cranking to eat up.
Topher
|
531.66 | My generation is more uniform than yours, daddy | VMSDEV::HALLYB | Fish have no concept of fire | Thu Jan 27 1994 16:08 | 33 |
| I think I follow the argument. One doesn't really need to "batch"
things, that just explains the process better. You can think of each
single standard iteration as a small step in the longer process.
Some good thoughts and comforting results. Thanks.
The argument that each "iteration" gets you closer to the mean would
seem to imply that there is a danger of converging to some sort of
orbit, i.e.:
> if the vector is non-uniform and not completely degenerate (i.e., all
> the same value) the next generation will tend to be closer to uniform.
... but will it ever turn out to be useless? In particular, could
generation I+1 turn out to be a permutation of generation I? Perhaps
cycling after some period? I don't think this problem is tractable
for large N.
> -- a back of the envelope calculation based
> on this models predicts that the extreme case of an initial single 1
> will "regularize" after only about 53,000 generations when N=1279).
Which appears to correspond to 53000 * 1279 ~= 68M single iterations.
This generates well-distributed fixed [0,1) numbers but 3/4ths of them
will have a 0 low bit¹.  Suitable in some but not all environments.
Presumably ensuring you start with approximately half your vector odd
will eliminate this problem. It took me almost 10x as long to evenly
distribute the low bit, starting from this worst-case setup.
John
-------------------
¹At least according to my sample of size 198 after 64M iterations,
where only 52 numbers were odd.
|
531.67 | | 3D::ROTH | Geometry is the real life! | Fri Jan 28 1994 08:45 | 33 |
| > The argument that each "iteration" gets you closer to the mean would
> seem to imply that there is a danger of converging to some sort of
> orbit, i.e.:
But a close approximation to the period for an (n,k) generator with
m bit words is (2^n - 1) * 2^(m-1), though the exponent (m-1) may
be less for some generators but typically not very much less.
Note that you can do the iterations mod 1 using floating point
numbers directly (!) and this is how Knuth suggested doing it,
where he used an (n,k) = (55,24) generator.  At least one of the numbers
in the array has to have its least significant bit on to ensure
you get the 2^(m-1) factor above...
> Which appears to correspond to 53000 * 1279 ~= 68M single iterations.
> This generates well-distributed fixed [0,1) numbers but 3/4ths of them
> will have a 0 low bit�. Suitable in some but not all environments.
> Presumably ensuring you start with approximately half your vector odd
> will eliminate this problem. It took me almost 10x as long to evenly
> distribute the low bit, starting from this worst-case setup.
The least significant bits form, by themselves, an (n,k) mod 2 shift
register sequence, and will be well distributed once you are into the
sequence to any degree.  So I'm surprised that you have seen that many zero
low bits... this bears a little more experimentation since it seems
counterintuitive.
Of course, you could in principle get very high order iterates
by squaring the companion matrix for the iteration a sufficient
number of times - that's about an n*n process for each squaring,
but after 30 of these, you have done around a billion iterates...
- Jim
|
531.68 | floating point isn't always the way to go | VMSDEV::HALLYB | Fish have no concept of fire | Fri Jan 28 1994 12:05 | 21 |
| > Note that you can do the iterations mod 1 using floating point
> numbers directly (!) and this is how Knuth suggested doing it,
It seems to me that if you "seed" your array improperly (randomly?)
you might run into problems with some floating point architectures,
e.g., generating ill-formed floating point numbers.
Another problem is that some chips don't do floating point and end up
trapping or calling a simulation routine. So much for speed. Also in
some situations you want to avoid doing floating point because that may
require saving and restoring floating point registers.
Thus it seems that if you want a fast, highly portable, runs-well-under-
any-circumstance, minimal ramp-up, huge-period random number generator
you are better off with an integer version that is initially seeded via
an algorithm yet to be determined, but which could simply be rand() or
possibly some unspecified deterministic seeding process.
Or have I missed something?
John
|
531.69 | Ambiguous usage. | CADSYS::COOPER | Topher Cooper | Fri Jan 28 1994 12:23 | 45 |
| RE: .66 (John)
>> -- a back of the envelope calculation based
>> on this models predicts that the extreme case of an initial single 1
>> will "regularize" after only about 53,000 generations when N=1279).
>
> Which appears to correspond to 53000 * 1279 ~= 68M single iterations.
Whoops, I'm guilty of carefully using the same word to mean two
different things. Elsewhere I used "generation" to mean N iterations,
but here I meant it to mean "the generation of a single random number",
i.e., a "single iteration".
For what it is worth, here is the calculation. The mean of the sum of
samples from two (possibly identical) independent distributions is the
sum of their means (assuming only that the mean is defined). The
mean of each full generation (i.e., N iterations), is, neglecting the
"wrap-around" caused by the modulo, twice the mean of the previous
generation. In this case, through most of the generations, the wrap-
around will not occur because most of the numbers will be too small,
so I am justified in a rough estimate in ignoring it. If m0 is the
initial mean, and the "correct" mean is 1/2, the mean will get where
it is supposed to be after g generations where g is solved using the
relationship:

        1/2 = 2^g * m0

or

        g = lg(1/m0) - 1.
The single low-order one is 2^(-32) in the fixed-point interpretation.
The mean is that plus N-1 (1278) zeros divided by 1279 so 1/m0 equals
1279*2^32, from which we get 41 full generations, which multiplying
by N gives the cited result.
Similar calculations for the variance indicate that 71 generations
(approx 91,000 iterations) are needed to restore the correct variance,
neglecting the wrap-around.  But after the mean gets to be around
1/2, the wrap-around kicks in, reducing the variance and slowing the
"climb" towards the correct value.  Perhaps this explains part or all
of the failure of the model to match reality.
Topher
|