T.R | Title | User | Personal Name | Date | Lines |
---|---|---|---|---|---|
664.1 | | CLT::GILBERT | eager like a child | Mon Feb 09 1987 14:32 | 16 |
| I've always liked the following approach.
As in .0, let ran(m) return a random integer (with a uniform distribution)
in the range 0..m-1. Let swap(x,y) exchange the values of x and y.
! Algorithm to perform an in-place shuffle of an n-element array,
! x[1..n].
!
for i := n downto 1 do
swap (x[i], x[ran(i+1)]);
Note that care must be used in creating the ran(m) function -- it's
tempting to use a linear-congruential random number generator, and just
take that integer modulo m. Instead, you should convert the number to a
floating-point number in the range [0..1), multiply by m, and take the
integer part of the product. This has been discussed elsewhere.
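A sketch of this approach in Python (my own rendering: 0-based lists, with `random.random()` standing in for the floating-point conversion described above, so position i draws uniformly from positions 0..i):

```python
import random

def ran(m):
    # As suggested above: scale a float in [0, 1) up by m and truncate,
    # rather than taking a linear-congruential output modulo m (whose
    # low-order bits are often far from random).
    return int(random.random() * m)

def shuffle(x):
    # In-place shuffle of the list x.  With 0-based indexing, step i
    # swaps x[i] with a uniformly chosen element of the prefix x[0..i].
    for i in range(len(x) - 1, -1, -1):
        j = ran(i + 1)          # j is uniform in 0..i
        x[i], x[j] = x[j], x[i]
    return x
```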
|
664.2 | Peter, you're a card! | SQM::HALLYB | Are all the good ones taken? | Mon Feb 09 1987 15:01 | 7 |
| Any reason why you use ran(i+1) instead of ran(n+1) in the .1 algorithm?
It seems to me that restricting the target range would somehow bias
the deck -- e.g., card 1 has two chances to swap for card 51 but
only one chance to swap for card 52. But this is just off the top
of my head, it may all work out...
"double-down" John
|
664.3 | Ante Up! | COMET::ROBERTS | Dwayne Roberts | Mon Feb 09 1987 15:57 | 20 |
| (To borrow and modify Peter's example in .1:)
! Algorithm to perform an in-place shuffle of an n-element array,
! x[1..n].
!
for i := 1 to n do
swap (x[i], x[1+ran(n)]);
RE .2: A randomly shuffled deck of cards is not made more random by
re-shuffling it, so it shouldn't make any difference if a card is
swapped once or a thousand times.
Note that the APL language has a built-in dyadic operator "?"
(pronounced "deal") that does a shuffle. For example, the statement
DECK _ 52 ? 52
would create a 52-element array named "DECK" with the integers from
1 to 52 "randomly" distributed within it.
|
664.4 | Shuffle by sorting | JON::MORONEY | Legalize Liberty | Mon Feb 09 1987 17:15 | 13 |
| One shuffling algorithm I used follows. Create 2 arrays, A[N] and B[N].
N is the size of the "deck".
Step 1: Fill array A with integers from 1 to N. Doesn't matter if they're
in order.
Step 2: Fill array B with random numbers, using your favorite random number
generator that returns a real x such that 0 <= x < 1.0.
Step 3: Using your favorite sorting algorithm with 1 modification, sort array
B. The modification is, whenever you swap 2 numbers B[I] and B[J],
you also swap A[I] and A[J].
Step 4: When the numbers in B are sorted, those in A are shuffled.
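A sketch of the four steps in Python (my pairing of key and card stands in for the "modified sort" of steps 3 and 4 -- sorting the keys carries the cards along automatically):

```python
import random

def shuffle_by_sorting(n):
    a = list(range(1, n + 1))           # Step 1: the deck, 1..N
    b = [random.random() for _ in a]    # Step 2: one key in [0, 1) per card
    # Steps 3-4: sort by key, carrying the cards along; Python's built-in
    # sort plays the role of "your favorite sorting algorithm".
    return [card for _, card in sorted(zip(b, a))]
```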
-Mike
|
664.5 | | BEING::POSTPISCHIL | Always mount a scratch monkey. | Mon Feb 09 1987 19:26 | 12 |
| The method presented in .0 does not give all permutations an equal
chance of being selected. For example, the second number you move
has a 2/n chance of going immediately after the first one, when
this chance should be 1/(n-1).
The method in .4 requires unnecessary time and space, since the method
Gilbert presented in .1 achieves the same effect quickly and gives an
equal probability to each permutation, subject to the precision of the
random-number operation.
-- edp
|
664.6 | Skinning cats | CLT::GILBERT | eager like a child | Mon Feb 09 1987 21:35 | 25 |
| There was a typo in .1 -- it should read:
for i := n downto 1 do swap (x[i], x[1+ran(i)]);
I find this easiest to prove correct. If the loop order is changed:
for i := 1 to n do swap (x[i], x[1+ran(i)]);
it's still correct, and you need not know n before processing. Dwayne's
algorithm in .3:
for i := 1 to n do swap (x[i], x[1+ran(n)]);
may well produce a better shuffle for standard (linear-congruential based)
random number generators. I've used Mike's algorithm in .4 when shuffling
large data files (prepend random numbers, and feed them to a sort utility).
Mike's note "Shuffling by Sorting" reminds me of a sort algorithm proposed
by Phil Emma -- "Sorting by Shuffling": Check whether the array is in order,
and if not, swap two randomly selected elements, and repeat (an early version
of the algorithm swapped the two elements only if they were out of order, but
that bordered on the practical). Out of curiosity (and because we're getting
good at calculating expectations), what's the expected number of swaps to sort
three elements? How about 4 elements?
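For anyone who wants to experiment with the expectation question, a small Monte Carlo sketch in Python (whether the two chosen positions must be distinct is my assumption, not part of the algorithm as stated):

```python
import random

def swaps_to_sort(x):
    # "Sorting by shuffling": swap two randomly chosen distinct
    # positions until the array happens to be in order; return the
    # number of swaps it took.
    x = list(x)
    n = 0
    while x != sorted(x):
        i, j = random.sample(range(len(x)), 2)
        x[i], x[j] = x[j], x[i]
        n += 1
    return n

def average_swaps(x, trials=10000):
    # Monte Carlo estimate of the expected number of swaps.
    return sum(swaps_to_sort(x) for _ in range(trials)) / trials
```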
|
664.7 | Looks random, but... | CHOVAX::YOUNG | Back from the Shadows Again, | Mon Feb 09 1987 21:42 | 19 |
| It is easy to demonstrate that the method listed in .0 is not 'fair'.
This kind of 'lookaside' shuffling tends to result in clumps of
cards early on. Thus cards that are near each other in the beginning
and of the same residue modulo 2 tend to be near each other when
done.
The method mentioned in .1 was for years my preferred method of
doing shuffles, and all that time I believed it to be fair. Then
while trying to solve a related problem, I ran several tests to
'prove' that it was fair. Much to my chagrin, very clear patterns
in the redistribution showed up. I would now say that it is only
marginally fairer than the .0 method.
The method mentioned in .4 is new to me, and has great appeal.
It is clearly a fair shuffle, and I would bet that it has optimal
performance for any fair shuffle. Very clever, this is definitely
the method that I plan to use in the future.
-- Barry
|
664.8 | Efficiency Counts | COMET::ROBERTS | Dwayne Roberts | Mon Feb 09 1987 22:01 | 6 |
| .4 should shuffle very well; however, it is only as efficient as the
sort used. What's the fastest sort? O(n log n)?
The methods that Peter and I advance are _linear_. There could be a
significant difference with large values of n.
|
664.9 | random thoughts... | CHOVAX::YOUNG | Back from the Shadows Again, | Mon Feb 09 1987 22:54 | 23 |
| Re .8:
I would maintain that getting a correct answer was more important
than getting an incorrect one efficiently.
As for the efficiency of sorting, it is usually considered to be
O(N*log(N)), though there are some radix sorts that are arguably
O(N).
It has just occurred to me, though, that a fair algorithm with almost
O(N) time could be designed by taking the original algorithm in
.0 and using the more standard separate lookaside lists, instead
of the simpler 'in-deck' lookaside algorithm. This involves building
a separate list for every 1..N. Then scan up the main array, adding
lists with a single element to the output list, skipping those with
no elements, and sorting (.4) those with more than one element.
This is a fair algorithm, and runs in average time:
O(N + N + (N^.5)*log(N^.5)) --> O(N)
How's that? (I learn something new every day in this conference 8^)
-- Barry
|
664.10 | | BEING::POSTPISCHIL | Always mount a scratch monkey. | Tue Feb 10 1987 08:54 | 13 |
| Re .7:
I would like to know more about the problems you found with the method
Gilbert presented in .1. Given a truly random random-number generator,
Gilbert's procedure gives each possible permutation an exactly equal
chance of occurring.
The only chance for failure should be in the random-number generator.
What did you use, and what sensitivities do the various methods
presented in this topic have to the failings of common generators?
-- edp
|
664.11 | .1 visits small subset of permutations | MATRIX::ROTH | | Tue Feb 10 1987 12:28 | 12 |
| I agree with .10 in principle, that each permutation is equally
likely. However, in practice there are usually vastly fewer random number
states available than permutations. For example, shuffling 52 cards
there are around 8.066E+67 permutations, but with a 32-bit state
word there are only 4.295E+09 random numbers. So you can only
visit 1.0/(1.878E+58) of the possible permutations. Given that
multiplicative congruential generators are notorious for falling
on hyperplanes and so on, such use of the generator may cause problems.
Just a guess...
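The counting is easy to reproduce exactly, e.g. with Python's arbitrary-precision integers:

```python
import math

perms = math.factorial(52)      # orderings of a 52-card deck, ~8.066e67
states = 2 ** 32                # distinct values of a 32-bit state word
ratio = perms // states         # ~1.878e58 permutations per RNG state
bits = math.log2(perms)        # ~225.6 bits of state needed to reach all
```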
- Jim
|
664.12 | | MATRIX::ROTH | | Tue Feb 10 1987 12:31 | 5 |
| Of course the comments in .11 apply equally well to the idea of
sorting an array of random numbers, but there the 'bad behaviour'
of the generator may be less of a problem.
- Jim
|
664.13 | | CHOVAX::YOUNG | Back from the Shadows Again, | Tue Feb 10 1987 13:26 | 19 |
| Re .10:
I may be dense, but it is not apparent to me that Peter's algorithm
"gives each possible permutation an exactly equal chance of occurring."
If you could (or anyone else), would you post an argument or proof
of this?
My observations had nothing to do with the validity of the random
number generators. I came to my conclusions by running some simple
tests that measured the redistribution results. These tests used
several different random-number generators, and most came from Knuth
and had very good figures from the spectral test. I will try to
recreate these tests, and then submit the results here.
While the end result of Peter's method may indeed be subtle, that
does not mean that it is fair. Many subtle random-number generators,
for instance, turned out to have poor distributions.
-- Barry
|
664.14 | | SQM::HALLYB | Are all the good ones taken? | Tue Feb 10 1987 14:42 | 23 |
| > I may be dense, but it is not apparent to me that Peter's algorithm
> "gives each possible permutation an exactly equal chance of occurring."
> If you could (or anyone else), would you post an argument, or proof
One simple argument goes like this, using a 52-card deck as an example.
Let J represent any card in the deck. Clearly card J has a 1/52 chance
of ending up in position 52, since that would occur only at the outset
(n=52) and once placed there the card doesn't move. (Peter's algorithm)
Now we ask ourselves the chance that card J has of ending up in
position 51. For this to happen (a) J may not be card 52, and
(b) J must be selected for position 51. Recall from probability that
P(A & B) = P(A) * P(B|A). Here, P(A) = 51/52, clearly the probability
that card J is _not_ card 52, and P(B|A) = 1/51, since no matter where J
is in the remainder of the deck, it has an even chance of being
selected therefrom. Since there are 51 cards to choose from we get 1/51.
Multiplying those together gives us the probability 1/52 that J ends up
as card 51. Proceed inductively to demonstrate that all positions
have an even chance of ending up with card J. This easily generalizes
to decks of other than 52 cards. The algorithm appears to be as
reliable as the random number generator.
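The inductive argument can also be checked by brute force for a tiny deck. This sketch (Python, 0-based, my own construction) feeds the .1 loop every possible sequence of "random" draws for n = 3; there are 3*2*1 = 6 such sequences, and each one yields a different one of the 6 permutations:

```python
from itertools import product
from collections import Counter

def shuffle_with_draws(n, draws):
    # Run the .1 loop, but read the "random" values from a fixed tuple:
    # draws[k] plays the role of ran(i) at step i = n, n-1, ..., 1.
    x = list(range(1, n + 1))
    for i, j in zip(range(n - 1, -1, -1), draws):
        x[i], x[j] = x[j], x[i]
    return tuple(x)

n = 3
counts = Counter(
    shuffle_with_draws(n, d)
    for d in product(*[range(i + 1) for i in range(n - 1, -1, -1)])
)
# Each of the 6 permutations appears exactly once among the 6 equally
# likely draw sequences -- the shuffle is uniform for n = 3.
```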
John
|
664.15 | Just wiggle a little and new ones appear | MODEL::YARBROUGH | | Wed Feb 11 1987 09:08 | 14 |
| > ... each permutation is equally
> likely. However, in practice there are usually vastly fewer random number
> states available than permutations. For example, shuffling 52 cards
> there are around 8.066E+67 permutations, but with a 32-bit state
> word there are only 4.295E+09 random numbers. So you can only
> visit 1.0/(1.878E+58) of the possible permutations. Given that
> multiplicative congruential generators are notorious for falling
> on hyperplanes and so on, such use of the generator may cause problems.
>
You can get more permutations by NOT starting from the same original 'deck'
of numbers each time, but retaining the current permutation as the starter
for the next. There are any number of simple transformations of the
original 'deck' that will produce a different sequence, with a very high
probability that any two such sequences have no common permutations.
|
664.16 | Oopps | CHOVAX::YOUNG | Back from the Shadows Again, | Wed Feb 11 1987 12:51 | 19 |
| Re .14:
Well John, you have shown me the error of my obstinate ways. I
can now see that Peter's method using the expression:
Swap (Card[i], Card[1 + ran(i)]);
is indeed fair. I was misled because of its similarity to the method
that I had been using, which was:
Swap (Card[i], Card[1 + ran(n)]);
which is NOT fair. Thus Peter's method does seem to be the fastest
and simplest fair method.
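The same brute-force enumeration shows the asymmetry. For n = 3, the swap-with-ran(n) loop has 3^3 = 27 equally likely draw sequences spread over 3! = 6 permutations, and since 27 is not a multiple of 6, the counts cannot come out equal (a sketch in Python, 0-based):

```python
from itertools import product
from collections import Counter

def naive_shuffle(n, draws):
    # for i := 1 to n do swap(x[i], x[1 + ran(n)]) -- with the "random"
    # draws fixed in advance so every outcome can be enumerated.
    x = list(range(1, n + 1))
    for i, j in zip(range(n), draws):
        x[i], x[j] = x[j], x[i]
    return tuple(x)

n = 3
counts = Counter(naive_shuffle(n, d) for d in product(range(n), repeat=n))
# 27 equally likely outcomes over 6 permutations: the counts are
# necessarily uneven, so the shuffle is biased.
```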
Like I said, I learn something new every day ... 8^)
-- Barry
|
664.17 | | CLT::GILBERT | eager like a child | Wed Feb 11 1987 17:54 | 8 |
| Re .16:
> I was misled because of its similarity to the method
> that I had been using, which was:
> Swap (Card[i], Card[1 + ran(n)]);
> which is NOT fair.
Huh? It looks fair to me.
|
664.18 | I want 52! possibilites, not just 2^32 | VIDEO::OSMAN | and silos to fill before I feep, and silos to fill before I feep | Wed Feb 11 1987 23:17 | 14 |
| Dot something or other raised an interesting point: that there are "only"
2^32 maximum random numbers available from MTH$RANDOM (are there 2^32,
or are many unreachable? but that's a different topic), but there are
52! desired permutations.
So, how do we program VMS to "truly" shuffle a deck? That is, how
do we give every one of the 52! permutations an equal chance of
coming up.
Perhaps 2^32 is plenty enough shuffles that it's not worth pursuing,
but certainly from an academic perspective it's an interesting
question.
/Eric
|
664.19 | Carry results forward and repeat | MODEL::YARBROUGH | | Thu Feb 12 1987 09:32 | 19 |
| Starting from the sequence 1..52 and a different random seed each time, you
can in principle generate 2^31+1 permutations. (The period of the VAX RNG
is chosen that way because it is prime. It's not guaranteed that these
permutations are all distinct, although it is very probable.) Then it's
time to recycle the random numbers.
Next time, start each calculation with the last permutation generated by the
previous cycle. This gives a different set of 2^32 permutations. Again,
it's not guaranteed that there won't be two alike, but if there are, they
will have been generated by a different seed.
Repeat this process. It will duplicate a sequence of permutations only when
the last permutation of the cycle is 1..52.
I don't have the energy to figure out what the period of this process is.
A group theory expert could tell you. However, if the period of the RNG is
indeed prime as it is supposed to be, then there is a good chance that the
entire sequence of 52! permutations can be generated this way, and any
subsequence can be found from the appropriate seed and initial permutation.
|
664.20 | | BEING::POSTPISCHIL | Always mount a scratch monkey. | Thu Feb 12 1987 09:43 | 21 |
| Re .18:
Find a number, a, relatively prime to 52! such that (a-1) is divisible
by four and not eight, and by the first powers but not the second
powers of 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 39, 41, 43, and 47.
Take another number, c, relatively prime to 52!, and your random number
generator is Xnext = (a*Xcurrent + c) mod 52!.
The potency of this generator is 25, which is enormous, and I would bet
it passes the spectral test. However, since we would be using every
bit of the generated number, patterns would show up, so it might be
better to use "mod 52!^n", where n is some integer greater than one.
Use this generator in Gilbert's algorithm: Given a generated number,
r[0], let x[i] be the quotient and r[i+1] be the remainder of dividing
r[i] by 52!^n/P(52,i+1), as i goes from 0 to 51, and where P(m,n) is
the number of permutations of n objects selected from m objects.
Return the x's.
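The quotient/remainder decoding generalizes to any deck size; here is a small sketch in Python using the factorial number system, in the spirit of the scheme above (the helper name `decode_permutation` is mine):

```python
from math import factorial

def decode_permutation(r, n):
    # Decode an integer r in [0, n!) into a permutation of 1..n by
    # peeling off mixed-radix digits: at each step the quotient picks
    # one remaining card and the remainder carries to the next step.
    pool = list(range(1, n + 1))
    out = []
    for i in range(n - 1, -1, -1):
        q, r = divmod(r, factorial(i))
        out.append(pool.pop(q))     # q is in 0..i, selecting one card
    return out
```

Distinct values of r give distinct permutations, so a generator whose state space covers [0, n!) can in principle reach them all.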
-- edp
|
664.21 | A reference seen on this... | ENGINE::ROTH | | Wed Feb 25 1987 08:39 | 14 |
| Re .20 - that's pretty elegant. However, there are probably cheaper
ways of getting more 'state' into a random number generator, for
example an additive generator based on a primitive trinomial will
have an astronomical period and only requires a few integer adds
per number. I know such generators have been proposed for encryption,
but don't have the list handy (many are based on Mersenne primes).
What brings this to mind is that I happened to come across this
paper yesterday while browsing around for something else -
"Faster Methods for Random Sampling", J. Vitter
CACM, July, 1984
- Jim
|
664.22 | Typo alert! | ZFC::DERAMO | Daniel V. D'Eramo | Tue Dec 01 1987 12:40 | 11 |
| Re: .19
2^31 + 1 is not prime; however, 2^31 - 1 is.
Re: .20
You do not need to include 39 in the list of numbers, because 3
and 13 are already in the list (39 = 3 * 13). Its inclusion
is not incorrect, it is just redundant.
Dan
|
664.23 | Shake well before using. | ZFC::DERAMO | Daniel V. D'Eramo | Tue Dec 01 1987 18:51 | 29 |
| The last swap of X[1] with itself in .1 can be avoided by
for i := n downto 2 do
swap(X[i], X[random integer in {1,...,i}])
Instead of verifying this by calculating probabilities as in
.14, I like to think of this algorithm as putting X[1]
through X[n] into a bag, shaking them up, and WITHOUT
LOOKING reaching in and taking one. That goes into X[n].
Now repeat this, with the bag now holding i=n-1 elements,
and put the randomly selected one into X[i] = X[n-1].
Repeat for i=n-2,...,2 -- that is, until there is just one
element left in the bag. Put that one into X[1]. The "bag"
at any one point in the loop is {X[1],...,X[i]} and the swap
& decrement of i takes a random element out of the bag and
puts it into its final resting place in the array.
If you did this with a real deck of cards and a real paper
bag, then it seems obvious that all of the 52! possible
permutations are equally likely [if you shake the bag well
enough. :^)].
I would suggest that both this method from .1 [as well as
the fair method in .4] could be used to test your favorite
random number generator. Shuffle <1,2,3,4,5> a few thousand
times and record how many times each permutation comes up.
Then run your favorite statistical test on the results.
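A sketch of that test in Python, with the standard library's generator standing in for "your favorite" (the seed and trial count are arbitrary choices of mine):

```python
import random
from collections import Counter

def shake_the_bag(x, rng=random):
    # The .1/.23 shuffle: for i = n downto 2 (0-based: len-1 downto 1),
    # swap x[i] with a uniformly chosen element of x[0..i].
    x = list(x)
    for i in range(len(x) - 1, 0, -1):
        j = rng.randrange(i + 1)
        x[i], x[j] = x[j], x[i]
    return tuple(x)

random.seed(1987)                   # fixed seed so the run is repeatable
trials = 60000
counts = Counter(shake_the_bag(range(1, 6)) for _ in range(trials))
# 5! = 120 permutations, each expected about trials/120 = 500 times;
# a permutation that never appears, or counts far from 500, would
# indict the generator rather than the algorithm.
```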
Dan
|