[Search for users] [Overall Top Noters] [List of all Conferences] [Download this site]

Conference rusure::math

Title:Mathematics at DEC
Moderator:RUSURE::EDP
Created:Mon Feb 03 1986
Last Modified:Fri Jun 06 1997
Last Successful Update:Fri Jun 06 1997
Number of topics:2083
Total number of notes:14613

191.0. "Random tiles" by TURTLE::GILBERT () Sun Dec 02 1984 16:33

Problem #1:

	A large urn contains an unknown number of tiles marked "0" and "1"
	(although there are some of each, the number of each is unknown).
	The tiles are randomly sampled and replaced, and the numbers written
	in sequence, eg: "010010101001010101101111...".  Given such a sequence,
	how can it be transformed into a random sequence of "0"s and "1"s?
	(this problem is equivalent to making a biased coin fair).  That is,
	the transformation should result in a sequence where each position in
	the sequence is independent of all other positions, and "0"s and "1"s
	are equally likely to occur.

Problem #2:

	The large urn now contains an unknown number of tiles marked "0", "1",
	"00", "01", "10" and "11".  Again they are sampled and replaced.  How
	can the resulting sequence be transformed into a random sequence of
	"0"s and "1"s?
T.RTitleUserPersonal
Name
DateLines
191.1TURTLE::STANTue Dec 04 1984 12:202
Can the transformation start out biased and slowly converge to
a fair algorithm? or must it be correct immediately from the start?
191.2TURTLE::GILBERTTue Dec 04 1984 13:066
For the first problem, there is a solution that's 'fair' from the start;
it involves "tossing out" certain substrings.

I have no solution for the second problem; whatever you have would be
appreciated, since a solution implies a way to 'balance' a biased set of
random samplings to produce a truly random number.
191.3ULTRA::HERBISONTue Dec 04 1984 16:2526
If you have a solution to the first problem, it can be applied to the
second problem -- just ignore all tiles with more than one digit on them.

Another way to tackle the problem would be to treat the output from the
urn as three separate streams, as follows.

    I believe that a solution to the first problem is to divide the
    output stream into pairs of digits.  All "0" "0" pairs and "1" "1"
    pairs are tossed out, all "0" "1" pairs produce a "0" output and
    all "1" "0" pairs produce a "1" output.

    Use that same algorithm three times for the second problem;
    once with " 0" " 1" producing "0" and " 1" " 0" producing "1",
    once with "00" "01" producing "0" and "01" "00" producing "1", and
    once with "10" "11" producing "0" and "11" "10" producing "1".
    Each of these three procedures goes on simultaneously -- like
    dividing the input into three streams and combining the output.

    An input of "1" "00" "00" "10" "0" "11" would produce
    an output of "1" "0". [The "1" gets saved, the "00" gets saved,
    the "00" gets paired with the previous "00" and dropped, the "10"
    gets saved, the "0" gets paired with the "1" and a "1" is output,
    and the "11" gets paired with the "10" and a "0" gets output.]

You could also try a complicated scheme. :-)
						B.J.
191.4ULTRA::HERBISONTue Dec 04 1984 16:3919
Re: .3

The answer seemed to simple.  I now realize that you can only work with
the output stream and can't tell what tiles the digits came from.  This
means that my solution in .3 does not apply and the problem gets harder.


OK, how about this:  From the stream of output, throw away the digits
in the even positions, this eliminates any special relationships between
any adjacent pairs of digits in the series.

After some initial segment of the series, the probability of a digit
being a "1" is the same as the probability that a particular digit on
a tile is a "1".  [A tile with "00" on it counts as 2 zeros, "01" is one
zero and a one, "1" is a one, etc.]

This means that you can use the same method that you used in the first
problem and get a random series.
						B.J.
191.5TURTLE::GILBERTTue Dec 04 1984 17:474
Thanks, that's for what I was looking (hanging prepositions are something
up with which I will not put).

Does dropping every other digit really make the now adjacent digits independent?
191.6RANI::LEICHTERJFri Dec 07 1984 01:0446
Throwing away every other digit does NOT make the adjacent digits independent.
Suppose the pot contains 1/3 "0" tiles, 2/3 "10" tiles.  The initial string
then has 2/3rd's 0's, 1/3rd 1's.  Throwing out every other digit has no
effect on this distribution.  (Suppose it did.  Let the original sequence be
S, the Evens, which we kept, E, the Odds, which we discarded, O; S=E U O.
It's clear from the symmetry of the situation that E and O must have the same
distribution of 0's and 1's.  That distribution is hence the same as S's.)

But now working with E, suppose we see a 1.  In S, it MUST have come from
the first position of a "10" tile.  What could have followed the "10" tile?
Either a "0" - 1/3rd of the time; or a "10" - 2/3rds of the time.  If it
was a "0", the next digit in E is that 0.  If it was a "10", the next digit
in E is the leading 1.  Hence, a 1 in E has a 2/3rds chance of being followed
by another 1.  Since we know that over-all 2/3rds of the digits are 0's,
however, the chance of a 1 following a 0 must be much less - certainly less
than 1/3.  (In fact, it's not hard to calculate.  If I see a 0 in E, it
came either from a "0" tile - 1/3 chance - or a "01" tile - 2/3 chance.
Whichever it was, the only way to get a following 1 is for S at this point to
have continued "0" "10" - a 1/3 * 2/3 = 2/9 chance.)

Hence, in the resulting sequence:

	P(0|0) = 7/9
	P(1|0) = 2/9

	P(0|1) = 1/3
	P(1|1) = 2/3

Hardly uncorrelated!

Note that for S, the probabilities are:

	P(0|0) = 1/3
	P(1|0) = 2/3

	P(0|1) = 1
	P(1|1) = 0

Suppose we measure deviation from independence as the sum of the squares
of the differences between the probabilities for given conditions.  (That
is, m = (P(0|0)-P(0|1))^2 + (P(1|0)-P(1|1))^2); this is clearly 0 exactly for
pairwise-uncorrelated streams.)  m(E) = (7/9-1/3)^2+(2/9-2/3)^2 = 32/81.
m(S)=(1/3-1)^2+(2/3-0)^2 = 8/9 = 72/81.  So the intuition behind tossing the
odd digits is ok - we HAVE improved things.

							-- Jerry
191.7TAV02::NITSANMon Dec 10 1984 05:4022
 I don't understand something. What's wrong with the following transformation,
done by the following "black box":

                   +-------+
   input string    | black |    output string
          -------> |  box  | ------->
                   +-------+

which is really:

                   +------------------+
   input string    |      +--------+  |    output string
          -------> |--.   | pseudo |--| ------->
                   |  |   | random |  |
                   | ___  | gener. |  |
                   |  -   +--------+  |
                   +------------------+

after all IT IS a well defined transformation on the input string (take the
next output bit of the... pre-defined shift register, instead of the next
input bit)...???
                                            NITSAN
191.8TAV02::NITSANMon Dec 10 1984 06:576
...or is it not?

 In order to give the "formula" of this "transformation" we should either
consider the current bit sequence number, or the whole input string until now
as arguments, and what's more, the output is as random as a "pseudo" random
generator can be...
191.9TURTLE::GILBERTMon Dec 10 1984 15:5527
re .7.8
	The problem is to convert a truly random sequence of tiles into a
	truly random sequence of independent 1s and 0s; where the distrubution
	of tiles is unknown.

re .6
	In the first paragraph..., if 1/3 are "0" tiles and 2/3 are "10" tiles,
	then 3/4 of the digits are 0's, and 1/4 are 1's.  This doesn't affect
	the argument that adjacent digits are not independent.

re .3.4
	This is how I understand the method:

	    Delete every other digit in the original sequence.
	    Now delete pairs of adjacent 0's and pairs of adjacent 1's.
	    Delete every other digit.

	For example:
	    01010001010101110101011111010100101010111110101110101
	    0 0 0 0 0 0 0 1 0 0 0 1 1 0 0 0 1 1 1 1 1 1 1 1 1 1 1
	                0 1     0 1 1     0 1
	                0       0   1       1
	(the aesthetic pattern is coincidental)

	Does this method (as a whole) produce a digit-wise independent sequence?

	The use of Markov chains might help answer this question.
191.10METOO::YARBROUGHMon Dec 10 1984 16:058
One way of assuring that 0's and 1's are equiprobable is to complement every
2nd output. This leaves some 2nd-order corellations, which can be handled
by grouping the outputs in pairs and complementing every one of those. Keep
this up (i.e. groups of 4, 8, ...) until you are happy with the degree of
uncorrelation that results. (TYPO! complement every 2nd group of whatever
size.) The group sizes can be anything, not just powers of 2.

Lynn Yarbrough
191.11RANI::LEICHTERJSun Dec 16 1984 15:1643
re: .10
No deterministic technique like this can possibly work.  The situation is best
modelled as a game against an adversary.  The adversary will supply you with
a series of 0's and 1's.  You must generate a random, uncorrelated serious of
0's and 1's.  By hypothesis, you do not have access to a random number generator
(that is, you can be modeled as a (deterministic) Turing machine).

The adversary know ahead of time what algorithm you will run.  (This is just a
way of saying that your algorithm must work given ANY input string, and that
you can't change your algorithm after seeing the string.)

It's clear that no solution exists without some "randomization" constraint on
the adversary.  (Otherwise, he can force you to output nothing but 0's, for
example, by inverting your algorithm and feeding in whatever will make all 0's
come out.  Of course, there may be NO input that produces all 0's; to take a
trivial case, you could always output 01010101....  But then you aren't a
source of random numbers.  In fact, if your adversary simply feeds you all
0's, you can't possibly "add randomness" - you are now completely determinis-
tic.)

So how are we to constrain the adversary?  It's not easy.  What you'd like
to say is that he generates 0's and 1's with n'th order statistics which
never show a probability of either 0 or 1, for all n.  It's not clear how to
formalize this.  Here's a stab at it:  The adversary chooses an infinite
number of random bits S[i].  Each bit is chosen from a biased, but random,
source of 0's and 1's; the sources are different for each S[i].  Each of the
sources must be independent of the others, and the bias of each toward both 0
and 1 must be <1.  (That is, there is a non-0 probability of getting a 0, and
of getting a 1.) Let the adversary's stream be A{i}.  The adversary then
computes:

	A{1} = S[1]
	A{i+1} = Fi(A{1},...,A{i}) XOR S[i+1]

where each of the Fi are (arbitrary) Boolean functions.  For example, an
adversary can make the A{i}'s "almost constant" by choosing Fi to return
it's i'th argument, and S[i] from a distribution heavily biased toward 0.
(If S[i] is from a distribution heavily biased toward 1, the resulting
stream "almost alternates" between 0 and 1.)

Does this sound like a reasonable model?  Does anyone have any ideas?

						-- Jerry
191.12TURTLE::GILBERTMon Dec 17 1984 00:156
re.10
	I'd never be satified with the results.  Consider the case where there
	are 9999 "0" tiles, and 1 "1" tile.  The algorithm may produce a random
	looking sequence, but would be easily reproducible (and thus unsuitable
	for cryptographic purposes), and doesn't take advantage of the positted
	randomness of the drawn tiles.
191.13TURTLE::GILBERTMon Dec 17 1984 18:5040
The following uses a finite-state machine to solve the first problem.
The finite-state machine starts in state START and reads 0s and 1s from
the sequence.  The entries in the table indicate the next-state, and the
character to write the output (if any).

	   \  next input character
	    \
	state\|	"0"	| "1"     |
	------+---------+---------+
	START | GOT_0	| GOT_1   |
	GOT_0 | START	| START,1 |
	GOT_1 | START,0	| START   |
	------+---------+---------+

A better diagram is the following, where p0 and p1 are the probabilities of
a "0"-tile and a "1"-tile, respectively.

		p0		       p1
	     /-------->	+-------+ <--------\
	    /		|	|	    \
	+-------+   p0	|	|  p1	+-------+
	| GOT_0 |<------| START	|------>| GOT_1 |
	+-------+	|	|	+-------+
	    \	p1,0	|	|    p0,1   /
	     \-------->	+-------+ <--------/

Null transitions can be removed, giving:

		 p0*p0		   p1*p1
		+----->	+-------+ <-----+
		|	|	|	|
		+-------|	|-------+
			| START	|
		+-------|	|-------+
		|p0*p1,0|	|p0*p1,1|
		+----->	+-------+ <-----+

Simply sitting in the START state without producing any output has
no effect on the probabilities of 0s and 1s in the output sequence.
Note that the probabilities of generating a 0 or a 1 are equal.
191.14HARE::STANThu Dec 20 1984 00:285
Call a tile large if it contains two digits.  Call it small if
it contains 1 digit.

:-) Problem 2 can be reduced to problem 1 by throwing away any large tiles
    drawn.
191.15TURTLE::GILBERTThu Dec 20 1984 14:221
Ah, but you aren't told that a double tile is drawn.
191.16CLT::GILBERT$ no /nono vaxnotesFri Jul 11 1986 22:3910
Would it help if the second puzzle were simplified, so that the set
of tiles is simply "0", "1" and "00"?

With this in mind, note that the solution given in 191.13 generates
a 0 for every odd string of "0"s, and a 1 for every odd string of "1"s.
If the tiles are simply "00" and "1", we could generate a 0 whenever
a string of consecutive 0s has length 2 modulo 4, and a 1 when the
length of a string of consecutive 1s is odd (equal to 1 modulo 2).

Might a solution for the "0,1,00" problem be somewhat similar?
191.170, 1, & 00GALLO::JMUNZERMon Jul 14 1986 15:5021
Adapting the finite state machine of .13 to the problem of .16, how about:

The finite-state machine starts in state START and reads 0s and 1s from
the sequence.  The entries in the table indicate the next-state, and the
character to write the output (if any).

	      \   next input character
	       \
	state   \|   "0"    |   "1"    |
	---------+----------+----------+
	START    | START    | GOT_1    |
	GOT_1    | GOT_10   | GOT_11   |
	GOT_10   | START    | GOT_101  |
	GOT_101  | START    | GOT_1011 |
	GOT_1011 | GOT_1011 | GOT_1,0  |
	GOT_11   | GOT_110  | GOT_1    |
	GOT_110  | START    | GOT_1101 |
	GOT_1101 | GOT_1101 | GOT_1,1  |
	---------+----------+----------+

John
191.18KOBAL::GILBERTOwnership ObligatesMon Dec 19 1988 19:294
Let p(0), p(00), and p(1) be the probabilities of a 0-tile, a 00-tile, and
a 1-tile, respectively.  A problem with .17 is that if the p(0) = 0, the
finite state machine will never output anything, even though some machine
should be able to generate random numbers from the sequence of 00's and 1's.