[Search for users] [Overall Top Noters] [List of all Conferences] [Download this site]

Conference rusure::math

Title:Mathematics at DEC
Moderator:RUSURE::EDP
Created:Mon Feb 03 1986
Last Modified:Fri Jun 06 1997
Last Successful Update:Fri Jun 06 1997
Number of topics:2083
Total number of notes:14613

1930.0. "Crux Mathematicorum 2000" by RUSURE::EDP (Always mount a scratch monkey.) Thu Jan 05 1995 14:22

    Proposed by Marcin E. Kuczma, Warszawa, Poland.
    
    A 1000-element set is randomly chosen from { 1, 2, ..., 2000 }.  Let p
    be the probability that the sum of the chosen numbers is divisible by
    5.  Is p greater than, smaller than, or equal to 1/5?
T.RTitleUserPersonal
Name
DateLines
1930.1CSC32::D_DERAMODan D'Eramo, Customer Support CenterFri Jan 06 1995 14:189
>> Is p greater than, smaller than, or equal to 1/5?
        
        Spoiler:
        
        Yes.
        
        :-)
        
        Dan
1930.2Anybody have the exact value for C(2000,1000) ?EVMS::HALLYBFish have no concept of fireFri Jan 06 1995 16:2611
    I think p is exactly 1/5.
    
    It seems to me one can use some sort of pigeonhole principle to
    associate every multiple-of-5 with 4 other non-multiples-of-5.
    
    Of course if you could do that then you ought to be able to use
    similar logic to construct 5-tuples of subsets of 10 elements
    randomly drawn from {1, 2, ... 20}. But there are 184756 of those!
    p *can't* be exactly 1/5 for this smaller version of the problem.
    
      John
1930.3CSC32::D_DERAMODan D'Eramo, Customer Support CenterFri Jan 06 1995 20:4444
> -< Anybody have the exact value for C(2000,1000) ? >-
        
        C(2000,1000) is divisible by 5.
        
        The number of times a prime p divides n! is
        
        	[n/p] + [n/(p^2)] + [n/(p^3)] + ...
        
        where [x] is the greatest integer less than or equal to x.
        (The above "infinite" series is actually eventually zero.)
        
        So 5 divides C(2000,1000) (but 25 doesn't).
        
        C(2000,1000) is about 2.048 * 10^600, or approximately
        
        	2048151626989489714335162502980825044396
        	4248879813970338203826376717481862020837
        	5582893299418261020620146476631999802369
        	2415481798004524792018047549769261578563
        	0128966343206471485115239525165122776858
        	8611539546256147907378668464154444533617
        	6137700738556738145896300713065104559595
        	1447988874620636871851455182855117316627
        	6253663773084682932255389049743859481431
        	7550307837964443708100851637248274627914
        	1701661988376484084354143081778594703774
        	6565188475514680749694674923803033101818
        	7232980096685674585602525499101181135253
        	5346588879419666536749045113061100963119
        	06270342502293155911108976733963991149120
        
        The symmetry "feels like" it should be close.  And smaller
        cases are: there are 252 five-element subsets of { 1, 2, ..., 10 },
        and the sums of the five elements are
        
        	sum mod 5       # combinations
        	---------       --------------
        	    0                 52
        	    1                 50
        	    2                 50
        	    3                 50
        	    4                 50
        
        Dan
1930.4Proof by inductionSTKAI1::T_ANDERSSONThe Tank EngineMon Jan 09 1995 11:3261
Answer:

    p = 1/5


Proof (not complete):

    Let

	S = set of 2000 numbers 0, ..., 2000

	Sr (r = 0, ..., 4) = set of all numbers N: N mod 5 = r

        "N $ Sr" mean that N is a member of Sr

	"xyz..." represent a strem of numbers N1 $ Sx, N2 $ Sy, N3 $ Sz, ...

    Pick one number N1 from S at random. Obviously p(N1 $ S0) = 1/5.

    Pick another number N2 (not N1) from S at random. Let N = N1 + N2.
    Now p(N $ S0) = 1/5, because of symmetry:

	p(00) = 400/2000 * 399/1999
	p(01) = p(02) = p(03) = p(04) = 400/2000 * 400/1999

	p(11) = 400/2000 * 399/1999
	p(10) = p(12) = p(13) = p(14) = 400/2000 * 400/1999

	...

	p(44) = 400/2000 * 399/1999
	p(40) = p(41) = p(42) = p(43) = 400/2000 * 399/1999

    and

	p(N $ S0) = p(00) + p(14) + p(23) + p(32) + p(41) =

	= (400*399 + 400*400*4)/(2000*1999) = (400*1999)/(2000*1999) = 1/5.

    In fact,

	p(N $ S0) = p(N $ S1) = ... = p(N $ S4) = 1/5.

    This means that when a third number N3 (not N1 or N2) is picked from S
    at random, the same argument as above can be used again, only with
    slightly different numbers, to show that p((N1 + N2 + N3) $ S0) = 1/5
    etc.

    Hence, if NsumX = N1 + N2 + ... + Nx, then induction shows that, if
    
	p(Nx $ S0) = p(Nx $ S1) = ... = p(Nx $ S4) = 1/5

    then

	p(Nx+1 $ S0) = p(Nx+1 $ S1) = ... = p(Nx+1 $ S4) = 1/5

    for all x<1999.
    
    Induction breaks down for x = 1999 when certain probabilities are
    multiplied by zero.

1930.5CSC32::D_DERAMODan D&#039;Eramo, Customer Support CenterMon Jan 09 1995 14:0940
        Symmetry arguments for p=1/5 for drawing 1000 elements from
        { 1, 2, 3, ..., 2000 } should note the following cases of
        drawing n elements from { 1, 2, 3, ..., 2n }

	n=5	sum mod 5       # combinations (252 total)
        	---------       --------------
        	    0                 52
        	    1                 50
        	    2                 50
        	    3                 50
        	    4                 50
        
	n=10	sum mod 5       # combinations (184756 total)
        	---------       --------------
        	    0                 36956
        	    1                 36950
        	    2                 36950
        	    3                 36950
        	    4                 36950
        
	n=15	sum mod 5       # combinations (155117520 total)
        	---------       --------------
        	    0                 31023520
        	    1                 31023500
        	    2                 31023500
        	    3                 31023500
        	    4                 31023500
        
        The 'sum mod 5 = k' counts are the same for k=1,2,3,4 but
        the count for k=0 is higher by 2, 6, 20 for n=5,10,15.  In
        the n=15 case, the total number of combinations is itself
        divisible by 5, making a p=1/5 result theoretically possibile,
        although the actual counts showed otherwise.
        
        Note that while the absolute differences 2,6,20 increased,
        the relative differences  2/252, 6/184756, 20/155117520
        decreased.  Will the n=1000 case have p ever so slightly
        greater than 1/5?
        
        Dan
1930.6Bigger (by 4c(400,200)/5c(2000,1000) - rel. small :-)IOSG::TEFNUT::carlinDick Carlin IOSG ReadingFri Jan 20 1995 12:4888
Using Dan's notation in .5, the "discrepencies" 2, 6, 20 ... are given by the 
c(2r,r) term in the following formula:

(n(x,y) = number of sets of y numbers out of x summing to 0 mod5)
(c(x,y) = x!/y!(x-y)!)

n(10r,5r) = c(10r,5r)/5 + 4c(2r,r)/5		(I)

which is a particular case of the following:

n(5a,5b) = c(5a,5b)/5 + 4c(a,b)/5    (a >= b >= 0)	(II)

To prove (II) by induction we need to establish the recurrence relation that 
gets us from a to a+1. To do this we temporarily introduce a more general 
n(5a,5b,c) which is the number of sets summing to c mod5, where c is 0 or 
+/-1 or +/-2.
Obviously n(5a,5b) = n(5a,5b,c).

Partition {1, ... ,5a+5} as follows and consider how 5b numbers could be 
apportioned between the partitions:

	{1, ... ,5a}   | {5a+1, ... ,5a+5}
	----------------------------------
	       5b      |      0			(III)
	       5b-1    |      1			(IV)
	       5b-2    |      2			(V)
	       5b-3    |      3			(VI)
	       5b-4    |      4			(VII)
	       5b-5    |      5			(VIII)

The contributions to n(5a+5,5b) are as follows:

(III)	n(5a,5b)

(IV)	  n(5a,5b-1,0)*n(5,1,0)
	+ n(5a,5b-1,1)*n(5,1,-1)
	+ n(5a,5b-1,-1)*n(5,1,1)
	+ n(5a,5b-1,2)*n(5,1,-2)
	+ n(5a,5b-1,-2)*n(5,1,2)

	and since n(5,1,z) = 1 this simplifies to

	c(5a,5b-1)

(V)	similar to (IV) but n(5,2,z) = 2 so we get

	2c(5a,5b-2)

(VI)	similar reasoning leads to

	2c(5a,5b-3)

(VII)	similar reasoning leads to

	c(5a,5b-4)

(VIII)	n(5a,5b-5)

so the recurrence relation we need is:

n(5a+5,5b) = n(5a,5b) + n(5a,5b-5) +
             c(5a,5b-1) + 2c(5a,5b-2) + 2c(5a,5b-3) + c(5a,5b-4)  (IX)

which must be supplemented by:

	n(5a+5,0) = 1
	n(5a+5,5a+5) = 1

which the recurrence relation won't provide, but they are obvious.

Applying (IX) to (II) we need to prove that:

  c(5a,5b)+5c(5a,5b-1)+10c(5a,5b-2)+10c(5a,5b-3)+5c(5a,5b-4)+c(5a,5b-5) 
  = c(5a+5,5b)

  c(a,b) + c(a,b-1) = c(a+1,b)

both of which are Pascal triangle properties (G=A+B+C+D+E+F and G=H+I)

   .   .   .   .   .   .   .   .
     F   E   D   C   B   A   .   
   .   .   .   .   .   .   .   .
     .   .   .   .   .   .   .
   .   .   .   .   .   .   .   .
     .   .   I   H   .   .   .
   .   .   .   G   .   .   .   .

Dick
1930.7cosmetic nitAUSSIE::GARSONachtentachtig kacheltjesSat Jan 21 1995 22:247
re .-1
    
>both of which are Pascal triangle properties (G=A+B+C+D+E+F and G=H+I)
    
    A+B+C+D+E+F must be a weighted sum where the weights are a row from
    Pascal's triangle. The weights are present where you actually use this
    result so this is a cosmetic issue only.
1930.8CSC32::D_DERAMODan D&#039;Eramo, Customer Support CenterThu Feb 02 1995 11:3915
	re .6 (Dick Carlin)
        
>Using Dan's notation in .5, the "discrepencies" 2, 6, 20 ... are given by the 
>c(2r,r) term in the following formula:
>
>(n(x,y) = number of sets of y numbers out of x summing to 0 mod5)
>(c(x,y) = x!/y!(x-y)!)
        
        c(2r,r) for r=1,2,3,4 is 2, 6, 20, 70.  2,6,20 were in an
        earlier reply.  A background computation started Jan 9 (before
        your reply .6 on Jan 20) finished early this morning.  I let
        it finish to test your formula. :-)  It confirmed that the
        (40,20) or r=4 discrepancy is indeed 70.
        
        Dan
1930.9IOSG::TEFNUT::carlinDick Carlin IOSG ReadingFri Feb 03 1995 13:1515
Derek (is that right?)

Thanks for ploughing through the solution. Are these "Crux" questions 
originally set under exam conditions? If so, I don't think I'd fare that 
well. As with most problem solving exercises these days, I seem to need the 
odd 24 hr diversion interspersed before I go back to them with a bit more 
inspiration.

Spotted another typo:
Obviously n(5a,5b) = n(5a,5b,0).

Dan

You left it running for nearly a month!? What patience!

1930.10CSC32::D_DERAMODan D&#039;Eramo, Customer Support CenterFri Feb 03 1995 14:218
>>You left it running for nearly a month!? What patience!
        
        Not really.  At "nice -20" it wasn't taking much out of my
        DECstation 5000/25 :-).  It did shove aside my longstanding
        background process which is participating in a distributed
        factoring effort.
        
        Dan
1930.11RUSURE::EDPAlways mount a scratch monkey.Fri Feb 03 1995 14:2911
    Re .9:
    
    Most of the problems in Crux's problem column are initially published
    there.  Crux also reprints Olympiad and other problems.
    
    
    				-- edp
    
    
Public key fingerprint:  8e ad 63 61 ba 0c 26 86  32 0a 7d 28 db e7 6f 75
To find PGP, read note 2688.4 in Humane::IBMPC_Shareware.
1930.12simple counting argumentHERON::BUCHANANEt tout sera bien etFri Feb 10 1995 10:3424
	One way to see that the answer given in the last few replies is
correct is as follows. We want to "match off" the subsets of {1,...,2000}
5 at a time, such that in each matching, one subset adds up to 0, one subset
adds up to 1, ... , and one subset adds up to 4, mod 5. Then we will see
that there are some subsets left over, and hopefully we can count these.

	So let's consider the subsets which contain exactly k elements of
the set {21,22,23,24,25}. For k=1, it's obvious that the subsets can be 
matched off suitably. E.g.: S+{21}, S+{22}, S+{23}, S+{24}, S+{25} will be
a matching for any S disjoint with {21,22,23,24,25}. Similarly for k = 2,3
or 4. We only have subsets "left over" if k=0 or 5.

	So now we can restrict ourselves to subsets which either contain or are
disjoint with {21,22,23,24,25}. Consider those which contain exactly k elements
of the set {101,102,103,104,105}. Once again, we get a matching except for those
for which k = 0 or 5.

	Carrying on this process, we find that the only subsets which are
left over are those which contain or are disjoint with the sets T_j 
= {5j+1,5j+2,...5j+5} for all j. There are exactly 400!/200!200! of these,
as predicted.

Cheers,
Andrew.