[Search for users] [Overall Top Noters] [List of all Conferences] [Download this site]

Conference rusure::math

Title:	Mathematics at DEC

Moderator:	RUSURE::EDP

Created:	Mon Feb 03 1986
Last Modified:	Fri Jun 06 1997
Last Successful Update:	Fri Jun 06 1997
Number of topics:	2083
Total number of notes:	14613

1068.0. "Door-number problem" by TRIBES::CREAN (A closed mouth gathers no feet) Wed Apr 26 1989 06:28

    A strange problem I encountered recently.
    
    I bought a new front door for my house (the previous one
    got smashed in by burglars). I needed a new number for
    this door. Where I live (Galway City) most streets are
    quite short, numbers rarely exceed 30 - 40 and are often
    much less. My house number is 21.
    I visited six hardware stores and all of them could sell me
    a '2' but none of them had a '1'.  "We always run out of
    ones!" I was told.
     
    Why do they run out of ones?

T.R	Title	User	Personal Name	Date	Lines
1068.1		DWOVAX::YOUNG	Sharing is what Digital does best.	`Wed Apr 26 1989 06:54`	1
	Because about 50% of all numbers used start with a 1.
1068.2	Where's my slipstick when I need it?	NIZIAK::YARBROUGH	I PREFER PI	`Wed Apr 26 1989 11:55`	6
	Someone once remarked that his log table was dirtier near the front, for essentially the same reason. Of all numbers occurring in nature, a disproportionate number - I think it's something like ln(radix) - start with 1. Someone want to work out the exact probability? You can get a good approximation by measuring the length between 1 and 2 on a slide rule, if you can still find one. - Lynn
1068.3	Topher will be here soon	HERON::BUCHANAN	Andrew @vbo DTN 828-5805	`Wed Apr 26 1989 15:27`	84
	Hang on. When we say "about 50% of all numbers used start with a 1" and "something like ln(radix) start with 1. Perhaps someone can work it out." there is a modelling step involved. What is the basic probability of a number n being used in nature? And how should that relate to housenumbers in Galway (Ireland?) Let me boringly abstract. We have some probability distribution, p(n), over the whole numbers, {1,2,...}. We have a bunch of b functions, fb_i(n), from the whole numbers to the naturals, where b is a base (10 normally) and i runs from 0 through b-1. fb_i(n) is much easier to describe than to write down. Loosely, fb_i(n) is the number of occasions the symbol i appears in the representation in base b of the number n. What we're interested in is the expectation, Eb_i: Eb_i = sum(i=1..%) p(n).fb_i(n) where % denotes infinity. In particular, what can we say about the relative magnitudes of the Eb_i for different i? Clearly this is dependent on p(n). We don't know what p(n) is exactly, but we do know: (1) sum(i=1..%) p(n) = 1 (2) for all n, p(n) >= p(n+1) since it is reasonable to assume that if a street containing house number n+1, it has house number n. (Random aside, this is not always true. In Edinburgh, Scotland, where I grew up, the house on the left was number 1, and the house on the right was number 11. Our house number? 61, of course.) Now it seems to me that (1) & (2) are enough to prove: i > i' >= 1 => Eb_i =< Eb_i' i >= 1 => Eb_0 =< Eb_i Let me write Eb_i in a slightly non-obvious way as: sum(c=0..%)sum(k=0..b^c-1)sum(j=0..%) p(j.b^(c+1) + i.b^c + k) for i <> 0 and the very similar: sum(c=0..%)sum(k=0..b^c-1)sum(j=1..%) p(j.b^(c+1) + k) for i = 0 There is a 1-1 correspondence between the addends of Eb_i and Eb_i': which maps p(j.b^(c+1) + i.b^c + k) to p(j.b^(c+1) + i'b^c + k). By the monotonicity of p, we are proved. Comparing i and 0, the result is very similar except that there are some extra addends corresponding to j=0, which do not appear in Eb_i. However, to go further, I think we need to postulate a specific distribution for the numbers. Let's say that: p(n) = (1-�).�^(n-1) where � is in (0,1). One calculation we can do now is to work out what the expected number of symbols (digits, where b=10) is: Eb = sum(i=0..b-1) Eb_i = sum(c=0..%)sum(k=b^c..%) p(k) = (1/�) sum(c=0..%) �^(b^c) which I don't think has any neat analytic form. In our case, the answer is 1 + �^9 + �^99 + �^999 + ... So this means tht we shouldn't expect to see a neat analytic form for all the constituent Eb_i, and indeed I don't. Eb_0 = sum(c=0..%) (1-�^(b^c)).(1/�).�^(b^(c+1))/(1-�^(b^(c+1))) Eb_i = sum(c=0..%) �^(i.b^c-1).(1-�^(b^c))/(1-�^(b^(c+1))) In no case is there a hint of log(b). Anyone see a log lurking anywhere? Andrew.
1068.4	Leading digits.	CADSYS::COOPER	Topher Cooper	`Wed Apr 26 1989 16:58`	30
	We're dealing with several different things here. First the door number problem: We know that streets in the town rarely have numbers higher than 40 on them (a given). If a typical street has numbers 1-20, then there is one 1 for 1, ten more for the first digit of 10-19, and one more, in addition, for twelve 1's out of the twenty house numbers (involving 31 separate digits). After that (until we get to 100) there is one 1 out of every 10 house numbers. If there are few streets with over 40 numbers, than it is not unreasonable to suppose that most streets have twenty house numbers or a few more. 50% seems reasonable. Next the ubiquity of leading low digits. There is a well documented and oft observed tendency for most of the leading (most significant) digits to be low (1,2,3) rather than high (7,8,9) in tables of figures. This is not quite the same as "numbers in nature", since it involves tabulations of frequently quite artificial quantities and does not involve quantities like fundamental constants at all. Closer examination does seem to show a rough logrithmic scale. I've seen several theoretical explanations of this phenomena, most but not all of which are similar. The most recent I've seen is an explanation by Mandelbrot in terms of fractals. It would not be surprising if several different mechanisms (e.g., rounding practices, rescaling to produce "nice figures",the effects of overlapping, approximately uniform distributions, and circumstances like the house numbering phenomenon) contribute, in different tables to different extents. If you wish to observe this phenomenon yourself pick up virtually any almanak, pick some tables, and start counting leading digits. Topher
1068.5	Not a leading-digits problem	POOL::HALLYB	The Smart Money was on Goliath	`Thu Apr 27 1989 08:49`	24
	The interesting part of the problem -- that of leading digits -- has been around for almost 100 years. I think Knuth has it worked out somewhere in his first 3 volumes, and it IS logarithmic. It's not clear how much THAT problem has to do with the question raised by .0, given that house numbering more resembles the numbers you'd find in telephone book rather than an almanac. Topher's analysis assumes conveniently a mean "last house number" of 20, thereby assigning lots of houses (NPI) to "1" and ignoring the 2-rich 20-29 range. The question in .0 asked specifically about running out of 1s but not 2s, so it seems a bit unfair to arbitrarily select 20 as the mean "last house number". But as it turns out, the same conclusions arise even if you assume a mean "last house number" of 30. With a mean of 30 you'll have some streets that go to 31, fewer that go to 32 (edge to the 1s), and some streets that only go to 27 or 28 (edge to the 1s again). And if you select a mean "last house number" less than 30 you're certainly leaving the 2s behind. And with a practical maximum of 40 it's hard to imagine a mean "last house number" much larger than 30. So of course they run out of 1s if they order in the same quantities as the others. And they must have an absolute glut of 0s. John
1068.6	Very interesting	TRIBES::CREAN	Never play leapfrog with a unicorn	`Thu Apr 27 1989 09:32`	9
	Well, that was interesting. I recall hearing once of 'Zipf' distribution, applied to a problem like this: though perhaps this is a red herring.... Nice to learn that somebody else uses a slide rule. I still have mine from the olden days and I find it a tremendous aid to computation. I keep it beside my terminal and whenever somebody grabs at my calculator I swipe at their knuckles with it.
1068.7	There isn't one answer.	CADSYS::COOPER	Topher Cooper	`Thu Apr 27 1989 11:42`	30
	RE: .5 The problem with saying anyone has "worked the problem out" is that it is a very complex phenomenon. People can make various assumptions about how numbers are selected and manipulated for inclusion and they will get rough or exact exponential laws (if they don't their assumptions are incorrect or incomplete). It's very hard to prove that any answer is the complete right one for all or even most tables or books of tables. I suspect that no single answer is the complete correct one for all tables. Logrithmic laws seem to pop out very naturally in a variety of ways. As for the house numbering problem, first off I did not talk about a mean "last house number" but a typical last house number. The closest formal concept is "mode". I wasn't "cheating" when I used 20 as a number, because I was not attempting to predict that one would be used heavily. Rather I went from an observation (that ones disappear disportionately quickly) and posited the value of a parameter (that the largest house number was typically around 20) that would explain that observation (other explanations, are, of course possible; for example, the kid's at the local high school use them to make signs for sporting events which say "WE ARE NUMBER 1"). I did not assume it, rather I deduced it. Of course, I checked if it was plausible, which, with a general upper limit of 40, it seemed to me to be. Moderate overshoot or undershoot of that parameter, weakens the extent of the phenomenon but does not do so so much that it fails to explain the observations (as you point out). The technical term is that the estimate of the parameter is robust within the problem. Topher
1068.8	A statistically correct proof, I see	POOL::HALLYB	The Smart Money was on Goliath	`Thu Apr 27 1989 13:52`	20
	.7> Rather I went from an observation (that ones .7> disappear disportionately quickly) and posited the value of a .7> parameter (that the largest house number was typically around 20) that .7> would explain that observation Well seeing as how .4 begins with: .4> have numbers higher than 40 on them (a given). If a typical street has .4> numbers 1-20, then there is one 1 for 1, ten more for the first digit it looks to me like you're starting with a desired solution and then noticing that it fits the observation, thus deducing .4> that most streets have twenty house numbers or a few more. All of which is correct as far as it goes. But that alone is not sufficient to answer the question, you also need to show the parameter is robust. My error was failing to deduce that you did that. John
1068.9	My fault.	CADSYS::COOPER	Topher Cooper	`Thu Apr 27 1989 16:08`	10
	Not my day for making myself clear... I was not trying to say that I was right and that you were wrong. On the contrary, you were quite right to critisize .4. I was only attempting to explain what I had meant to explain before, but had failed. Obviously, in doing so, I failed to properly explain what it was that I was explaining. I certainly hope that this isn't a case of infinite regress :-). Topher
1068.10	Try this for insight	NIZIAK::YARBROUGH	I PREFER PI	`Thu Apr 27 1989 16:28`	31
	The following BASIC program counts the digits in the sequences 0..n, where n takes on the values from 0 to 99. It shows that there is a tendency toward the smaller digits (except for 0, since leading 0's are ignored). The digits, their frequency, and the percentage of total are displayed. In other words, if you start numbering things and tend to run out of objects before you get to a power of 10, this program shows the expected frequency of digits used. 10 for i = 0 to 9 20 bin(i) = 0 30 next i 100 for i = 0 to 99 110 for j = 0 to i 120 jj% = j 130 if jj% = 0 goto 200 140 kk% = jj% - 10%*(jj%/10%) 150 bin(kk%) = bin(kk%) + 1% 160 jj% = jj%/10% 170 goto 130 200 next j 210 next i 220 print using " #", i; for i = 1 to 9 230 print using " #", 0 310 tot = bin(0) 320 for i = 1 to 9 330 print using " #####", bin(i); 335 tot = tot + bin(i) 340 next i 350 print using " #####", bin(0) 410 print using " .####", bin(i)/tot; for i = 1 to 9 430 print using " .####", bin(0)/tot
1068.11	A sample...	TRIBES::CREAN	Never play leapfrog with a unicorn	`Fri Apr 28 1989 09:47`	10
	As an experiment, I collected the house-numbers of all the people in the department. They are as follows: 21 85 114 25 8 42 22 16 34 68 It is only a small sample, but the range surprises me. We do not have many long streets in Galway, but the long ones do have more houses in them....