T.R | Title | User | Personal Name | Date | Lines |
---|
963.1 | Then there's the ever-popular and highly NONoriginal: | RDVAX::KALIKOW | The Gods of the Mill grind slowly... | Tue May 05 1992 14:02 | 26 |
| While I don't think it possible to rathole this particular topic, it
puts me in mind of such meta-trivia from the Markovian and Chomskavian
(?) literature:
Time flies like an arrow.
Colorless green ideas sleep furiously.
There must be more of this ilk... Do these humble offerings qualify
for any of the free prizes? :-)
And here's another one recalled from George Gamow's treatise on the
monkeys in the British Museum... I read this as a child and it must
have warped me towards the study of linguistics and info theory...
I like apples cooked in turpentine.
Finally, I quote from one of the recent collections of lightbulb
jokes... This too seems apropos, but from yet another orthogonal
direction from the above.
Q: How many surrealists does it take to change a lightbulb?
A: Two -- one to get the ladder and one to fill the bathtub with
brightly colored machine tools.
|
963.2 | And a third to eat them? | ESCROW::ROBERTS | | Wed May 06 1992 07:37 | 7 |
| re .1
> ...one to fill the bathtub with brightly colored machine tools.
I LOVE it!!!!
-ellie
|
963.3 | RE: .1 | XNTRIK::MAGOON | | Wed May 06 1992 12:41 | 4 |
| That's supposed to be "Time flies like and arrow, fruit flies like a banana."
Larry
~
|
963.4 | RE: .3 | XNTRIK::MAGOON | | Wed May 06 1992 12:43 | 1 |
| That's supposed to be "an arrow"
|
963.5 | | WHO301::BOWERS | Dave Bowers @WHO | Wed May 06 1992 13:45 | 1 |
| "Time races with a stopwatch."
|
963.6 | | JIT081::DIAMOND | bad wiring. That was probably it. Very bad. | Wed May 06 1992 19:10 | 2 |
| In particular, time races with a running stopwatch.
Now where was that note on oxymorons?
|
963.7 | I bet that no one has ever written this before! | VANINE::LOVELL | À l'eau; c'est l'heure | Thu May 07 1992 03:16 | 42 |
| How fascinating to read .0 - I have often daydreamed along this theme, although
my interest is more along the lines of the statistical (im)probability of
an otherwise perfectly correct sentence having never, ever been uttered or
written before. I'm not too interested in amusing little constructs uttered out
of context, rather enthralled by the fact that every day, one may be acting
as a sort of linguistic "pioneer", navigating the treacherous rips and
currents of previously unuttered verbiage.
I have tried to generalise this idea to the extent that one might with some
reasonable reliability, predict that the probability was 50% or better that a
syntactically correct sentence had never before been "committed" in English.
For example, being a betting person, I will give any reader of this conference
odds of 1000 to 1 that they cannot find a previously written example of the
paragraph above. This then gets interesting as you consider the number of
variables one might need to consider to run a book on this:
- Total number of words to choose from (Universal set)
- Mean and standard deviation of the length of a sentence
- Statistical frequency distribution of common words vs.
others from the Universal set.
- Cumulative number of sentences already committed
- Rate at which additional sentences are being committed
Given that there is a finite universal set and a (presumably) huge historical
commitment and a reasonably small mean sentence length, the problem reduces
to a fairly simple exercise in statistics.
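A rough sketch of how the variables listed above might combine, under deliberately crude assumptions that are not from the note itself: every sentence has the same fixed length, every word is equally likely, and syntax is ignored. The names `prob_never_committed`, `vocab_size`, `length` and `committed` are hypothetical, chosen only for illustration.

```python
import math


def prob_never_committed(vocab_size: int, length: int, committed: int) -> float:
    """Chance that one randomly chosen sentence is absent from history.

    Toy model (assumption): sentences are `length` words drawn uniformly
    from `vocab_size` words, and `committed` earlier sentences were drawn
    the same way. Word frequency and grammar are ignored.
    """
    if vocab_size < 2 or length < 1 or committed < 0:
        raise ValueError("need vocab_size >= 2, length >= 1, committed >= 0")
    # 1 / (number of possible sentences), computed via logs so that a
    # huge sentence space underflows to 0.0 instead of overflowing.
    inv_space = math.exp(-length * math.log(vocab_size))
    # (1 - 1/space) ** committed, evaluated stably with log1p.
    return math.exp(committed * math.log1p(-inv_space))


if __name__ == "__main__":
    # Illustrative numbers only: 50,000 words, 18-word sentences,
    # 10**15 sentences already committed.
    print(prob_never_committed(50_000, 18, 10**15))
```

Under this toy model the space of possible sentences dwarfs any plausible historical count, which is the intuition behind the 1000 to 1 bet; a realistic model would need the frequency distribution and the syntax constraints that this sketch leaves out.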
I have tried to work out some of the variables above to formalise this notion
but my maths is not up to it. It is further complicated by the fact that
differential calculus is required to cope with the hypothesis that the
stock of remaining uncommitted sentences must be diminishing over time,
and be some function value of a number of variables such as rate of normal
commitment, rate of increase of the Universal set, rate of expansion or
contraction of correct syntax, etc.
Since these are all discrete, the problem, whilst horrendously difficult,
appears to have a theoretical solution. However, it becomes impossible when
the continuum of delivery accent, intonation, emphasis, etc. is applied. I
haven't got my mind around that yet :-)
Confused of Newbury
|
963.8 | | SHALOT::ANDERSON | Brown for Messiah | Fri May 08 1992 06:27 | 7 |
| > written before. I'm not too interested in amusing little constructs uttered out
> of context, rather enthralled by the fact that every day, one may be acting
Well, I'm not too interested in your boring little theories,
( ;^) ), so here's another one:
Moo, moo, I'm a chalkboard.
|
963.9 | Well, if nobody else will do it... | MARVIN::KNOWLES | Caveat vendor | Fri May 08 1992 07:04 | 18 |
>Given that there is a finite universal set and a (presumably) huge historical
>commitment and a reasonably small mean sentence length, the problem reduces
>to a fairly simple exercise in statistics.
Given by whom? The universal set isn't finite (people might argue that
it is at any instant; but that's where notions like competence/
performance come in). Anyway, it certainly wouldn't be measurable - the
OED started to measure _just_English_ more than a century ago, and they
_still_ haven't got it right.
Also, whatever the mean sentence length (which couldn't be measured,
incidentally), the possibility of exceeding it (by a similarly
immeasurable number of words) would always exist. So any purely
statistical attempt to measure this aspect of natural language is
doomed to failure. Don't blame your maths.
b
|
963.10 | 18 | COQAU::LOVELL | À l'eau; c'est l'heure | Fri May 08 1992 16:52 | 22 |
| Seriously, I am sure that there is a fairly easily measurable mean
word count for an English sentence. I remember that it is
shorter in spoken English than in written (which has a value of
eighteen).
Some automated grammar and spelling checkers can perform these
counts for you - returning indicators like average sentence length.
For statistical purposes, any extremes in shorter or longer sentence
length would be indicated by the standard deviation.
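As an illustration of the kind of count such a checker might perform, here is a minimal sketch. It assumes a naive definition of a sentence (text ending in `.`, `!` or `?`) and of a word (whitespace-separated token); real grammar checkers are more careful, and the function names are hypothetical.

```python
import re
import statistics


def sentence_lengths(text: str) -> list[int]:
    """Word count of each sentence, splitting naively on ., ! and ?."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    return [len(s.split()) for s in sentences]


def length_summary(text: str) -> tuple[float, float]:
    """Mean and population standard deviation of sentence length.

    Raises statistics.StatisticsError if the text has no sentences.
    """
    lengths = sentence_lengths(text)
    return statistics.mean(lengths), statistics.pstdev(lengths)


if __name__ == "__main__":
    sample = "Time flies like an arrow. Fruit flies like a banana. Moo!"
    print(sentence_lengths(sample))
    print(length_summary(sample))
```

The standard deviation reported here describes only the sample fed in; it says nothing about how long a sentence could be.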
Regarding the finite set of legal words. It is irrelevant how many
words have fallen into disuse or how many new ones have come along,
the set will always be finite for statistical purposes. If you accept
the premise that a word can NEVER be longer than an arbitrary letter count
(say 40 to safely include increases over "antidisestablishmentarianism"),
then it is mathematically provable that the Universal Set is finite,
(albeit large and unknown), expressed by a polynomial of no higher order
than 10 to the power of 56 - therefore, I contend that this starting
condition is, as stated, "given".
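The counting argument can be checked directly. This sketch assumes a 26-letter alphabet with no hyphens, apostrophes or accented letters (an assumption, not something stated above) and counts every string of 1 to 40 letters, whether or not it is a real word. The function name `max_word_count` is hypothetical.

```python
def max_word_count(alphabet_size: int = 26, max_length: int = 40) -> int:
    """Number of distinct letter strings of length 1..max_length."""
    return sum(alphabet_size ** k for k in range(1, max_length + 1))


if __name__ == "__main__":
    total = max_word_count()
    # Python integers are arbitrary precision, so the count is exact.
    print(total)
    print(len(str(total)), "digits")
```

With these assumptions the total is finite and on the order of 10 to the power of 56, which fits the bound claimed here, provided the 40-letter cap is accepted.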
(woops, that one's a bit longer than 18 :-)
|
963.11 | | JIT081::DIAMOND | bad wiring. That was probably it. Very bad. | Sun May 10 1992 19:19 | 9 |
| I think there's no finite bound on the length of a word, because for
example I think that names of chemicals are regarded as English, and
although only a finite number of such words describe chemicals that
have actually been created, an infinite number of them describe
chemicals whose names and structures can be computed.
And now to return to my usual style, this sentence is so mean that
it cannot be measured, 'cause if you even come close with your
measuring tape, it'll sentence you to death.
|
963.12 | Boldly going | MARVIN::KNOWLES | Caveat vendor | Mon May 11 1992 07:09 | 38 |
| Aha. Exactly.
>I think there's no finite bound on the length of a word, because for
>example I think that names of chemicals are regarded as English, and
>although only a finite number of such words describe chemicals that
>have actually been created, an infinite number of them describe
>chemicals whose names and structures can be computed.
But for `chemicals' I'd have said `semantically-meaningful bits of
word'. Taking .10's example: `establishment' got the suffixes `-arian'
and `-ism' appended, and the prefixes `dis-' and `anti-'; in _that_
word, `establishment' has a single meaning (something to do with
Anglican church political/temporal power). But when I use the word
`establishment' to mean `setting up' then I'm appending the suffix
`-ment' to `establish'. And going back to the roots of the word
`establish', you find the inchoative suffix `-iscere'. And in that
suffix, there's the verb-ending `-ere.'
The point is that it is in the nature of natural language to stick
words and bits of word together like this. Look at the words of three
or more syllables in this note; in how many of them can you _not_
discern this sort of creativeness (not on _my_ part; worked into the
language)?
In any rigidly defined corpus of data, there must be a mean
word-length. But I don't see how that measurement can have any
significance in the real world.
Besides, why must words be `legal'? If I say "I said
`anautoinfractorisuperextimbunctiferousliness' for the first time ever
last night", everyone in this conference will understand the sentence.
(They may doubt that it's true, and they may doubt my sanity if it is,
but that's neither here nor there.)
Still, what I know about statistics could be written on the backs
of a random selection of postage stamps.
b
|
963.13 | | JIT081::DIAMOND | bad wiring. That was probably it. Very bad. | Mon May 11 1992 21:13 | 6 |
| >Still, what I know about statistics could be written on the backs
>of a random selection of postage stamps.
Well, that's probably roughly somewhere around half of all the postage
stamps that have ever been printed, and their backs provide an awful
lot of space for you to fill. NOW you're in trouble :-)
|