[Search for users] [Overall Top Noters] [List of all Conferences] [Download this site]

Conference rusure::math

Title:	Mathematics at DEC

Moderator:	RUSURE::EDP

Created:	Mon Feb 03 1986
Last Modified:	Fri Jun 06 1997
Last Successful Update:	Fri Jun 06 1997
Number of topics:	2083
Total number of notes:	14613

277.0. "shrtst str containing everythin" by SPRITE::OSMAN () Thu May 02 1985 16:32

Here's a problem about substring matching.  I'm looking for the shortest
possible string of letters of the alphabet such that a search for any substring
of 8 (arbitrary choice) or less letters can be found.

Here's one string of letters:

	AAABACADAE...

In this string, we can clearly find AA or AB or AC or AD etc.  We quickly
realize that we can let the AA and AB overlap to reveal a shorter string, such
that AA, AB, AC, AD etc. are still to be found, namely:

	AABACADAE...

How long a string is needed, so all substrings up to 8 letters long can
be found ?
-------------------------------
A little history:  I thought of this problem a few years ago while designing
an EMACS macro.  The macro had internal need for the "search" command.  However,
I didn't want the macro to leave the default search string altered when it
was through executing, lest the user's attempt to issue a "search next" command
look for the macro's internal string rather than the last string the user
searched for.

EMACS didn't (does it now ??) have a command for revealing the current default
search string.  If it did, I'd use it at the beginning of my macro in order
to save it, then restore it at the end.

So, I was thinking about how I might reveal the current default search
string in lieu of a builtin function to do it.  My theoretical solution was
to have a buffer called ALLCOMBOS which would contain an appropriate huge
string, constructed so that ALL possible search strings could be found !
That way, I could figure out the default search string by putting the cursor
at the beginning of that buffer, issuing "search next", marking the spot,
"search previous", after which point and mark would surround sought default
search string.

Of course, this was just theoretical, since such an ALLCOMBOS buffer would
probably be larger than the universe.  However, it does lead to the
aforehereto mentioned substring puzzle.

/Eric

T.R	Title	User	Date	Lines
277.1		GOLLY::GILBERT	`Fri May 03 1985 00:27`	4
	Oddly, there is some old work on this problem. It's similar to the problem of ringing 'cycles' (I think that's what they're called) on a carillion, or other sets of chimes. The Encyclopaedia Brittanica had a pretty interesting article about this.
277.2		METOO::YARBROUGH	`Wed May 08 1985 09:17`	8
	Here's another related problem: find a sequence of 1's and 0's with the following property: given a boolean function B, the leftmost ocurrence in the string of bits x and y is immediately followed by B(x,y). (I have used this to implement Boolean algebra in a SNOBOL program, for example.) Some interesting relationships fall out of this: for example, AND(x,y) can be represented by '11100010' and OR(x,y) is represented by its complement, '00011101'. The symetric function XOR(x,y) can be represented either by '1011000' or by its reversal, '0001101'. And so forth. - Lynn Yarbrough
277.3		R2ME2::GILBERT	`Wed May 08 1985 23:03`	18
	Problem: Given a boolean function B(x,y) construct a sequence of 1s and 0s such that the leftmost occurence of x,y in the sequence is immediately followed by B(x,y). Solution. We construct the sequence by appending digits to a partial solution. (1) Initially, set S (the sequence being constructed) to 00. (2) If the 2**2 triples x,y,B(x,y) occur in S, then we are done. Otherwise, let x,y be the last two digits of S. (3) If this is the leftmost occurence of x,y in S, then append B(x,y), and go to step (2). (4) If y,z does not appear in S (for some z), append z to S, and go to step (2). (5) Let x',y' be some pair that doesn't appear in S; append x',y' to S, and go to step (2). Note that a similar procedure can be used to generate a finite solution for any function B(x,y,...z) over a finite alphabet.
277.4		RANI::LEICHTERJ	`Mon May 13 1985 23:09`	10
	"A string over an alphabet of k characters is said to be an n-meander if it contains every possible substring of length n over the alphabet. For example, 0001111011001010000 is a 4-meander for the alphabet 01." This is quoted from the "Reference Manual for the Icon Programming Language", which proceeds to give an example Icon program that generates meanders. It referes to the following article: "Minimal Meandering Strings", by James F. Gimpel and William Keister. Technical report, Bell Labs, Holmdel, NJ July 1970. The minimal meandering string has length k^n+n-1. -- Jerry
277.5		R2ME2::STAN	`Fri May 24 1985 02:40`	20
	Re: .0 What you want is very closely related to De Bruijn Sequences. Given m+1 symbols (0,1,2,...,m), a De Bruijn sequence is the minimal sequence which when arranged in a circle, contains as subsequences of consecutive symbols, every sequence of length n of the symbols. [you don't want the sequence arranged in a circle] Anyhow, there is much literature on this and related problems. I suggest you start with Anthony Ralston, De Bruijn Sequences, Mathematics Magazine, 55(1982)131-143. See also S. W. Golomb, Shift Register Sequences, Holden Day, 1967, pp. 118-122 and see also Knuth, volume 2.