[Search for users] [Overall Top Noters] [List of all Conferences] [Download this site]

Conference rusure::math

Title:Mathematics at DEC
Moderator:RUSURE::EDP
Created:Mon Feb 03 1986
Last Modified:Fri Jun 06 1997
Last Successful Update:Fri Jun 06 1997
Number of topics:2083
Total number of notes:14613

870.0. "find smallest non-substring of larger string" by VIDEO::OSMAN (type video::user$7:[osman]eric.vt240) Wed May 04 1988 10:41

What's the most efficient way of finding the smallest string that is
*not* a substring of a larger one.

For example, consider:

	iowuehefwefyuwgfuwegfuwegfwuygftrpotiyrptoyurdfbvdbhfvbdfhxwqxqwa

We could scan for "a", then if not found, "a" is the answer.  But if
found, we could then scan all over again for "b", then "c".

Of course, we might find all the single letters, so then we'd try
"aa", then "ab" etc.

But there must be faster ways.

There are various possible applications of this:

o	For computer languages and networking, if a delimiter is needed
	to sandwich some data, one might want the smallest possible
	delimiter that isn't also contained within the data.

o	For implementing searching algorithms, one might want to put
	a fence at the end of the data.

o	I often say "search file/log garbage" in order to see how
	many lines are in the file, and I merely "hope" that "garbage"
	won't match any.

o	Sometimes we want to create a file name or process name on the
	computer that is guaranteed not to clash with any existing one.

/Eric
T.RTitleUserPersonal
Name
DateLines
870.1Just re-trie-ve the answer.PBSVAX::COOPERTopher CooperWed May 04 1988 12:259
    How about ... 
    
    Build a trie structure (a more sophisticated related structure called
    a DAWG might be cheaper for long strings -- it requires, if I remember,
    linear in the length of the string time and length to build).  Do
    a breadth first search of it -- the first null link found is the
    answer.
    
    					Topher
870.2CLT::GILBERTWed May 04 1988 20:021
    See note 514.