[Search for users]
[Overall Top Noters]
[List of all Conferences]
[Download this site]
Title: | Mathematics at DEC |
|
Moderator: | RUSURE::EDP |
|
Created: | Mon Feb 03 1986 |
Last Modified: | Fri Jun 06 1997 |
Last Successful Update: | Fri Jun 06 1997 |
Number of topics: | 2083 |
Total number of notes: | 14613 |
571.0. "Probability analysis of nearest neighbor alg." by EVE::R_LI () Fri Aug 22 1986 15:22
Hello!
I am new to Math Notes, and I have a math problem. Any
solutions or help would be appreciated.
Briefly, I am trying to do a probabilistic analysis of the nearest
neighbor algorithm which is an algorithm that can store and recall
stored strings. This algorithm can do 'error correction' in that
it has the ability to return the original stored string given a
partial or incorrect string. The recall process is very simple:
you merely take the recall string and compare it against every stored
string and return the string that is closest to the recall string,
using a measure termed the HAMMING DISTANCE.
The hamming distance is merely the number of bits between any two
strings that are different. Examples, the following two strings
have a hamming distance of 2 (2nd and 4th bits different)
1100
1001
In this example, if we had a recall string of 1101 (which is the
first stored string with 1-bit error in the fourth position), the
nearest neighbor algorithm would correctly return the first string.
However, if we tried to recall using 1000, the algorithm would not
be able to tell whether it was originally the first string with
an error in the second bit, or the second stored string with an
error in the fourth bit (This happens whenever the hamming distance
between the original stored string and the recall string equals
the hamming distance between the recall string and any other stored
string). And of course if we tried to correct more bits of error
(e.g. 1001 from 1100), it is clear the algorithm could fail.
*** Now, the question is... What is the probability of correct recall
given 'b'-bit totally random strings, 'n' stored strings and 'x'
bits of error in the recall string?
Some general info to get you started:
* Totally random means that it is equally likely for any
particular bit to be a '1' as it is to be a '0'.
* If 'b' = number of bits in a string, then there are 2**b unique
strings that can be stored.
* Correct recall occurs whenever the hamming distance between
the recall string and the original stored string is less than the
hamming distance between the recall key and all other stored strings.
In other words, let
X = recall key with 'x'-bits error
N_x = original stored string we want to recall
N_s = all other stored strings
then,
HD(X,N_x) < HD(X,N_s) for all 's'
or
x < HD(X,N_s) for all 's'.
* When only 1 string is stored, correct recall is guaranteed
regardless of the string length, 'b' or the number of bits in error,
'x'. Obviously, the probability decreases as more strings are stored.
----------------------------------------------------------------------
I started doing analysis of this problem, but have discovered it
is much more complicated than I had thought. A solution would be
fantastic! Hints or references to a solution would also be great!
Please write if you have questions. Thanks for all your time!
Ruby Li
EVE::R_LI
3-6606
T.R | Title | User | Personal Name | Date | Lines |
---|
571.1 | | LATOUR::JMUNZER | | Fri Aug 22 1986 17:55 | 15 |
| Ruby:
> 1100
> 1001
>
> In this example, if we had a recall string of 1101 (which is the
> first stored string with 1-bit error in the fourth position), the
> nearest neighbor algorithm would correctly return the first string.
If HD (1101, 1100) = 1 and HD (1101, 1001) = 1,
why does the algorithm return 1100?
Isn't this a tie (i.e. a failure)?
John
|
571.2 | | CLT::GILBERT | eager like a child | Fri Aug 22 1986 18:04 | 5 |
| Are you looking for the probability that a particular string will be
correctly recalled, or that any/all strings will be correctly recalled
when they have 'x' bits of error? Also, is the error pattern (the 'x'
bits) chosen randomly, or is the question whether *any* x-bit error
pattern will cause an incorrect recall?
|
571.3 | Problem clarification (RE: 571.1 and .2) | EVE::R_LI | | Mon Aug 25 1986 09:43 | 31 |
| RE: Note 571.1
Yes, you are correct, the example I gave for recalling string
number 1 is actually a tie (which equals failure). Here is a correct
example (sorry about that...)
String 1 = 1100
String 2 = 1001
Recall = 0100
HD(1100, 0100) = 1
HD(1001, 0100) = 3
Therefore, algorithm returns string #1 (which is what we want).
-----------------------------------------------------------------------
RE: Note 571.2
I would like to find the probability that *any* string will
be correctly recalled with x-bits of error. Also, the bit errors
are chosen randomly, though no bit may be flipped twice (this merely
cancels the error causing no change). I.e. if we pick a
random stored string and put x-bits of error in it, what is the
probability we can correctly recall it with 'n' b-bit length strings
stored. It would also be useful to know the probability that *all*
stored strings will be correctly recalled with any combination of
'x'-bits of error.
Hope this clarifies the problem a little bit... good luck (and thanks!)
|
571.4 | A try | TAV02::NITSAN | Nitsan Duvdevani, Digital Israel | Thu Aug 28 1986 09:34 | 32 |
| b
There are ( ) different strings with a distance of "d" from the "recall string".
d
2^b
There are ( ) different options to store n strings of b bits.
n
The number of options in which ONE of the stored strings has a distance of "d"
from the "recall string" and the rest have distance >d is:
b S d b b b
N(d) = ( ) * ( ) , for S = 2^b - SUM ( ) = SUM ( )
d n-1 i=1 i i=d+1 i
so it looks that the number of options in which there is ONE stored string
with minimal distance (from the recall string) is:
b
M = SUM N(d)
d=0
2^b
and the probability is M / ( )
n
All this is if you know nothing about "x" (the number of error bits). For
example, if you know x=0 (no errors in the recall string), then ofcourse
the probability to choose the right stored string is 1.
or maybe I made a mistake (it's late in the day...)
Nitsan
|