[Search for users] [Overall Top Noters] [List of all Conferences] [Download this site]

Conference rusure::math

Title:Mathematics at DEC
Moderator:RUSURE::EDP
Created:Mon Feb 03 1986
Last Modified:Fri Jun 06 1997
Last Successful Update:Fri Jun 06 1997
Number of topics:2083
Total number of notes:14613

2053.0. "optimal communication channel coding" by SNOFS1::JONESCHRIS (Chris Jones) Tue Jun 18 1996 20:34

I was speaking with someone who was talking about trying to establish the 
otptimal coding scheme for a communications channel that has some noise on it.  
The 3 of 4 coding scheme was mentioned as the optimal coding scheme for 
communications channels with data recovery (error detection and correction).  
Unfortunately I did not have the opurtunity to pursue the conversation further 
at the time, but what is this coding scheme (assuming I have the correct 
terminimology).

Additionally, there are other coding schemes that I have come across before in 
other lives, which were used for this type of function, as used in the tape 
industry.  I used to work in this area years ago, but have lost track of the 
information and am about the information theory side. - if any pointers could be 
given, much appreciated.
T.RTitleUserPersonal
Name
DateLines
2053.1Try looking for a book by Robert McElieceFLOYD::YODERMFYWed Jun 19 1996 09:5812
    I worked on coding theory at JPL a long time ago under Robert McEliece.
    I believe he has since written a book on the subject, and I can
    certainly vouch for his competence to write on the subject.
    
    My recollection is that Claude Shannon proved that there existed
    coding schemes which approached arbitrarily closely to a certain
    bound (now called the Shannon bound), but that the coding schemes
    actually known didn't come very close to this bound.  I'd therefore
    be very skeptical of optimality claims unless the criteria for
    "optimality" were chosen in ways that might not have practical
    significance.
    
2053.2Algebraic Coding TheoryEVMS::HALLYBFish have no concept of fireWed Jun 19 1996 12:2020
.0> The 3 of 4 coding scheme was mentioned as the optimal coding scheme for 
.0> communications channels with data recovery (error detection and correction).  
    
    This sounds to me like majority logic coding, meaning you "vote" on
    what the value of a bit is, and the majority rules. In this case you
    would be able to correct all single-bit errors and detect all double
    bit errors, but the overhead is horrendous: 4 bits of data for every
    1 bit of information. That doesn't sound optimal. (Though I may have
    the wrong idea of what is meant by the term "3 of 4 coding").
    
    Are you doing error detection/correction in hardware or software?
    
    What is the nature of the communications channel? Is it conceptually a
    distance transmission medium, like radio or phone cable? Then you'll
    probably want a different coding scheme from, say, ECC memory. What is
    your expected bit error rate? This impacts the kind of code you'll need.
    
    Richie Lary probably knows all about our disk ECC algorithms.
    
      John
2053.3the communication model is more abstract than you may thinkSNOFS1::JONESCHRISChris JonesWed Jun 19 1996 20:3912
The communication model that was being discussed, was that of DNA.  The 
other aspects of this were the same as a communication channel that was 
susceptible to noise, in that when the DNA duplicates, occasionaly the 
duplication (of some cells) is not correct, but the structure of the DNA 
(apparently with coding) is sufficient to detect and correct 'simple' 
errors within the coding stream.  Large gross errors it can't handle.

The 3 of 4 coding scheme that was mentioned, relative to this topic, was 
(and I may have got this screwed up) was for every three bits of 
information, there was a four bit protecting it somehow. 

DNA is a digital code stream, with ECC!
2053.4straightforward interpretations failFLOYD::YODERMFYThu Jun 20 1996 11:0916
    I don't think DNA's scheme is outside the abstractions considered
    in normal coding theory.
    
    There must be some details missing.  If simple detection of errors is
    enough, the 4th bit can be a parity bit; but this wouldn't be optimal,
    because you could use a single parity bit for an entire strand of DNA.
    
    If correction is wanted, a 4th bit can't suffice, because there are
    16 words, and the sets of code words + 4 nearest neighbors can't
    overlap, so there can be at most 16/5 = 3 code words, not enough
    for 2^3 = 8 inputs.
    
    However, if the "bits" are actually one of the 4 possibilities AGCT,
    then there are 13 code words per group, and... there's still too
    few code words to go around.  So is there a biologist in the house?
    
2053.5I'm not a biologist but...CSSE::NEILSENWally Neilsen-SteinhardtFri Jun 21 1996 13:1014
I did read a book once :-)

My impression is that the coding of DNA is fairly well understood.  The
characters are triplets of the four bases, making 4^3 = 64 possible characters. 
These characters code for STOP characters, NULL characters and about 20 amino
acids, with a lot of duplication.  As far as I know, there is no error detection
or correction at the level of DNA replication or decoding.

The major error detection and correction mechanism seems to be death, which
operates at a higher level.

There also appears to be some repair mechanism which operates at a cellular
level, but this is not well understood, last I heard.  It may use redundant
information stored in enzymes and/or RNA.