T.R | Title | User | Personal Name | Date | Lines |
---|
387.1 | | CAFEIN::PFAU | You can't get there from here | Tue Jan 20 1987 17:50 | 5 |
| There are various programs on PAR5 which implement Huffman
compression/decompression. Look for SQ, USQ or any file containing
these strings in MSDOS:, CPM:, CC:, TURBO:.
tom_p
|
387.2 | | 60592::MOSS | cogito cogito ergo cogito sum. | Wed Jan 21 1987 03:39 | 14 |
| It could come from Un*x pack. This is another very popular compression
program.
Unfortunately, there are many different compression programs around,
and many use different representations of the compressed data.
This can make recovery difficult if the program used to compress
the original is not known.
Huffman coding uses a representation based on the frequency of
occurrence of each symbol, and represents frequent symbols with a
minimal number of bits. The first thing in the compressed
file is a table enabling the decompression program to reconstruct
the data from the compressed info.
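For example (made-up frequencies): if 'E' occurs about half the time, 'T' a
quarter, and 'A' and 'Q' an eighth each, the coder might assign E=0, T=10,
A=110, Q=111. The common letters then cost 1 or 2 bits and only the rare
ones cost 3, for an average of .5*1 + .25*2 + .125*3 + .125*3 = 1.75 bits
per character instead of 8.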
|
387.3 | table is in the file | PLDVAX::ZARLENGA | Bigger they are, Harder they hit | Thu Jan 22 1987 08:43 | 18 |
| > I understand how the algorithm works (conceptually). In a nutshell,
> the algorithm uses a predefined table to compress the data. Therefore,
> each implementation of the Huffman algorithm uses its own table
> and therefore generates its own compressed data. The decompression
> of that data must have knowledge of the table used.
The table that is used is dumped into the compressed file, usually
right at the beginning. The way to de-compress the data is to read
that table (you must know its format) and then read the file looking
for bit patterns to selectively expand your data back to its original
state.
To do this you must know the way the table is represented (its
format), and the compression/de-compression algorithm. As mentioned
in a previous note, SQ (squeeze) and USQ (unsqueeze) make use of
Huffman compression; you can scan the sources for the algorithm
if necessary.
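As a rough illustration (this is only a sketch in C, with a node layout
I've invented - it is NOT SQ's actual table format), the expand loop is
just a walk down a binary tree, one input bit at a time:

    #include <stdio.h>

    /* hypothetical node layout: a leaf has lchild == -1 and holds the
     * decoded byte in sym; an interior node has two children */
    struct node {
        int lchild, rchild;
        int sym;
    };

    /* expand nbits of Huffman-coded data from 'in' to 'out' */
    void expand(struct node *tree, int root, FILE *in, FILE *out, long nbits)
    {
        int cur = root, byte = 0, mask = 0;
        long i;

        for (i = 0; i < nbits; i++) {
            if (mask == 0) {              /* refill the bit buffer */
                byte = getc(in);
                mask = 0x80;              /* assume high bit sent first */
            }
            cur = (byte & mask) ? tree[cur].rchild : tree[cur].lchild;
            mask >>= 1;
            if (tree[cur].lchild == -1) { /* reached a leaf - emit it */
                putc(tree[cur].sym, out);
                cur = root;               /* next code starts at the root */
            }
        }
    }

The hard part in your case is exactly what this sketch assumes away:
knowing how the table is stored in the file so the tree can be rebuilt
before the loop starts.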
-mike
|
387.4 | The table doesn't need to go with the data... | YALI::LASTOVICA | Norm Lastovica | Thu Jan 22 1987 17:38 | 4 |
| I don't think that the compression table (that's what it's called
isn't it??) HAS to be part of the compressed object. Sometimes
a program will put it there, but it is not a requirement of the
Huffman system.
|
387.5 | checking... | SMAUG::FLOWERS | I know what I know, I think. | Fri Jan 23 1987 12:46 | 19 |
| Thanks for the help. The PAR5 reference will come in handy in due
time.
I've just received a couple of traces of Huffman-compressed data. I'll
be trying to locate a common table in these files soon.
Thanks for the help. It was an 'eye-opener'.
A good Huffman compression table will generate on average 2.5
bits per character. It would seem to me that including the compression
table with each transmission of a compressed file would defeat its
own purpose - just my thoughts...
Let's hope you're right anyway (that the table is included), it'll
make this task much easier.
Thanks,
Dan
|
387.6 | | CLOVAX::COBURN | John Coburn, Cleveland | Fri Jan 23 1987 15:02 | 17 |
| < Note 387.5 by SMAUG::FLOWERS "I know what I know, I think." >
The Huffman implementations that I have seen (used on CPM and MSDOS)
always include the table in the compressed file because each file
generates a unique compression table. This means that compressing
using the Huffman technique can in some cases (an even distribution of
all 256 byte values) cause the 'compressed' file to be larger.
This is not often the case, especially in executables (lots of nulls,
etc.).
If you are interested in some FORTRAN code see VMSSWEEP.FOR located
at PAR5""::VAX_TOOLS:
Most of the other programs are written in C.
John
|
387.7 | crazy stuff kids | SMAUG::FLOWERS | I know what I know, I think. | Fri Jan 23 1987 17:24 | 15 |
| Well, I've determined that 'they' don't use the same compression
table each time. In fact, consecutive transmissions of the *same*
file produce different results. I attribute this to time-of-day
stamps that probably get used in the generation of each table.
Conclusions : the table must be included in the transmission and
it is different each time.
Any thoughts as to how I can locate the table in this mess? Does
it have a set format, or does the table also have a variable length
and size?
Thanks again,
Dan
|
387.8 | Hum, I'd think that the compression would be constant | YALI::LASTOVICA | Norm Lastovica | Fri Jan 23 1987 19:14 | 8 |
| Generating the compression table on a per-file basis is usually
done so that the file will be compressed as much as possible. That is,
the compression program ranks the characters from the most used through
the least used and builds a table based on this information. I'd
think that the same file being compressed would yield the same results.
The encoding is chosen (in my experience) to give the most compression.
If data encryption is wanted as well, then putting the table in
the file doesn't help that much.
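The first pass is nothing more than a byte-value histogram - a minimal
sketch in C (mine, not taken from any particular squeezer):

    #include <stdio.h>

    long freq[256];

    int main(void)
    {
        int c;

        while ((c = getchar()) != EOF)
            freq[c]++;                  /* tally each byte value */
        for (c = 0; c < 256; c++)
            if (freq[c] > 0)
                printf("%3d occurs %ld times\n", c, freq[c]);
        return 0;
    }

The counts are then fed to whatever builds the code table, so two runs
over the same input really ought to produce the same table.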
|
387.9 | | CAFEIN::PFAU | You can't get there from here | Fri Jan 23 1987 20:29 | 8 |
| Do you really need Huffman encoding? The Lempel-Ziv algorithms
usually do quite well (and quickly, too) and aren't that difficult to
understand. This algorithm works 'on the fly'. No need for
intermediate files, no need to pass the table to the decompressor.
Just read from your input stream, compress, and write to your output
stream. Look in PAR5::MSDOS:LZ*.ARC for a bit more information...
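To give a feel for why no table is needed, here is a minimal LZW encoder
sketch (my own simplification - codes are printed as decimal integers
rather than packed into 9..12 bit fields, and the string table is
searched linearly):

    #include <stdio.h>

    #define MAXCODES 4096

    int prefix[MAXCODES];       /* code of the prefix string */
    int suffix[MAXCODES];       /* last byte of the string */
    int ncodes = 256;           /* codes 0..255 stand for single bytes */

    int find(int pfx, int c)    /* is the string (pfx + c) in the table? */
    {
        int i;
        for (i = 256; i < ncodes; i++)
            if (prefix[i] == pfx && suffix[i] == c)
                return i;
        return -1;
    }

    int main(void)
    {
        int c, w, k;

        if ((w = getchar()) == EOF)
            return 0;
        while ((c = getchar()) != EOF) {
            if ((k = find(w, c)) >= 0) {
                w = k;                  /* extend the current string */
            } else {
                printf("%d\n", w);      /* emit code for current string */
                if (ncodes < MAXCODES) {
                    prefix[ncodes] = w; /* remember the new string */
                    suffix[ncodes] = c;
                    ncodes++;
                }
                w = c;                  /* start over with this byte */
            }
        }
        printf("%d\n", w);
        return 0;
    }

The decompressor rebuilds exactly the same table as it reads the codes,
which is why nothing has to be sent ahead of the data.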
tom_p
|
387.10 | unlocking mysteries | SMAUG::FLOWERS | I know what I know, I think. | Sun Jan 25 1987 13:29 | 18 |
| First : Yes I do have to use Huffman. I am receiving Huffman
compressed data from a remote source (see .0).
Second : I may have jumped the gun a bit saying that identical files
are compressed differently. I sent the same file four (4) times
through the remote source and at first glance the
compressed data doesn't even look close. However, with variable
length bit patterns, one bit off at the beginning (such as a different
time-of-day stamp, which I think does get sent) will make the whole
thing look different each time.
I really don't relish having to sit down and analyze the compressed
data in binary looking for patterns.
I'll keep this note updated....
Dan
|
387.11 | | CLT::GILBERT | eager like a child | Mon Jan 26 1987 00:08 | 3 |
| How many bytes get sent for a null file? For a 256-byte file
containing one (each) of the 256 characters? For a 256-byte
file containing the same character duplicated 256 times?
|
387.12 | some comments | 60592::MOSS | cogito cogito ergo cogito sum. | Mon Jan 26 1987 20:42 | 8 |
| 1) What are the first 3 or 4 bytes in the compressed file?
Most compression programs use a "magic number" to tell that the
file has been compressed. If we know this, we can probably tell
what the compression program was.
2) Is the data sensitive? If not, why not make it readable and
have a 'decompress this' contest.
|
387.13 | If I only had the time... | SMAUG::FLOWERS | I know what I know, I think. | Tue Jan 27 1987 09:23 | 10 |
| Unfortunately this task has taken a back seat to more important
(but related) issues. When I get back into the swing of it - maybe
about 3-4 days - I'll let you know what's happening.
In the meantime, any other thought/comments/suggestions are more
than welcome...
Thanks,
Dan
|
387.14 | tell us more | PLDVAX::ZARLENGA | Bigger they are, Harder they hit | Wed Jan 28 1987 17:24 | 14 |
| The table will be in the beginning. Most likely it will start
within 16 bytes. The format is anyone's guess. If you know the
source (assuming you're not stealing confidential information),
you should be able to acquire (in other words, not have to reverse
engineer) the knowledge to de-compress.
Please explain further what kind of data this is and what the
name of the compressor is.
The time-of-day stamp may indeed be part of the file header;
in that case, the table should start within 16 bytes of the end
of the file header.
-mike
|
387.15 | origin currently confidential | SMAUG::FLOWERS | I know what I know, I think. | Thu Jan 29 1987 10:57 | 13 |
| We don't have any access to the source (aside from illegal actions,
we can't get it either). All we have to work from is a hex dump
of the file (only part of which is compressed - but we know where
it starts).
We can get unlimited hex dumps of any file we wish to create and
push through the compressor (except a null file - it won't allow
that).
I plan on starting to look at it seriously starting today and/or
tomorrow.
|
387.16 | | ANGORA::ZARLENGA | Bigger they are, Harder they hit | Thu Jan 29 1987 14:30 | 13 |
| Can you disassemble the compressor program? Depending upon
your abilities, it may not be too difficult to read through the
assembler code to figure out how the program works. I did this
once and can tell you it will not be as hard as you expect as
long as you're familiar (make that very familiar) with the op
sys and assembler.
Another possibility is feeding the program known files
which should produce known results. Knowing, for example,
that the file contained all A's might make spotting and
deciphering the table easier.
-mike
|
387.17 | we're trying | SMAUG::FLOWERS | I know what I know, I think. | Thu Jan 29 1987 14:59 | 11 |
| Yes, the source could be disassembled. We had thought of this
originally. However, for two reasons it is not an option. 1) The
source is copyright protected and disassembling is illegal, and 2)
the person in charge of the remote system wouldn't let us near it
to try that.
We *are* sending known patterns, and are trying to decipher and
recognize patterns that way. But damn is that a long process.
Dan
|
387.18 | May be useful | BARNA::SOLEPONT | Jaume, "Barcelona 1992" more than ever | Thu Jan 29 1987 15:34 | 26 |
| Try feeding the program with a file containing this record:
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXYYYYYYYYYYYYYYYYZZZZZZZZ
(32*X, 16*Y, 8*Z, Cr, Lf)
I think this should compress to something like this:
[Header] [Tables] 00000000 AAAAAAAA DB6DB6 10
or
[Header] [Tables] FFFFFFFF 55555555 249249 EF
or similar repeating patterns, as long as
X translates to 1 bit
Y translates to 2 bits
Z translates to 3 bits
Cr and Lf to 4 bits each one
Repeat the operation for:
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAABBBBBBBBBBBBBBBBCCCCCCCC
and compare [Tables].
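(The standard build-the-tree arithmetic backs those lengths up. The
counts are X=32, Y=16, Z=8, Cr=1, Lf=1; merging the two smallest each
time gives

        Cr(1) + Lf(1)  ->  2
        2     + Z(8)   -> 10
        10    + Y(16)  -> 26
        26    + X(32)  -> 58

so X ends up 1 level below the root (1 bit), Y at 2, Z at 3 and Cr/Lf
at 4, for 32*1 + 16*2 + 8*3 + 2*4 = 96 bits = 12 bytes of coded data
after the header and tables.)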
Regards, *Jaume
|
387.19 | Is this the right problem? | PASTIS::MONAHAN | | Thu Jan 29 1987 16:55 | 29 |
| This sounds like a most amusing technical problem, but as a practical
problem something is wrong.
1) Huffman coding is public. That is what the VMS supplied compression
routines do. They do not violate anyone's copyright. It is legal
to read copyright code, just as it is legal to read a copyright
book.
2) Reverse engineering is legal in most countries. Why not crack
their code to find out what they are doing, and then put a minor
patch in the VMS routines. It can't be much of a change since they
are both Huffman algorithms.
3) If there is really something copyright then you may be wasting
your time anyway. Apple got a judgement (I heard) against someone
who made something that "looked like" a Mac, so your output had
better not look like the output from their decoder????
4) It has to be in *someone's* interests other than DEC to crack
this code. We have enough data of our own without worrying about
anyone else's. :-( Can they not do better than play games with
you by supplying a black box that you are expected to solve?
In summary, I am not well acquainted with U.S. law, but it sounds
to me as if someone is being deliberately awkward, and maybe you
need legal advice rather than technical.
Dave
|
387.20 | Let's have a crack at the code | YALI::LASTOVICA | Norm Lastovica | Thu Jan 29 1987 19:17 | 15 |
| It appears (from .0) that you've got a file that is encoded
(compressed) such that you can not read it. If the sender wishes
you to read it, they'd supply a decryption program/technique. A
Huffman encoding scheme will change with each encryption since the
count of bits per character is based on the frequency of occurrence
of the character in the file. Thus, the encryption code will change.
If the program that does the compression is not available to you
(and this sounds true) and if they (the suppliers of the program)
are unwilling to let you decode the data, then you are stuck.
However, in the true spirit of hacking, if you can encrypt a number
(dozen) of files of known input and then DUMP the resulting files, perhaps
a good decryptor (is there a word for this?) or two among us could
help. I do get excited about these challenges. How 'bout it?
|
387.21 | | PASTIS::MONAHAN | | Fri Jan 30 1987 03:40 | 8 |
| We already have perfectly good Huffman decompression routines.
Why not use those, and get the sender of data to match. He can use
our routines if he has a VAX, or patch his own routines to match
the minor details of the algorithm if he has not.
In any case, I doubt it is much more difficult to code the
compression than the decompression. Maybe you are trying to implement
the wrong end of the link.
|
387.22 | | SMAUG::FLOWERS | I know what I know, I think. | Fri Jan 30 1987 11:01 | 10 |
| We *are* seeking legal advice. Until I get the word from them, I
can't give any more details about the remote source than what I've
already given. Suffice it to say that the remote source would rather
we didn't figure out how it built its compression table(s). We wish
to communicate with them (not them with us) and therefore we have
to understand what/how they're saying.
Regards,
Dan
|
387.23 | Too much mystery | BARNA::SOLEPONT | Jaume, "Barcelona 1992" more than ever | Fri Jan 30 1987 12:12 | 7 |
| > I can't give anymore details about the remote source than what I've
> already given. We wish to communicate with them (not them with us)
> and therefore we have to understand what/how they're saying.
It seems like CLOSE ENCOUNTERS OF THE *FOURTH* KIND! ;-)
*Jaume
|
387.24 | | MKTUP1::EIBEN | | Fri Jan 30 1987 13:07 | 41 |
| Aside from the questionable 'mystery' - as long as You know enough
about the internals of the 'other' system - You might be able to
'reverse-engineer'...
Here are the 'mights':
HUFFMAN encoding IS NOT BOUND to ANY BYTE-size -- in fact I started
using it about 10 years ago on TOPS [and used 6, 7 and 8-bit bytes,
depending on data-structure: 6 for COBOL and executables, 7 for text
and 8 for micro-executables].
The translation-table is 'fixed-size' in that it has to hold all
occurring members of the SET in the compressed file -- so it can
be relatively short for a file comprised of only the same character.
Depending on the implementation, the original specs (FILENAME etc.) are
typically the first piece of the compressed file [plus the VERSION-NR of
the compressor] - that probably 'explains' why the same file has slightly
different contents at the beginning.
LZW techniques don't need to send 'tables' since COMPRESSOR and
DE-Compressor follow the same 'SET OF RULES' in building the
compression-table. Here again typically a Compression-Version
Nr is included - followed by a file-spec. There are currently
[aside from the more-or-less standard UNIX-COMPRESS] at least 3
more 'enhancements' to LZW in use.
Both Compression techniques have been mentioned as 'encryption'
techniques - i.e. IF YOU MISS a SINGLE BIT - You'll be even with
FULL KNOWLEDGE completely OFF SYNCH till You happen to see an
'out-of-band' RESET {LZW} or get another file {Huffman}. Since
one of the current LZW implementations doesn't use a straight
RESET but only an 'adaptive' RESET, it'll be even more 'cryptic' -
although it also compresses more.
Huffman {old technology - SLOW and CUMBERSOME} typically compresses
text by 40% and executables by 15%. LZW crunches text by 50% and
executables by 20% - and [best of all] crunches Pixel-info by about
90%.
Rgds,
Bernie.
|
387.25 | | PASTIS::MONAHAN | | Fri Jan 30 1987 16:26 | 45 |
| "the remote source would rather we didn't figure out how it
built it compression table(s)"
If they don't want you to figure this out, then they had better
supply the tables. If ever you work out how to decompress *without*
being given the tables you will be most of the way to knowing how
the tables were built. You will not be able to decompress without
at least a functional equivalent of their tables, and if you work
it out for yourself then you will be in a much better position to
understand how they were built than if you just took them and used
them.
Good compression looks just like a random bit stream if you
do not know the decompression algorithm (otherwise it could be
compressed more). Please admit that this is a bet that you can't
crack it within a month, and then we can forget the politics and
enjoy the technicalities.
Try taking a large file, and running it through the VMS compression
utility. You will probably reduce it by quite a bit. Then try reducing
the compressed file. That will probably make it a little smaller,
but after about 3 or 4 tries it will start getting larger. This
indicates that the compression algorithm can discover no regularities,
that is, it is essentially random bits to the limits of its analysis.
If I was your customer, and *really* wanted you not to discover
the compression algorithm, then I would take whatever you supplied
and return a random bit stream about half the length. That would
keep you guessing for months.
If the remote end is as unhelpful as you imply, then my guess
is that you are trying a "known plaintext" attack against some form
of encryption that you suspect might be Huffman encoding. If this
second guess is nearer than the ten dollar bet guess, then I would
suggest you get all of this note removed from such a widely available
forum.
Incidentally, the previous suggestions for patterns of characters
to try are good. Other things you should try are sending the same
file multiple times until you understand where the time or sequence
number affects things. Then try the same file but with one bit changed,
and see how far that bit change affects things. How sure are you that
the encoding is Huffman?
Dave
|
387.26 | 1-gram, 2-gram ,3 | TAV02::NITSAN | Duvdevani, DEC Israel | Thu Feb 12 1987 05:30 | 14 |
| Re. -1,-2:
The simple way to use Huffman code is to compress on a single character
distribution basis. For example: "E" is much more common than "Z", therefore
"E" will be represented by less bits than "Z". This is why after the
compression you get those "random" bits. HOWEVER, if this "simple" method
was chosen, NOT all the redundancy has been removed. There are PAIRS or
even larger sequences of letters that are more common (for example: "TH"
is much much more common than "CB" and "THE" is much more common than "QQQ").
That of course depends on the source language (maybe a Fortran program source
is compressed...). For very large compressed data, with knowledge of the
source language behavior, this may give some clue...
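A digram (letter-pair) tally over some sample text shows how much of
that redundancy is there to exploit; a quick sketch (mine, nothing to
do with the mystery compressor):

    #include <stdio.h>

    long pair[256][256];

    int main(void)
    {
        int prev, c;

        if ((prev = getchar()) == EOF)
            return 0;
        while ((c = getchar()) != EOF) {
            pair[prev][c]++;                    /* count each adjacent pair */
            prev = c;
        }
        for (prev = 'A'; prev <= 'Z'; prev++)   /* report the common pairs */
            for (c = 'A'; c <= 'Z'; c++)
                if (pair[prev][c] > 50)         /* arbitrary threshold */
                    printf("%c%c : %ld\n", prev, c, pair[prev][c]);
        return 0;
    }

Run it over English text and pairs like "TH" and "HE" float to the top -
that is exactly the redundancy a single-character Huffman code leaves
behind.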
Nitsan.
|
387.27 | A new kid hitting the streets ... on his head. | SMAUG::MENDEL | | Wed Feb 25 1987 16:32 | 81 |
|
The Huffman code mentioned in .0 has been 99% cracked.
(I am a co-worker of Mr. Flowers who has taken on the project.)
It was accomplished by sending a text containing all 256 characters,
each character separated by another, known, character. I.e. /A/B/C/...
Then the output of the 'compressor' was expanded into binary form
and analyzed. If this sounds tedious, it was. After an hour of
staring at 1's and 0's, they start moving on the paper...
The results were then verified against another known text. Bingo!
Some things I found out are:
1) There is no table in the beginning of the compressed text.
The compressor uses the same table every time, and expects the
receiver to have it also.
This leads me to believe that the intent was not so much
towards compression as towards encryption.
2) There are "dead ends" (more than one) in the tree.
Huffman codes are generally
represented by a binary tree, each node either being a leaf containing
a character or having exactly two branches. The farther down the tree,
the less common the character. The bit string of the coded
character is the path from the root of the tree to the leaf the
character is on: 0 = take left branch, 1 = take right branch.
Anyways, after decoding I found that there were nodes
with only one branch being used, the other a "dead end". A bit sequence
beginning with a path to the dead end is, thus, undecodable.
Could these dead ends not be dead ends at all, but be paths
to leaves containing data other than characters? For example,
multiple character sequences (as mentioned some reply earlier)
or control information?
Could they be there as sanity checks? If you hit one
unexpectedly while decoding, you know you have made a mistake and
don't have to finish the document. Normally, in Huffman decoding,
if you make a mistake, you don't know about it: you just decode
erroneously until the end. This seems plausible, too.
The tree-depth/bit-length of the dead ends was 5, up there in
frequency with the letters "E", "1", and "9".
3) This is the worst part. After the code for each byte of the
text, there is MORE compressed data. I cannot as yet decode
it. It begins with a bit sequence that (my luck!) begins with
a dead end sequence! Which means I cannot decode this trailer,
because I don't know if the dead end hides only a leaf, or a
large subtree, and thus I don't know the end of the bit sequence.
I've had all kinds of ideas, none of which seem to work.
Is it just a terminator? It's much too long for that - the equivalent
of about 8 coded characters. Further, the end is otherwise marked
in the encoded text by means of a field giving the length of
the coded data down to the byte. This trailing data is INSIDE
this length.
Is it a checksum of some kind? Byte sum? Bit sum? CRC? (God! No!)
By testing, I've found that the trailing bits can vary greatly
in content and in length. It doesn't seem to be consistent with
anything. Different texts sometimes end almost the same.
Similar texts sometimes end differently.
One thing I'm certain (hacker's intuition) it isn't is data
external to the text (as in a file name or something).
******
Well, I'm hard pressed to figure out just what the trailing data
at the end of the Huffman coded text is.
So I submit it to the rest of the hacking world ...
Write back!
Kevin Mendel
|
387.28 | is it meaningful data? | 37935::ZARLENGA | Bigger they are, Harder they hit | Thu Feb 26 1987 16:05 | 17 |
| Possibly it's random data thrown in at "encode" time and
thrown out at "decode" time?
Either uninitialized memory chunks or time-varying data
such as the time of day would qualify.
If you run the same file (with the EXACT same name) twice,
using the EXACT same command and you get different results
as output, then the "extra" data is time-varying. That means
the kind of info I listed above.
I once did this exact same thing a few years ago. At encode
time I added a sequence to the beginning of each record that told
me the file had been encoded (so the next time it would be decoded),
length info, and selected bits of the time of day. This program
was limited in use since we could not encode .EXEs because the
original might contain the "encode sequence" and the file would
then be decoded instead of encoded.
-mike z
|
387.29 | | MKTUP1::EIBEN | | Tue Mar 03 1987 12:23 | 25 |
| [Only for the record - since I stated earlier already that neither
HUFFMAN without 'dictionary' nor LZW [that one even WITH getting
everything - but not knowing the 'startup-state'] are 'decodable'
in the sense that You would be able to 'port' rules You found
out before to another 'unknown' file.]
Early implementations of HUFFMAN encoding [be aware HUFFMAN only
described the 're-ordering' of byte-encoding - he never gave an
implementation example - BTW - same as LEMPEL-ZIV-WELCH] were
working with "unbalanced trees" [just the stuff You describe] -
"Leaf balancing" was added in/about 1983.
Pre-sending the 'dictionary' is only necessary if one expects byte-
accumulations with heftily changing composition [i.e. software
sources and binaries - which vary from language to language] - it's
quite effective to use an only slightly changing dictionary [by
time and/or by file-type] - same as LZW, which assumes an 'empty'
translation table - but could in case of 'text-only' start with
the alphabet ordered in decreasing usage pattern - or anything else
for that matter.
Good luck in the learning experience
Rgds,
Bernie.
|
387.30 | | SMAUG::MENDEL | | Fri Mar 06 1987 09:42 | 22 |
| I tried very hard not to say "balancing" in .-3. If the Huffman
code is ideal for the data, the more unbalanced the tree is the
more the data will be compressed.
I meant something else.
According to textbook stuff, the way a tree is created is to do
this: Start with a list of characters and their frequencies. Take
the two least common and combine them into one node, as left and
right sons. The parent node is given the frequency of the sum of
the sons. Add the parent to the list, remove the two sons from the
list. Then find the two elements that are least common .... ad
infinitum. In the end, there will be one left, and you will have
your tree.
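Just to pin the textbook process down, here is a minimal sketch of that
construction in C (array layout and names are mine, not anyone's file
format):

    #define NSYM  256
    #define NNODE (2 * NSYM - 1)

    long weight[NNODE];              /* node frequencies */
    int  left[NNODE], right[NNODE];  /* children; -1 means leaf */
    int  alive[NNODE];               /* 1 while still on the "list" */

    /* index of the least-weight live node, excluding 'skip' (-1 = none) */
    static int least(int nnodes, int skip)
    {
        int i, best = -1;
        for (i = 0; i < nnodes; i++)
            if (alive[i] && i != skip &&
                (best < 0 || weight[i] < weight[best]))
                best = i;
        return best;
    }

    /* build the tree from per-symbol counts; returns the root's index */
    int build(long freq[NSYM])
    {
        int i, n, a, b;

        for (i = 0; i < NSYM; i++) {
            weight[i] = freq[i];
            left[i] = right[i] = -1;
            alive[i] = (freq[i] > 0);
        }
        for (n = NSYM; ; n++) {
            a = least(n, -1);            /* two least common live nodes */
            b = least(n, a);
            if (b < 0)
                return a;                /* one node left: the root */
            weight[n] = weight[a] + weight[b];
            left[n]  = a;                /* combine them under a parent */
            right[n] = b;
            alive[n] = 1;
            alive[a] = alive[b] = 0;     /* remove the sons from the list */
        }
    }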
The point is, by this process, every non-leaf node has exactly two
sons. So how do nodes with one son get involved? was my question.
Do they have a use? Or, because I built the tree from
reverse-engineering the code of a known text, have I just not found
out what the other son contains?
Clearer?
|
387.31 | | VIDEO::LEICHTERJ | Jerry Leichter | Thu Mar 12 1987 23:10 | 72 |
| Several previous notes contain (at least) two errors in fact about Huffman
encoding that need correction:
1. "Huffman compression needs either a fixed table, or a table
included in the text sent". This is true of the original
Huffman algorithm, but one can do better with adaptive
Huffman compression. In this technique, the two ends start
off with the same table, but collect statistics about
the characters that actually occur. As more is learned
about the file, the table is changed.
If you think about an implementation in terms of trees, it's
easy to see how this works. Start off with a balanced full
binary tree (equivalent to assuming that all characters are
equally likely). In each node, store a count; initially all
are the same. When a character has been sent/received, update
its count. Make sure the tree is still a "proper Huffman
tree". If not, re-order as necessary before processing
the next character.
The original Huffman algorithm is provably optimal among
all possible compression algorithms, subject to appropriate
constraints on how "large" the units of redundancy are.
Adaptive Huffman compression is NOT optimal, but converges
to optimal given a "regular enough" input.
"Optimal" here counts only the number of bits sent. In the non-
adaptive case, the table is NOT counted - it's assumed that
the frequency distribution is known at both ends. If your
measure of "optimal" includes some notion of efficient implemen-
tability on typical hardware, the variable-length codes of
Huffman compression cost you; LZW, which uses fixed-length
codes, but deals with a varying-width compression window, does
better.
2. It is NOT true that losing one bit from a Huffman-compressed message
screws up the rest. In fact, such a message has an interesting
redundancy property: If you simply continue "reading", you
are highly likely to fall back into proper sync after a couple
of codes. To see why, note that if you EVER, in reading, find
yourself "back in sync", you will STAY in sync, since the
decoding process has no memory. (The reason you can get away
with this is that no code can be a leading substring of another
code - think of the tree - only leaves are code words.)
Now suppose the distribution is "uneven", so that the Huffman
compression really helps. Then there is at least one "small"
code. It is small exactly because the symbol it codes for
occurs often; so the code occurs often. Every time the reader
reaches this small code, it is offset from the beginning of
the code by at most the number of bits in the code. So, if
the code is k bits long, there is a 1/k chance of being
aligned.
Note that this argument works just as well for errors that
lose a bit, insert a bit, or simply flip a bit. The more
skewed the original symbol distribution, the faster convergence
is likely to occur. In the worst case, with equal probabilities
and 2^k symbols, all the codes will be k bits long.
A single bit flipped will screw up only that code, but the
reader will not be able to recover from the insertion or loss
of a bit.
Of course, with the adaptive Huffman algorithm - or, in general,
ANY adaptive algorithm - screwing up a symbol may cause
the sender and receiver to update their tables differently,
thus leading to yet more errors. For adaptive Huffman as
described above, however, it may be that the tables will
eventually re-converge, assuming well-enough behaved message
distributions.
-- Jerry
|