A question about understanding Huffman coding

Pollywog1983 2004-08-18 09:08:04

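I am reading an introductory document on Huffman coding, and there is a passage I don't quite understand (the second and third paragraphs):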
Huffman code decoding is performed using a multi-level table lookup.
The fastest way to decode is to simply build a lookup table whose
size is determined by the longest code. However, the time it takes
to build this table can also be a factor if the data being decoded
is not very long. The most common codes are necessarily the
shortest codes, so those codes dominate the decoding time, and hence
the speed. The idea is you can have a shorter table that decodes the
shorter, more probable codes, and then point to subsidiary tables for
the longer codes. The time it costs to decode the longer codes is
then traded against the time it takes to make longer tables.

The results of this trade are in the variables lbits and dbits
below. lbits is the number of bits the first level table for literal/
length codes can decode in one step, and dbits is the same thing for
the distance codes. Subsequent tables are also less than or equal to
those sizes. These values may be adjusted either when all of the
codes are shorter than that, in which case the longest code length in
bits is used, or when the shortest code is *longer* than the requested
table size, in which case the length of the shortest code in bits is
used.

There are two different values for the two tables, since they code a
different number of possibilities each. The literal/length table
codes 286 possible values, or in a flat code, a little over eight
bits. The distance table codes 30 possible values, or a little less
than five bits, flat. The optimum values for speed end up being
about one bit more than those, so lbits is 8+1 and dbits is 5+1.
The optimum values may differ though from machine to machine, and
possibly even between compilers. Your mileage may vary.
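
As a quick check of the arithmetic in the last quoted paragraph: the "flat" code size is simply the base-2 logarithm of the number of symbols. A minimal Python sketch, where the helper name `flat_bits` is made up for illustration and does not come from the quoted source:

```python
import math


def flat_bits(symbol_count: int) -> float:
    """Bits per symbol if every symbol got a code of the same length."""
    return math.log2(symbol_count)


literal_length_bits = flat_bits(286)  # a little over eight bits
distance_bits = flat_bits(30)         # a little under five bits
```

Rounding these down and up respectively gives the 8 and 5 that the text then adds one bit to.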

Pollywog1983 2004-08-19

In the code of the function that builds the Huffman tree, there is also this comment:
Return zero on success, one if the given code set is incomplete (the tables are still built in this case), two if the input is invalid (all zero length codes or an oversubscribed set of lengths), and three if not enough memory.
It first says that zero is returned on success, but the three cases listed after that are all error states. Why is that?
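
One way to read the quoted comment is that zero is the normal outcome and the three nonzero values name the different ways the call can fail, in the usual C style of "0 means OK, anything else is an error number". A minimal Python sketch of how the first three outcomes could be told apart from the code lengths alone. `check_code_lengths` and its constants are hypothetical names, not the routine from the document, and the out-of-memory case (three) is left out because it cannot arise here:

```python
OK, INCOMPLETE, INVALID = 0, 1, 2  # values named in the quoted comment


def check_code_lengths(lengths):
    """Classify a list of code lengths (0 means the symbol is unused)."""
    used = [n for n in lengths if n > 0]
    if not used:
        return INVALID  # all zero-length codes
    max_len = max(used)
    # A code of length n occupies 2**(max_len - n) of the 2**max_len
    # leaves available at the deepest level of the code tree.
    slots_used = sum(1 << (max_len - n) for n in used)
    slots_total = 1 << max_len
    if slots_used > slots_total:
        return INVALID  # oversubscribed: more codes than the tree can hold
    if slots_used < slots_total:
        return INCOMPLETE  # some bit patterns do not decode to any symbol
    return OK
```

For example, lengths `[1, 2, 2]` fill the tree exactly, `[2, 2, 2]` leave one pattern unused, and `[1, 1, 1]` ask for three one-bit codes where only two exist.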

Pollywog1983 2004-08-19

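It is mainly the second paragraph that I don't understand, especially: "lbits is the number of bits the first level table for literal/length codes can decode in one step, and dbits is the same thing for the distance codes." What does "the first level table" refer to here? Any pointers would be appreciated!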

jp1984 2004-08-19

It is pretty easy to turn this theory into code, but the mathematical proof is the hard part. As a beginner, forget all the mathematical symbols and just code it.

programfanny 2004-08-19

Bumping this thread while waiting for an answer.

Pollywog1983 2004-08-19

If "the first level table" refers to the "shorter table", then what does the following passage mean:
These values may be adjusted either when all of the codes are shorter than that, in which case the longest code length in bits is used, or when the shortest code is *longer* than the requested table size, in which case the length of the shortest code in bits is used.
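
Read literally, the adjustment in that passage amounts to clamping the requested first-level table size between the shortest and the longest code length actually in use. A minimal Python sketch, where `adjust_table_bits` is a hypothetical helper name and the input is assumed to contain at least one nonzero length:

```python
def adjust_table_bits(requested_bits, lengths):
    """Clamp the requested first-level table size to the code lengths in use."""
    used = [n for n in lengths if n > 0]  # 0 means the symbol is unused
    shortest, longest = min(used), max(used)
    bits = min(requested_bits, longest)  # all codes shorter: use the longest length
    bits = max(bits, shortest)           # shortest code longer: use the shortest length
    return bits
```

So a request for 9 bits with a longest code of 4 bits yields a 4-bit table, and a request for 3 bits with a shortest code of 5 bits yields a 5-bit table.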

zzwu 2004-08-19

The very first sentence of the first paragraph says that a multi-level lookup table is used:

"Huffman code decoding is performed using a multi-level table lookup."

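It then spends several sentences on the drawbacks of a single table, which rules out "simply build a lookup table",

and goes on to propose two lookup tables: one for the shorter codes and another for the longer codes.

Although it never explicitly says that the table for the shorter codes is the "first level table" and the one for the longer codes is the "second level table", there are no other tables in the context.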
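
The two-table idea can be sketched in a few lines of Python. This is only an illustration under several assumptions, not the code the quoted comments belong to: codes are given as strings of '0'/'1' read left to right, the code set is complete and prefix-free, there are exactly two levels (a sub-table is made as large as its longest code needs), and the names `build_tables` and `decode` are made up.

```python
def build_tables(codes, root_bits):
    """Build a first-level table plus sub-tables from {bit_string: symbol}."""
    first = [None] * (1 << root_bits)
    long_codes = {}  # first root_bits of a long code -> [(remaining bits, symbol)]
    for code, symbol in codes.items():
        if len(code) <= root_bits:
            # A short code owns every slot whose leading bits equal the code.
            spare = root_bits - len(code)
            base = int(code, 2) << spare
            for i in range(1 << spare):
                first[base + i] = ("leaf", symbol, len(code))
        else:
            rest = code[root_bits:]
            long_codes.setdefault(code[:root_bits], []).append((rest, symbol))
    for prefix, entries in long_codes.items():
        sub_bits = max(len(rest) for rest, _ in entries)
        sub = [None] * (1 << sub_bits)
        for rest, symbol in entries:
            spare = sub_bits - len(rest)
            base = int(rest, 2) << spare
            for i in range(1 << spare):
                sub[base + i] = (symbol, root_bits + len(rest))
        # The first-level slot points at the second-level table.
        first[int(prefix, 2)] = ("sub", sub, sub_bits)
    return first


def decode(bits, first, root_bits):
    """Decode a string of '0'/'1' characters using the two-level tables."""
    out, pos = [], 0
    while pos < len(bits):
        # Peek at root_bits bits, padding with zeros at the end of input.
        window = bits[pos:pos + root_bits].ljust(root_bits, "0")
        kind, a, b = first[int(window, 2)]
        if kind == "leaf":
            symbol, length = a, b  # short code: one lookup is enough
        else:
            sub, sub_bits = a, b   # long code: second lookup in the sub-table
            start = pos + root_bits
            window = bits[start:start + sub_bits].ljust(sub_bits, "0")
            symbol, length = sub[int(window, 2)]
        out.append(symbol)
        pos += length  # consume only the bits of the matched code
    return out
```

With the codes `{"0": "a", "10": "b", "110": "c", "1110": "d", "1111": "e"}` and `root_bits = 2`, the symbols `a` and `b` are resolved by the 4-entry first-level table alone, while `c`, `d` and `e` share the prefix `11` and go through one 4-entry sub-table. Here `root_bits` plays the role that lbits and dbits play in the quoted text.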

Pollywog1983 2004-08-19