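GIF LZW compression
I am reading this page, http://giflib.sourceforge.net/whatsinagif/lzw_image_data.html, to understand GIF's LZW compression. It shows the following codes from encoding its sample image: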
#4 #1 #6 #6 #2 #9 #9 ..
After variable-length packing into bytes, this becomes:
8C 2D 99 ..
Which means:
#4 - 3 bits
#1 - 3 bits
#6 - 3 bits
#6 - 3 bits
#2 - 4 bits
#9 - 4 bits
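To double-check that breakdown, here is a small Python sketch that unpacks the three bytes with those widths. It assumes GIF's usual packing, where codes fill each byte starting from the least significant bit. The function name `unpack_lsb_first` is my own, and the seventh width (4 bits for the second #9) is my reading of the byte 99, not something listed above.

```python
def unpack_lsb_first(data, widths):
    """Read one code per entry in widths, least significant bit first."""
    acc = 0    # bit accumulator; unread bits sit at the low end
    nbits = 0  # number of valid bits currently in acc
    pos = 0    # index of the next unread byte
    codes = []
    for width in widths:
        # Pull in whole bytes until there are enough bits for this code.
        while nbits < width:
            acc |= data[pos] << nbits
            pos += 1
            nbits += 8
        codes.append(acc & ((1 << width) - 1))
        acc >>= width
        nbits -= width
    return codes


codes = unpack_lsb_first(bytes([0x8C, 0x2D, 0x99]), [3, 3, 3, 3, 4, 4, 4])
print(codes)  # [4, 1, 6, 6, 2, 9, 9]
```

With widths 3, 3, 3, 3, 4, 4, 4 the 24 bits are consumed exactly, which is consistent with the byte values I verified.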
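This compressed image data is correct: I generated the sample GIF image with Photoshop and verified its binary content.
It clearly shows that the bit size increase takes effect at output code #2.
However, this is what the page says about the bit size increase: When you are encoding the data, you increase your code size as soon as your write out the code equal to 2^(current code size)-1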
Jumping back to our sample image, we see that we have a minimum code size value of 2 which means out first code size will be 3 bits long. Out first three codes, #1 #6 and #6, would be coded as 001 110 and 110. If you see at Step 6 of the encoding, we added a code of #7 to our code table. This is our clue to increase our code size because 7 is equal to 2^3-1 (where 3 is our current code size). Thus, the next code we write out, #2, will use the new code size of 4 and therefore look like 0010.
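But in its encoding table, Step 6 is where entry #7 is added to the LZW dictionary, while the code written to the output at that step is the first #6. According to the algorithm as described, both #6 codes should be 4 bits each, so why are they actually 3 bits?
This page, https://www.eecis.udel.edu/~amer/CISC651/lzw.and.gif.explained.html, says the same thing about the bit size: If you're encoding, you start with a compression size of (N+1) bits, and, whenever you output the code (2**(compression size)-1), you bump the compression size up one bit
What is wrong here?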
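To pin down where the width changes, here is a minimal Python sketch of an encoder that reproduces the observed bytes. It is not taken from either page or from giflib. The names `lzw_encode` and `pack_lsb_first` are my own, the input is the ten pixels 1 1 1 1 1 2 2 2 2 2 that the codes #1 #6 #6 #2 #9 #9 decode to, and the rule it assumes is "bump the width right after adding a dictionary entry whose number equals 2^(current code size)", which is not the rule the pages state. Table-full handling (emitting a clear code once 4096 entries exist) is left out.

```python
def lzw_encode(pixels, min_code_size):
    """Return a list of (code, width_in_bits) pairs for a GIF-style LZW stream."""
    clear = 1 << min_code_size
    eoi = clear + 1
    table = {(i,): i for i in range(clear)}
    next_code = eoi + 1
    code_size = min_code_size + 1
    out = [(clear, code_size)]
    prefix = ()
    for p in pixels:
        candidate = prefix + (p,)
        if candidate in table:
            prefix = candidate
            continue
        out.append((table[prefix], code_size))
        if next_code < 4096:
            table[candidate] = next_code
            # Assumed rule: widen only when the entry just added no longer
            # fits in code_size bits (capped at 12 bits).
            if next_code == (1 << code_size) and code_size < 12:
                code_size += 1
            next_code += 1
        prefix = (p,)
    if prefix:
        out.append((table[prefix], code_size))
    out.append((eoi, code_size))
    return out


def pack_lsb_first(codes):
    """Pack (code, width) pairs into bytes, least significant bit first."""
    acc = 0
    nbits = 0
    out = bytearray()
    for code, width in codes:
        acc |= code << nbits
        nbits += width
        while nbits >= 8:
            out.append(acc & 0xFF)
            acc >>= 8
            nbits -= 8
    if nbits:
        out.append(acc & 0xFF)  # final partial byte, padded with zero bits
    return bytes(out)


trace = lzw_encode([1, 1, 1, 1, 1, 2, 2, 2, 2, 2], 2)
packed = pack_lsb_first(trace)
print(trace)
# [(4, 3), (1, 3), (6, 3), (6, 3), (2, 4), (9, 4), (9, 4), (5, 4)]
print(packed[:3].hex())  # 8c2d99
```

Under that assumed rule, adding #7 (while the first #6 is written) changes nothing. The width goes to 4 only once #8 is added, which happens as the second #6 is written, so #2 is the first 4-bit code. Only the first three bytes are comparable with the real file, because the real stream continues with more pixels.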
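It is hard to follow the example without seeing the actual code table being built, but clearly the code width gets increased as soon as there is a chance that the next code output would overflow the current width, and no sooner. So if the width does not increase at the point the description says it should, yet decompression still works, it must be because the algorithm has logic to detect that the newly added code (the one that would overflow the current width) cannot actually appear in the output stream quite yet.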
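The decoder's side of this can be sketched in Python as well. This is my own minimal version, not code from either page. The name `lzw_decode` is hypothetical, and it assumes the decoder widens its reads when its next free code number reaches 2^(current code size). It is fed only the three bytes 8C 2D 99, so it stops when the bits run out.

```python
def lzw_decode(data, min_code_size):
    """Return (trace, pixels); trace holds (code, width_in_bits) as read."""
    clear = 1 << min_code_size
    eoi = clear + 1
    table = {i: (i,) for i in range(clear)}
    code_size = min_code_size + 1
    next_code = eoi + 1
    prev = None
    acc = nbits = pos = 0
    trace, pixels = [], []
    while True:
        # Refill the accumulator, least significant bit first.
        while nbits < code_size and pos < len(data):
            acc |= data[pos] << nbits
            pos += 1
            nbits += 8
        if nbits < code_size:
            break  # ran out of input
        code = acc & ((1 << code_size) - 1)
        acc >>= code_size
        nbits -= code_size
        trace.append((code, code_size))
        if code == clear:
            table = {i: (i,) for i in range(clear)}
            code_size = min_code_size + 1
            next_code = eoi + 1
            prev = None
            continue
        if code == eoi:
            break
        if prev is None:
            entry = table[code]
        else:
            if code in table:
                entry = table[code]
            elif code == next_code:
                entry = prev + (prev[0],)  # code not in the table yet
            else:
                raise ValueError("invalid LZW code")
            if next_code < 4096:
                table[next_code] = prev + (entry[0],)
                next_code += 1
                # Assumed rule: widen once the next free code needs more bits.
                if next_code == (1 << code_size) and code_size < 12:
                    code_size += 1
        pixels.extend(entry)
        prev = entry
    return trace, pixels


trace, pixels = lzw_decode(bytes([0x8C, 0x2D, 0x99]), 2)
print(trace)   # [(4, 3), (1, 3), (6, 3), (6, 3), (2, 4), (9, 4), (9, 4)]
print(pixels)  # [1, 1, 1, 1, 1, 2, 2, 2, 2, 2]
```

In this sketch the decoder's table is one entry behind the encoder's: it adds #7 only after reading the second #6, and that is the moment it switches to 4-bit reads. That might be what the quoted "2^(current code size)-1" rule is describing, though this is my guess and not something either page states.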