
Post a Comment On: cbloom rants

"Engel Coding and Length Limited Huffman Heuristic Revisited"

2 Comments
Blogger Cyan said...

Interesting read!
A few years ago, I experimented with overshooting correction:
https://fastcompression.blogspot.com/2015/07/huffman-revisited-part-3-depth-limited.html,
but the conclusion was that it rarely paid off.

So either the test was incorrectly set up, which is quite possible, or the benefit of this strategy depends on the distribution, and the samples used did not reflect it properly.
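A minimal sketch of that idea as I read it (my own reconstruction, not the code from the linked post): after a depth-limited Kraft repair overshoots and leaves unused code space, reclaim the slack by shortening the codes of the most frequent symbols while the Kraft inequality still holds.

```python
# Sketch of an "overshoot correction" pass for depth-limited code lengths.
# Assumption: a prior repair step left valid lengths (Kraft sum <= 1, all <= L)
# but overshot, leaving unused code space to give back.

def reclaim_slack(freqs, lengths, L):
    """Shorten codes of the most frequent symbols while Kraft stays valid."""
    K = sum(1 << (L - l) for l in lengths)          # Kraft sum in units of 2^-L
    by_freq = sorted(range(len(freqs)), key=lambda i: -freqs[i])
    changed = True
    while changed:
        changed = False
        for i in by_freq:
            # Shortening l_i by one adds 2^(L - l_i) to the Kraft sum.
            if lengths[i] > 1 and K + (1 << (L - lengths[i])) <= (1 << L):
                K += 1 << (L - lengths[i])
                lengths[i] -= 1
                changed = True
    return lengths

print(reclaim_slack([16, 8, 4, 2, 1, 1], [3, 3, 3, 3, 3, 3], 3))
# → [2, 2, 3, 3, 3, 3]
```

Whether this pays off depends on how often the repair actually overshoots, which would be consistent with the "rarely paid off" observation above.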

April 4, 2018 at 6:50 PM

Blogger cbloom said...

Well, even the monotonic heuristic is not bad if you look at averages:

mean : 0.027%

that's only about a quarter of 0.1% excess size over package-merge.

Improving these heuristics is really about avoiding the rare bad cases:

max : 2.512%

that's bad enough that we'd like to avoid it.

Aside from running these tests on lots of real data to try to find real-world-occurring worst cases, I also set up a synthetic test that just creates histograms to search for worst cases.

It's easy to make a histogram of 6 symbols or so and just try all possible symbol counts from 1 to 100. You can restrict to sorted order of counts without loss of generality.
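For concreteness, a sketch of such a synthetic search (my own reconstruction under stated assumptions: a textbook package-merge for the optimal lengths, a deliberately crude clamp-and-repair heuristic as the baseline, and the count range shrunk from 1-100 to 1-12 so it runs in seconds):

```python
# Brute-force search for histograms where a cheap depth-limit heuristic does
# worst against optimal (package-merge) lengths. The heuristic here is a
# deliberately simple clamp + Kraft repair, not the one from the post, and
# the count range is shrunk to 1..12 so the search finishes quickly.
import heapq
from itertools import combinations_with_replacement

def huffman_lengths(freqs):
    """Unconstrained Huffman code lengths via a merge heap."""
    heap = [(f, [i]) for i, f in enumerate(freqs)]
    heapq.heapify(heap)
    lengths = [0] * len(freqs)
    while len(heap) > 1:
        f1, a = heapq.heappop(heap)
        f2, b = heapq.heappop(heap)
        for i in a + b:
            lengths[i] += 1
        heapq.heappush(heap, (f1 + f2, a + b))
    return lengths

def package_merge(freqs, L):
    """Optimal length-limited code lengths (Larmore-Hirschberg package-merge)."""
    n = len(freqs)
    leaves = sorted((f, (i,)) for i, f in enumerate(freqs))
    lst = leaves[:]
    for _ in range(L - 1):
        packages = [(lst[k][0] + lst[k + 1][0], lst[k][1] + lst[k + 1][1])
                    for k in range(0, len(lst) - 1, 2)]
        lst = sorted(leaves + packages)
    lengths = [0] * n
    # A leaf's code length is how often it appears in the 2n-2 cheapest items.
    for _, ids in lst[:2 * n - 2]:
        for i in ids:
            lengths[i] += 1
    return lengths

def clamp_heuristic(freqs, L):
    """Clamp Huffman lengths to L, then lengthen shortest codes until Kraft fits."""
    lengths = [min(l, L) for l in huffman_lengths(freqs)]
    K = sum(1 << (L - l) for l in lengths)          # Kraft sum in units of 2^-L
    while K > (1 << L):                             # over-subscribed: repair
        i = min((j for j in range(len(lengths)) if lengths[j] < L),
                key=lambda j: lengths[j])
        K -= 1 << (L - lengths[i] - 1)
        lengths[i] += 1
    return lengths

def cost(freqs, lengths):
    return sum(f * l for f, l in zip(freqs, lengths))

L, worst = 3, 0.0
for counts in combinations_with_replacement(range(1, 13), 6):  # sorted WLOG
    opt = cost(counts, package_merge(counts, L))
    heur = cost(counts, clamp_heuristic(counts, L))
    worst = max(worst, heur / opt - 1.0)
print(f"worst excess of heuristic over package-merge: {worst:.3%}")
```

Since package-merge is optimal among valid length-limited codes, the heuristic's excess is always >= 0, and the search just reports how large it gets.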

April 5, 2018 at 1:11 PM
