Googles appar
Huvudmeny

Post a Comment On: cbloom rants

"09-10-12 - LZ4 - Large Window"

2 Comments -

1 – 2 of 2
Blogger Will said...

> (though you don't really get to avoid the branch, it's just moved into the looping on the run len)

How about always having a multiple of some number of match-literal pairs? Then you'd only have to check every so often.

If the point is for big windows, then the tail end where there might be some padding is no big deal.

Probably a bad idea.

September 10, 2012 at 10:45 PM

Blogger cbloom said...

Well, back in the ancient days we did things like this :

Put match/literal flag in a byte, so you put 8 single-bit flags together.

Decoder just does a switch() on the control byte with 8 flags in it

Jump to 256 copies of unrolled decoder that spits out 8 matches or literals at a time

On old chips that had tiny icaches, this didn't really hurt much (because you had very little hope of keeping much in icache anyway). (and of course instructions were relatively expensive compared to cache misses in the old days)

Nowadays I-cache is usually very large and you can often keep your whole decoder in I-cache, so doing something like this that creates a huge amount of code is a no-go. (as bad as a branch is, it's a lot better than an unpredictable I-cache miss).

September 13, 2012 at 2:44 PM

You can use some HTML tags, such as <b>, <i>, <a>

This blog does not allow anonymous comments.

Comment moderation has been enabled. All comments must be approved by the blog author.

You will be asked to sign in after submitting your comment.