Code Poetry
and Text Adventures

by catid posted (>30 days ago) 2:49am Wed. Apr 10th 2013 PDT
It was actually surprisingly easy to write a 2D LZ-like algorithm so that blocks of pixel matches can be written out.  It just took me a few hours to do from scratch.  And while I don't care much about how fast it runs, it runs at an interactive rate.  That being said I'm sure there's some room for improvement.

There is still some work to be done to incorporate the LZ code into the rest of the encoder, but it will actually be easier than LZ4 was since I wrote it myself.  I wasted two whole days adapting LZ4 when it was muuch better to just write my own.  Surprising but true.

For some realistic examples:

NOTE: For comparison, context modeling yields about 3.8:1 compression ratio in my test images.  The higher ratio is the whole point.  Further note that these images have a lot of fully-transparent regions but any matches made in those regions are discouraged by the algorithm and are not counted towards the number of pixels "covered."  This is because a highly efficient fully-transparent mask is already used to compress those regions.

testset/resources-images-backgrounds-lab0.png
7886 initial matches
783450 initial collisions
37300 covered / 1048576 = 3.55721% of file
6400 bytes overhead / 111900 bytes saved = 17.4844:1 ratio

testset/resources-images-backgrounds-lab1.png
5796 initial matches
308207 initial collisions
29214 covered / 524288 = 5.57213% of file
3910 bytes overhead / 87642 bytes saved = 22.4148:1 ratio

original.png
21391 initial matches
794873 initial collisions
81008 covered / 1048576 = 7.72552% of file
18040 bytes overhead / 243024 bytes saved = 13.4714:1 ratio

testset/resources-images-powers-fire-firebomb0.png
2039 initial matches
190911 initial collisions
20193 covered / 262144 = 7.70302% of file
4790 bytes overhead / 60579 bytes saved = 12.647:1 ratio

testset/resources-images-Shop0.png
19354 initial matches
786270 initial collisions
231984 covered / 1048576 = 22.1237% of file
46130 bytes overhead / 695952 bytes saved = 15.0868:1 ratio

testset/resources-images-backgrounds-home0.png
3696 initial matches
373286 initial collisions
78606 covered / 524288 = 14.9929% of file
24150 bytes overhead / 235818 bytes saved = 9.76472:1 ratio

So what I expect to see is that on average about 7% of the file data can be encoded this way with high compression ratios.  The rest will have to be done with context modeling.  I expect it to save about 50KB on average from 1024x1024 spritesheets.


Some aside on LZ versus context-filter-huffman encoding

I think it's important for a good lossless image codec to support both.  Where LZ can be used, it does better than context modeling by about 5x.  If a codec only does one or the other then the result can never be better than applying both.


How it works

The 2D LZ77 algorithm is here:

https://github.com/gameclosure/gcif/blob/master/ImageLZWriter.hpp
https://github.com/gameclosure/gcif/blob/master/ImageLZWriter.cpp

It uses a rolling hash and a hash table lookup to quickly check if a block matches one that has been seen before and where it's located.  Then it expands the match zone as much as possible, preferring some directions of growth over others.  Then the match is checked to see if it covers any pixels that haven't been covered by previous matches.  It scores each candidate match, counting zero pixels as 1/4 other pixels, since those would compress better with an entropy encoding scheme and are worth less to encode with LZ77.  If the score is good enough it accepts the match.


2D RLE for free

As an interesting side-note this algorithm provides 2D RLE for free by producing output like this:

22,0 -> 23,0 [4,22] unused=85

Note that the source and destination rectangles overlap.  In the transmitted data, only the first literal pixel column is emitted, and then the remaining 21 columns are copied across.  Slick!


How to get a lot more out of it

I've noticed that a lot of the spritesheet graphics are repeated but slightly changed.  It would make a lot of sense to allow "fuzzy" matching.

In the encoder, after exact matches have been found, fuzzy matches are sought out.  The definition of a fuzzy match is one where the overall entropy of the image is reduced by subtracting two parts of it.

In the case of a fuzzy match, the decoder would not just copy the original data over the new location.  Instead it would perform entropy-based decoding first across the whole image, and then as a final step add the source blocks to the destination blocks to recover the original image.
last edit by catid edited (>30 days ago) 5:01am Wed. Apr 10th 2013 PDT