[Haskell-cafe] Toy compression algorithms [was: A very edgy language]

Thu Jul 12 20:50:15 EDT 2007

G'day all.

Quoting Bulat Ziganshin <bulat.ziganshin at gmail.com>:

> in this case why you proposes to encode lzw words with a huffman
> codes? :)

I don't. :-)

> oops. ppm build tree of contexts and use context to encode one char.
> lzw builds dictionary of words and encode word in empty context.

As you noted later, the "dictionary of words" in LZW is also tree-
structured.  "Words", as you call them, are built by extending words
with single characters, in pretty much exactly the same way that PPM
contexts are built by extending contexts with single characters.

The main difference is this:

To encode a "word" in PPM, you encode the conditional probability
that it takes to get to the end of the word from the start of the word.
It looks like you're encoding single characters (and, programmatically,
you are), but because of the way that arithmetic coding works, it's
mathematically equivalent to encoding the probability of finding the
word as a whole.  You're just decomposing the probability of finding
the word into the probabilities of finding the individual input symbols
along the way.

Does that make sense?

> you are right. so lzw is just dictionary-based transformation which
> replaces some words with special codes while ppm replaces chars with
> their probabilities. i hope you will agree that ppm with flat codes
> will be totally useless

Absolutely.  But augmenting LZW with probabilities to allow for
arithmetic coding wouldn't be so dumb, if it weren't for the fact that
a) we already have more efficient compression algorithms, and b) it'd
destroy the main benefit of LZW (which is that it's effective on slow
CPUs with small memories; perfect for 80s-era micros and 8-bit embedded
systems).

> about which particular algorithm you said? Moffat?

Yes, though if your local university library has a copy, "Managing
Gigabytes" might be a better introduction to the topic.

Cheers,
Andrew Bromage