Personal tools

Toy compression implementations

From HaskellWiki

(Difference between revisions)
Jump to: navigation, search
(Lots of lushous example code.)
 
m
Line 3: Line 3:
 
This code is provided in the hope that someone might find it interesting/entertaining, and to demonstrate what an excellent programming language Haskell truly is. (A working polymorphic LZW implementation in 10 lines? Try ''that'' in Java!)
 
This code is provided in the hope that someone might find it interesting/entertaining, and to demonstrate what an excellent programming language Haskell truly is. (A working polymorphic LZW implementation in 10 lines? Try ''that'' in Java!)
   
This is 'toy' code. Please don't try to use it to compress multi-GB of data. It has not been thoroughly checked for correctly, and I shudder to think what the time and space complexity would be like! However, it is enlightening and entertaining to see how many algorithms you can implement with a handful of lines...
+
This is 'toy' code. Please don't try to use it to compress multi-GB of data. It has not been thoroughly checked for correctness, and I shudder to think what the time and space complexity would be like! However, it is enlightening and entertaining to see how many algorithms you can implement with a handful of lines...
   
 
[[User:MathematicalOrchid|MathematicalOrchid]] 16:46, 15 February 2007 (UTC)
 
[[User:MathematicalOrchid|MathematicalOrchid]] 16:46, 15 February 2007 (UTC)

Revision as of 16:46, 15 February 2007


This code is provided in the hope that someone might find it interesting/entertaining, and to demonstrate what an excellent programming language Haskell truly is. (A working polymorphic LZW implementation in 10 lines? Try that in Java!)

This is 'toy' code. Please don't try to use it to compress multi-GB of data. It has not been thoroughly checked for correctness, and I shudder to think what the time and space complexity would be like! However, it is enlightening and entertaining to see how many algorithms you can implement with a handful of lines...

MathematicalOrchid 16:46, 15 February 2007 (UTC)

module Compression where
 
import Data.List
 
 
-- Run-length encoding
 
encode_RLE :: (Eq x) => [x] -> [(Int,x)]
encode_RLE = map (\xs -> (length xs, head xs)) . groupBy (==)
 
decode_RLE :: [(Int,x)] -> [x]
decode_RLE = concatMap (uncurry replicate)
 
 
-- Limpel-Ziv-Welsh compression (Recommend using [Word8] or [SmallAlpha] for input!)
 
encode_LZW :: (Eq x, Enum x, Bounded x) => [x] -> [Int]
encode_LZW [] = []
encode_LZW (x:xs) = work init [x] xs where
  init = map (\x -> [x]) $ enumFromTo minBound maxBound
  work table buffer []     = [maybe undefined id (elemIndex buffer table)]
  work table buffer (x:xs) =
    let new = buffer ++ [x]
    in  case elemIndex new table of
          Nothing -> maybe undefined id (elemIndex buffer table) : work (table ++ [new]) [x] xs
          Just _  -> work table new xs
 
-- TODO: Matching decode_LZW function.
 
-- TODO: Huffman encoding.
 
-- TODO: Arithmetic coding.

It may also be useful to add the following for test purposes:

import Data.Word
 
data SmallAlpha = AA | BB | CC | DD deriving (Show, Eq, Ord, Enum, Bounded)
 
parse1 'a' = AA
parse1 'b' = BB
parse1 'c' = CC
parse1  _  = DD -- For safety
 
parse = map parse1
Anybody know how to use
newtype
to make a type like
Char
but with
minBound
and
maxBound
much closer together?