Shootout/Fannkuch
From HaskellWiki
This is a ShootoutEntry for [1]
WARNING: The permutations (at least the printed ones) must be in the order shown below.
Each program should
- "Take a permutation of {1,...,n}, for example: {4,2,1,5,3}.
- Take the first element, here 4, and reverse the order of the first 4 elements: {5,1,2,4,3}.
- Repeat this until the first element is a 1, so flipping won't change anything more: {3,4,2,1,5}, {2,4,3,1,5}, {4,2,3,1,5}, {1,3,2,4,5}.
- Count the number of flips, here 5.
- Do this for all n! permutations, and record the maximum number of flips needed for any permutation.
- Write the first 30 permutations and the number of flips.
The conjecture is that this maximum count is approximated by n*log(n) when n goes to infinity.
FANNKUCH is an abbreviation for the German word Pfannkuchen, or pancakes, in analogy to flipping pancakes."
Correct output N = 7 is: 1234567 2134567 2314567 3214567 3124567 1324567 2341567 3241567 3421567 4321567 4231567 2431567 3412567 4312567 4132567 1432567 1342567 3142567 4123567 1423567 1243567 2143567 2413567 4213567 2345167 3245167 3425167 4325167 4235167 2435167 Pfannkuchen(7) = 16
The fannkuch benchmark is defined in [Performing Lisp Analysis of the FANNKUCH Benchmark|http://www.apl.jhu.edu/~hall/text/Papers/Lisp-Benchmarking-and-Fannkuch.ps], Kenneth R. Anderson and Duane Rettig (26KB postscript)
Contents |
1 New Code
These are being scavenged from the discussion on the haskell-cafe mailing list.
Performance is summarised (on OpenBSD/x86 1.6Ghz Pentium M):
| Author | Time in seconds (N=10) |
|---|---|
| Original entry: | > 2 minutes |
| Sebastian Sylvan: | 15.4 |
| Kimberley Burchett: | 7.2 |
| Cale Gibbard: | 5.8 |
| Bertram Felgenhauer: | 4.7 |
| Clean imperative | 2.1 |
| Fastest pure | 1.9 |
| Fastest impure | 1.4 |
| C gcc | 1.35 |
| C gcc -O2 | 0.5 |
1.1 Proposals
Port to bytestrings
1.2 Fastest pure version
Bertram's with a couple of refactors, a small let binding by Don, a new permutation by Matthias Neubauer, and an ugly flop by Josh and another by David Place. Compile with -O2 -optc-O3.
{-# OPTIONS -O2 -optc-O3 #-} import System import Data.List rotate 2 (x1:x2:xs) = x2:x1:xs rotate 3 (x1:x2:x3:xs) = x2:x3:x1:xs rotate 4 (x1:x2:x3:x4:xs) = x2:x3:x4:x1:xs rotate 5 (x1:x2:x3:x4:x5:xs) = x2:x3:x4:x5:x1:xs rotate 6 (x1:x2:x3:x4:x5:x6:xs) = x2:x3:x4:x5:x6:x1:xs rotate 7 (x1:x2:x3:x4:x5:x6:x7:xs) = x2:x3:x4:x5:x6:x7:x1:xs rotate 8 (x1:x2:x3:x4:x5:x6:x7:x8:xs) = x2:x3:x4:x5:x6:x7:x8:x1:xs rotate 9 (x1:x2:x3:x4:x5:x6:x7:x8:x9:xs) = x2:x3:x4:x5:x6:x7:x8:x9:x1:xs rotate 10 (x1:x2:x3:x4:x5:x6:x7:x8:x9:x10:xs) = x2:x3:x4:x5:x6:x7:x8:x9:x10:x1:xs rotate n (x:xs) = rotate' n xs where rotate' 1 xs = x:xs rotate' n (x:xs) = x:rotate' (n-1) xs permutations l = foldr permutations' [l] [2..length l] where permutations' n = foldr (takeIter n (rotate n)) [] takeIter 0 f x rest = rest takeIter n f x rest = x : takeIter (n-1::Int) f (f x) rest flop :: Int -> [Int] -> (Int, [Int]) flop 2 (x2:xs) = (x2, 2:xs) flop 3 (x2:x3:xs) = (x3, x2:3:xs) flop 4 (x2:x3:x4:xs) = (x4, x3:x2:4:xs) flop 5 (x2:x3:x4:x5:xs) = (x5, x4:x3:x2:5:xs) flop 6 (x2:x3:x4:x5:x6:xs) = (x6, x5:x4:x3:x2:6:xs) flop 7 (x2:x3:x4:x5:x6:x7:xs) = (x7, x6:x5:x4:x3:x2:7:xs) flop 8 (x2:x3:x4:x5:x6:x7:x8:xs) = (x8, x7:x6:x5:x4:x3:x2:8:xs) flop 9 (x2:x3:x4:x5:x6:x7:x8:x9:xs) = (x9, x8:x7:x6:x5:x4:x3:x2:9:xs) flop 10 (x2:x3:x4:x5:x6:x7:x8:x9:x10:xs) = (x10,x9:x8:x7:x6:x5:x4:x3:x2:10:xs) flop n xs = rs where (rs, ys) = flop' n xs ys flop' 2 (x:xs) ys = ((x, ys), n:xs) flop' n (x:xs) ys = flop' (n-1) xs (x:ys) steps :: Int -> [Int] -> Int steps n (a:as) = steps' n (a,as) steps' n (1,_) = n steps' n (t,ts) = steps' (n+1) (flop t ts) main = do n <- getArgs >>= return . read . head let p = permutations [1..n] mapM_ (putStrLn . concatMap show) $ take 30 p putStr $ "Pfannkuchen(" ++ show n ++ ") = " putStrLn . show $ foldl' (flip (max . steps 0)) 0 p
1.3 Fastest impure
Another translation of the C version, using unboxed math, and no IORefs. Much faster again. Compile with -O2 -optc-O3 -fglasgow-exts. Probably a little more could be squeezed.
-- -- translation of the C version to Haskell by Don Stewart -- import Control.Monad import Foreign import System import GHC.Base import GHC.Ptr import GHC.IOBase main = do n <- getArgs >>= return . read . head k <- if n < 1 then return (0::Int) else fannkuch n putStrLn $ "Pfannkuchen(" ++ show n ++ ") = " ++ show (k - 1) fannkuch n@(I# n#) = do perm <- mallocArray n :: IO (Ptr Int) (Ptr c#) <- mallocArray n :: IO (Ptr Int) perm1@(Ptr p1#) <- newArray [0 .. n-1] :: IO (Ptr Int) (Ptr rP) <- newArray [n] :: IO (Ptr Int) (Ptr flipsMaxP) <- newArray [0] :: IO (Ptr Int) let go didpr = do didpr' <- if didpr < (30 :: Int) then ppr 0 n perm1 >> putStr "\n" >> return (didpr + 1) else return didpr IO $ \s -> case readIntOffAddr# rP 0# s of (# s, r# #) -> case setcount c# r# s of (# s, _ #) -> case writeIntOffAddr# rP 0# 1# s of s -> (# s, () #) t <- IO $ \s -> case readIntOffAddr# p1# 0# s of (# s, p1 #) -> case readIntOffAddr# p1# (n# -# 1#) s of (# s, pn #) -> (# s, not (p1 ==# 0# || pn ==# (n# -# 1#)) #) when t $ exchange n perm perm1 flipsMaxP fm <- IO $ \s -> case readIntOffAddr# flipsMaxP 0# s of (# s, x #) -> (# s, I# x #) done <- IO $ \s -> rot rP n# p1# c# s if done then return fm else go didpr' go 0 ------------------------------------------------------------------------ exchange n p@(Ptr a) p1@(Ptr b) fm = do copyArray (p `advancePtr` 1) (p1 `advancePtr` 1) (n-1) IO $ \s -> case readIntOffAddr# b 0# s of { (# s, k #) -> case doswap k a 0# s of { (# s, f #) -> case readIntOffAddr# fm 0# s of { (# s, m #) -> if m <# f then case writeIntOffAddr# fm 0# f s of s -> (# s, () #) else (# s, () #) } } } {-# INLINE exchange #-} doswap k a f s = case swap 1# (k -# 1#) a s of { (# s, _ #) -> case readIntOffAddr# a k s of { (# s, j #) -> case writeIntOffAddr# a k k s of { s -> if k /=# 0# then doswap j a (f +# 1#) s else (# s, (f +# 1#) #) } } } {-# INLINE doswap #-} swap i j a s = if i <# j then case readIntOffAddr# a i s of { (# s, x #) -> case readIntOffAddr# a j s of { (# s, y #) -> case writeIntOffAddr# a j x s of { s -> case writeIntOffAddr# a i y s of { s -> swap (i +# 1#) (j -# 1#) a s } } } } else (# s, () #) {-# INLINE swap #-} loop r i a s = if i <# r then case readIntOffAddr# a (i +# 1#) s of (# s, x #) -> case writeIntOffAddr# a i x s of s -> loop r (i +# 1#) a s else (# s, () #) {-# INLINE loop #-} setcount p r s = if r ==# 1# then (# s, () #) else case writeIntOffAddr# p (r -# 1#) r s of s -> setcount p (r -# 1#) s {-# INLINE setcount #-} rot rP n a cp s = case readIntOffAddr# rP 0# s of { (# s, r #) -> if r ==# n then (# s, True #) else case readIntOffAddr# a 0# s of { (# s, p0 #) -> case loop r 0# a s of { (# s, _ #) -> case writeIntOffAddr# a r p0 s of { s -> case readIntOffAddr# cp r s of { (# s, cr #) -> case writeIntOffAddr# cp r (cr -# 1#) s of { s -> if cr -# 1# ># 0# then (# s, False #) else case inc s of s -> rot rP n a cp s } } } } } } where inc s = case readIntOffAddr# rP 0# s of (# s, x #) -> writeIntOffAddr# rP 0# (x +# 1#) s {-# INLINE rot #-} ppr i n p = when (i < n) $ do putStr . show . (+1) =<< peek (p `advancePtr` i) ppr (i+1) n p
1.4 Cleaner impure version
Here's a translation of the fast C version. It's unoptimised so far, but already runs much faster than our best `pure' version. Use -O2 -optc-O3. It's not pretty (or easy to reason about -- how do the C programmers do it?), but it works :)
-- -- translation of the C version to Haskell by Don Stewart -- import Control.Monad import Data.Array.Base import Data.Array.IO import Data.IORef import System main = do n <- getArgs >>= return . read . head k <- if n < 1 then return (0::Int) else fannkuch n putStrLn $ "Pfannkuchen(" ++ show n ++ ") = " ++ show (k - 1) fannkuch n = do perm <- newArray_ (0,n-1) :: IO (IOUArray Int Int) perm1 <- newArray_ (0,n-1) :: IO (IOUArray Int Int) count <- newArray_ (0,n-1) :: IO (IOUArray Int Int) sequence_ [ set perm1 n n | n <- [0 .. n-1] ] rP <- newIORef n flipsMaxP <- newIORef 0 let go didpr = do didpr' <- if didpr < (30 :: Int) then ppr 0 n perm1 >> putStr "\n" >> return (didpr + 1) else return didpr readIORef rP >>= setcount count >>= writeIORef rP p1 <- perm1 !. 0 pn <- perm1 !. (n-1) when (not $ p1 == 0 || pn == n-1) $ exchange n perm perm1 flipsMaxP fm <- readIORef flipsMaxP done <- rotate rP n perm1 count if done then return fm else go didpr' go 0 rotate rP n p1 c = do r <- readIORef rP if r == n then return True else do -- rotate down perm[0..r] by one p0 <- p1 !. 0 loop r 0 set p1 r p0 cr <- c !. r set c r (cr - 1) if cr - 1 > 0 then return False else inc rP >> rotate rP n p1 c where loop r i = when (i < r) $ p1 !. (i+1) >>= set p1 i >> loop r (i+1) exchange n p p1 fm = do setperm 1 n p p1 k <- p1 !. 0 f <- doswap k p 0 m <- readIORef fm when (m < f) $ writeIORef fm f doswap k p f = do swap 1 (k-1) p j <- p !. k set p k k if k /= 0 then doswap j p (f+1) else return (f+1) swap i j p = when (i < j) $ do xch p i j swap (i+1) (j-1) p xch p i j = do x <- p !. i y <- p !. j set p i y set p j x setperm i n p p1 = when (i < n) $ do p1 !. i >>= set p i setperm (i+1) n p p1 setcount _ 1 = return 1 setcount p r = set p (r-1) r >> setcount p (r-1) ppr i n p = when (i < n) $ do putStr . show . (+1) =<< p !. i ppr (i+1) n p p !. i = unsafeRead p i set p i j = unsafeWrite p i j inc p = readIORef p >>= writeIORef p . (+1)
1.5 Josh Goldfoot
My ugly "flops" function cut the time for the proposed entry from 9.246 to 6.401 seconds on my machine (N=10). It's as above, but:
flop :: Int -> [Int] -> [Int] flop 2 (x1:x2:xs) = x2:x1:xs flop 3 (x1:x2:x3:xs) = x3:x2:x1:xs flop 4 (x1:x2:x3:x4:xs) = x4:x3:x2:x1:xs flop 5 (x1:x2:x3:x4:x5:xs) = x5:x4:x3:x2:x1:xs flop 6 (x1:x2:x3:x4:x5:x6:xs) = x6:x5:x4:x3:x2:x1:xs flop 7 (x1:x2:x3:x4:x5:x6:x7:xs) = x7:x6:x5:x4:x3:x2:x1:xs flop 8 (x1:x2:x3:x4:x5:x6:x7:x8:xs) = x8:x7:x6:x5:x4:x3:x2:x1:xs flop 9 (x1:x2:x3:x4:x5:x6:x7:x8:x9:xs) = x9:x8:x7:x6:x5:x4:x3:x2:x1:xs flop 10 (x1:x2:x3:x4:x5:x6:x7:x8:x9:x10:xs) = x10:x9:x8:x7:x6:x5:x4:x3:x2:x1:xs flop n xs = rs where (rs, ys) = fl n xs ys fl 0 xs ys = (ys, xs) fl n (x:xs) ys = fl (n-1) xs (x:ys)
Perhaps using Template Haskell could make this code look less ugly. But hard-coding in this way significantly speeds up a frequently used function.
This really does speed things up nicely. -- Don
Ian Lynagh provides this TH version. Note that this is quite probably not as clear as the explicit patterns.
import Language.Haskell.TH $(do let xName n = mkName ('x':show n) consPat p ps = infixP (varP p) (mkName ":") ps xPats n = foldr consPat (varP $ mkName "xs") (map xName [1..n]) consExp e es = infixE (Just $ varE e) (conE $ mkName ":") (Just es) xExps n = foldr consExp (varE $ mkName "xs") (map xName [n, n-1..1]) mkClause n = clause [litP (integerL n), xPats n] (normalB $ xExps n) [] d <- funD (mkName "flop") (map mkClause [2..10]) runIO $ putStrLn $ pprint d return [d] )
1.6 Sebastian Sylvan
I contributed what I think is a "neat and elegant" solution which emphasizes clarity over speed (but is still pretty fast). The inlinings here really helped a lot (something like 2x improvment). It's been submitted (and accepted) in the shootout already as an example of an idiomatic "elegant" approach and is currently the fastest Haskell entry (note that they have changed the benchmark to use N=10). I think that if we want anything which is to compete with the imperative languages we need to use imperative style code (in-place reversions etc.). It's probably a good idea to have an "idiomatic" version and a "fast" version.
Note the permutations generator (a rewritten version of Betram's) which on my system performed slighty better than Bertrams and is also a lot clearer (IMHO). It basically does the same thing but with less "magic" syntax :-) I should clarify that Bertram's version is certainly faster altogether (my version is all about elegance and clarity), but I didn't experience any downside to rewriting the permutation generator in a clearer way (in fact, i got a slight speedup).
import System import Data.List(foldl') import GHC.Base {-# INLINE rotate #-} rotate n (x:xs) = let (a,b) = splitAt (n-1) xs in a ++ x : b {-# INLINE perms #-} perms l = foldr perm' [l] [2..length l] where perm' n ls = concat [take n (iterate (rotate n) l) | l <- ls] {-# INLINE flop #-} flop (1:_) = 0 flop xs = 1 + flop (rev xs) {-# INLINE rev #-} rev (x:xs) = reverse a ++ x : b where (a,b) = splitAt (x-1) xs fannuch xs = foldl' max 0 $ map flop xs main = do [n] <- getArgs let xs = perms [1..read n] putStr $ unlines $ map (concatMap show) $ take 30 xs putStrLn $ "Pfannkuchen(" ++ n ++ ") = " ++ show (fannuch xs)
In retrospect I should probably submit a new version which doesn't import GHC.Base (why did I import that?) and uses "where" in rotate as well to make it look the same as e.g. rev.
1.7 Bertram Felgenhauer
combining a few ideas from the list, and with 'correct' permutation order:
import System (getArgs) import Data.List (foldl', tails) rotate n (x:xs) = rot' n xs where rot' 1 xs = x:xs rot' n (x:xs) = x:rot' (n-1) xs permutations l = foldr perm' [l] [2..length l] where perm' n l = l >>= take n . iterate (rotate n) flop :: Int -> [Int] -> [Int] flop n xs = rs where (rs, ys) = fl n xs ys fl 0 xs ys = (ys, xs) fl n (x:xs) ys = fl (n-1) xs (x:ys) steps :: Int -> [Int] -> Int steps n (1:_) = n steps n ts@(t:_) = (steps $! (n+1)) (flop t ts) main = do args <- getArgs let arg = if null args then 7 else read $ head args mapM_ (putStrLn . concatMap show) $ take 30 $ permutations [1..arg] putStr $ "Pfannkuchen(" ++ show arg ++ ") = " putStrLn $ show $ foldl' (flip (max . steps 0)) 0 $ permutations [1..arg]
1.8 Kimberley Burchett
Updated with Bertram's permutations function to get the correct order. Seems about 1.5x faster than Bertram's.
import System(getArgs) import Data.Int import Data.List(foldl') main = do [n] <- getArgs let p = permutations [1..(read n)] mapM putStrLn $ map ((concat).(map show)) $ take 30 p putStr $ "Pfannkuchen(" ++ n ++ ") = " print $ findmax p findmax :: [[Int]] -> Int findmax xss = foldl' max 0 maxes where maxes = map countFlops xss countFlops :: [Int] -> Int countFlops (1:xs) = 0 countFlops list@(x:xs) = 1 + (countFlops (flop x list [])) flop :: Int -> [Int] -> [Int] -> [Int] flop 0 xs ys = ys ++ xs flop n (x:xs) ys = flop (n-1) xs (x:ys) -- rotate initial n elements of the list left by one place rotate n (x:xs) = rot' n xs where rot' 1 xs = x:xs rot' n (x:xs) = x:rot' (n-1) xs permutations l = foldr perm' [l] [2..length l] where perm' n l = l >>= take n . iterate (rotate n)
1.9 Cale Gibbard
This does not use the right permutation order
import Data.Word import Data.Array.Unboxed import System.Environment type Perm = Word8 -> Word8 comparing p x y = compare (p x) (p y) main = do let ns = "9" -- [ns] <- getArgs let n = read ns ps = perms n p = maximum $ map (flops n . perm) ps mapM (putStrLn . (>>= show)) (take 30 ps) putStrLn ("Pfannkuchen(" ++ ns ++ ") = " ++ (show p)) -- NB. element subtree siblings! This is an n-ary tree data Tree a = Node a (Tree a) (Tree a) | Empty flop n f = fs `seq` \x -> fs ! x where fs :: UArray Word8 Word8 fs = array (1,n) [(k, f' k) | k <- [1..n]] f' x = if x <= n then f (n-x+1) else f x where n = f 1 flops n = length . (takeWhile ((/= 1) . ($ 1))) . (iterate (flop n)) showPerm n f = [1..n] >>= show . f perm :: [Word8] -> (Word8 -> Word8) perm [] n = n perm (x:xs) 1 = x perm (x:xs) n = perm xs (n-1) paths depth t = -- paths from the root of t to given depth let across d ancestors Empty rest = rest across d ancestors (Node e l r) rest = down d (e:ancestors) l (across d ancestors r rest) down d ancestors t rest = if d >= depth then ancestors:rest else across (d+1) ancestors t rest in across 1 [] t [] build n = let t = toplevel n toplevel m = if m < 1 then Empty else Node m (f n m t) (toplevel (m-1)) f col banned Empty = Empty f col banned (Node a subtree sibs) = let others = f col banned sibs in if banned == a then others else Node a (f (col-1) banned subtree) others in t perms n = paths n (build n)
1.10 Iavor Diachki
This does not use the right permutation order
import System(getArgs) flop xs@(x:_) = reverse (take x xs) ++ drop x xs flops xs = takeWhile ((1 /=) . head) (iterate flop xs) perms xs = foldr (concatMap . ins) [[]] xs ins x [] = [[x]] ins x (y:ys) = (x:y:ys) : map (y:) (ins x ys) pfannkuchen x = maximum (map (length . flops) (perms [1..x])) main = do let a = "9" -- a:_ <- getArgs let n = read a :: Int putStrLn (unlines (map show (take 30 (perms [1..n])))) print (pfannkuchen n)
1.11 Josh Goldfoot
I was able to significantly speed up the code by replacing the flip function with a function that relies entirely on pattern matching (no splitAts or reverses). It looks ugly, though:
mangle list@(1:xs) = list mangle (2:x2:xs) = x2:2:xs mangle (3:x2:x3:xs) = x3:x2:3:xs ... and so on.
1.12 Jan-Willem Maessen
I was surprised to learn that indexed insertion:
permutations (x:xs) = [insertAt n x perms | perms <- permutations xs, n <- [0..length xs] ] insertAt :: Int -> a -> [a] -> [a] insertAt 0 y xs = y:xs insertAt n y (x:xs) = x:(insertAt (n-1) y xs)
was faster than the usual version of permutation based on "inserts":
permutations (x:xs) = [insertAt n x perms | perms <- permutations xs, n <- [0..length xs] ] insertAt 0 y xs = y:xs insertAt n y (x:xs) = x:(insertAt (n-1) y xs)
However, try these on for size. The non-strict "flop", which traverses its input exactly once, is the most surprising and made by far the biggest difference:
findmax :: [[Int]] -> Int findmax xss = fm xss 0 where fm [] mx = mx fm (p:ps) mx = fm ps $! (countFlops p `max` mx) countFlops :: [Int] -> Int countFlops as = cf as 0 where cf (1:_) flops = flops cf xs@(x:_) flops = cf (flop x xs) $! (flops+1) flop :: Int -> [Int] -> [Int] flop n xs = rs where (rs,ys) = fl n xs ys fl 0 xs ys = (ys, xs) fl n (x:xs) ys = fl (n-1) xs (x:ys)
2 Old Code
This is the *slowest* entry for this benchmark, 800x slower than C, 500x slower than OCaml.
{- The Computer Language Shootout http://shootout.alioth.debian.org/ compile with: ghc -O2 -o fannkuch fannkuch.hs contributed by Josh Goldfoot, "fixing" the version by Greg Buchholz permutations function translated from the C version by Heiner Marxen -} import System(getArgs) import Data.Int main = do [n] <- getArgs let p = permutations [1..(read n)] mapM putStrLn $ map ((concat).(map show)) $ take 30 (permutations [1..(read n)]) putStr $ "Pfannkuchen(" ++ n ++ ") = " print $ findmax 0 p findmax :: Int8 -> [[Int8]] -> Int8 findmax soFar [] = soFar findmax soFar (x:xs) = max (flop 0 x) (findmax soFar xs) flop :: Int8 -> [Int8] -> Int8 flop acc (1:xs) = acc flop acc list@(x:xs) = flop (acc+1) mangle where mangle = (reverse front) ++ back (front,back) = splitAt (fromIntegral x) list permutations :: [Int8] -> [[Int8]] permutations arry = arry : (permuteloop arry [1..n] 1) where n = fromIntegral (length arry) permuteloop :: [a] -> [Int8] -> Int8 -> [[a]] permuteloop arry count r | r == ((fromIntegral . length) arry) = [] | count' !! (fromIntegral r) > 0 = arry' : (permuteloop arry' (reload r count') 1) | otherwise = permuteloop arry' count' (r+1) where count' = (take (fromIntegral r) count) ++ ((count !! (fromIntegral r)) - 1):(drop (fromIntegral r+1) count) arry' = rotate (fromIntegral r) arry rotate :: Int8 -> [a] -> [a] rotate r (x:xs) = begin ++ x:end where (begin, end) = splitAt (fromIntegral r) xs reload :: Int8 -> [Int8] -> [Int8] reload r count | r == 1 = count | otherwise = [1..r] ++ (drop (fromIntegral r) count)
