[Haskell-cafe] Space Efficiency When Sorting a List of Many Lists

Luke Palmer lrpalmer at gmail.com
Thu Dec 31 05:38:51 EST 2009


On Wed, Dec 30, 2009 at 9:39 PM, Peter Green <kinch1967 at me.com> wrote:
> I can guess that there might be be less laziness and more instantiation when
>  sorting is introduced,

Yes, by a lot.  Sorting requires keeping the entire list in memory.
And Haskell lists, unfortunately, are not that cheap in terms of space
usage (I think [Int] uses 3 words per element).

> but my questions are:
>        (1) Am I doing anything terribly stupid/naive here?
>        (2) If so, what can I do to improve space efficiency?
>
> TIA!
>
>
> import Data.List (sort)
> import Data.List.Split (splitOn)
>
> -- Cartesian Product over a List of Lists
> -- cp [[1,2],[3],[4,5,6]] -->
> [[1,3,4],[1,3,5],[1,3,6],[2,3,4],[2,3,5],[2,3,6]]
> cp          :: [[a]] -> [[a]]
> cp []       =  [[]]
> cp (xs:xss) =  [y:ys | y <- xs, ys <- cp xss]

This cartesian product varies in its tail faster than its head, so
every head gets its own unique tail.  If you reverse the order of the
bindings so that it varies in its head faster, then tails are shared.
If my quick and dirty reasoning is correct, it improves the space
usage by a factor of the number of sublists.

cp' [] = [[]]
cp' (xs:xss) = [y:ys | ys <- cp' xss, y <- xs]

But if you're serious, you can probably do better than just generating
them all and passing them to sort.  I get the impression that there is
some structure here that can be taken advantage of.

>
> -- fromCSV ["8+12,11,7+13,10", "1+2+3,1+9,3+6,4"] -->
> -- [[[8,12],[11],[7,13],[10]],[[1,2,3],[1,9],[3,6],[4]]]
> fromCSV :: [String] -> [[[Int]]]
> fromCSV = map parseOneLine
>    where parseOneLine = map parseGroup . splitOn ","
>              where parseGroup = map read . splitOn "+"
>
> -- explode [[[1,2],[3],[4,5,6]], [[1, 2], [14,15], [16]]] -->
> [[1,3,4],[1,3,5],
> -- [1,3,6],[2,3,4],[2,3,5],[2,3,6],[1,14,16],[1,15,16],[2,14,16],[2,15,16]]
> explode :: [[[a]]] -> [[a]]
> explode =  concatMap cp
>
> -- toCSV [[8,11,7,10,12],[8,11,7,10,12],[8,11,7,10,12]] -->
> -- ["8,11,7,10,12","8,11,7,10,12","8,11,7,10,12"]
> toCSV :: (Show a) => [[a]] -> [String]
> toCSV = map $ tail . init . show
> --toCSV = map (intercalate "," . map show)
>
> main = interact (unlines . toCSV . sort . explode . fromCSV . lines)
>
> _______________________________________________
> Haskell-Cafe mailing list
> Haskell-Cafe at haskell.org
> http://www.haskell.org/mailman/listinfo/haskell-cafe
>


More information about the Haskell-Cafe mailing list