[Haskell-cafe] cutting long strings into lines

Bulat Ziganshin bulat.ziganshin at gmail.com
Sat Sep 30 12:56:24 EDT 2006


Hello Andrea,

Saturday, September 30, 2006, 7:02:34 PM, you wrote:

> -- gets the indexes of the spaces within a string
> indx = findIndices (\x -> if x == ' ' then True else False)

indx = findIndices (==' ')

> -- takes the first index of a group of indexes
> takeFirst = map (\(x:xs) -> x)

takeFirst = map head

> -- split a string given a list of indexes
> splitS _ [] = []
> splitS (x:xs) (ls) = [take x ls] : splitS (map (\i -> i - x) xs) (drop x ls)
> splitS _ ls = [ls]:[]

> -- remove the first space from the beginning of a string in a list of strings
> rmFirstSpace = map (\(x:xs) -> if x == ' ' then xs else x:xs)

I would prefer to use "map rmFirstSpace", where
rmFirstSpace (' ':xs) = xs
rmFirstSpace xs = xs
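
A minimal sketch of that suggestion with type signatures added. The wrapper name rmFirstSpaces is hypothetical, chosen here only to avoid clashing with the per-string function. Unlike the quoted lambda, whose (x:xs) pattern fails on an empty string, this version passes "" through unchanged:

```haskell
-- Drop one leading space if there is one; any other string, including
-- the empty one, falls through to the second equation unchanged.
rmFirstSpace :: String -> String
rmFirstSpace (' ':xs) = xs
rmFirstSpace xs       = xs

-- Hypothetical wrapper name: apply rmFirstSpace to every line.
rmFirstSpaces :: [String] -> [String]
rmFirstSpaces = map rmFirstSpace
```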

> -- used by foldr to fold the list of substrings 
> addNL s s1 = s ++ "\n" ++ s1

foldr1 addNL == unlines ? (almost: unlines also puts a newline after the last line)
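
To make the comparison concrete, here is a small sketch. It assumes the fold meant above is foldr1; joinLines is a hypothetical name that only guards foldr1 against the empty list:

```haskell
import Data.List (intercalate)

-- addNL as quoted above.
addNL :: String -> String -> String
addNL s s1 = s ++ "\n" ++ s1

-- Hypothetical helper: foldr1 is partial, so the empty list is handled
-- separately. The result has newlines between lines only.
joinLines :: [String] -> String
joinLines [] = ""
joinLines ls = foldr1 addNL ls

-- The same join written with a library function.
joinLines' :: [String] -> String
joinLines' = intercalate "\n"
```

Here joinLines ["ab", "cd"] gives "ab\ncd", while unlines ["ab", "cd"] gives "ab\ncd\n". With putStrLn, which adds its own newline, the fold version avoids a blank line at the end.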


> try with putStrLn $ wrapString longString
> where: 
> longString = "The Haskell XML Toolbox (HXT) is a collection of
> tools for processing XML with Haskell. The core component of the
> Haskell XML Toolbox is a domain specific language, consisting of a
> set of combinators, for processing XML trees in a simple and elegant
> way. The combinator library is based on the concept of arrows. The
> main component is a validating and namespace aware XML-Parser that
> supports almost fully the XML 1.0 Standard. Extensions are a
> validator for RelaxNG and an XPath evaluator."

I think that your algorithm is too complex. The standard algorithm, imho,
is to find the last space before the 80 (or 75) char margin, split there,
and then repeat this procedure on the rest. So, splitting one line may look like

splitLine xs = splitAt (last . filter (<80) . findIndices (==' ') $ xs) xs

and then you need to define a function which repeats this operation on
the rest of the list. Or, a slightly different solution:

import Data.List (findIndices)

-- | Splits the list xs into parts; the length of each part is given
-- by calling len_f on the rest of the list.
splitByLen :: ([a] -> Int) -> [a] -> [[a]]
splitByLen len_f [] = []
splitByLen len_f xs = y : splitByLen len_f ys
                       where (y,ys) = splitAt (len_f xs) xs

-- | Finds the last space in a String within the 80-char boundary.
-- Partial: 'last' fails when there is no space before column 80.
len_f :: String -> Int
len_f = last . filter (<80) . findIndices (==' ')

So "splitByLen len_f" should give you what you need; you only need to
add checks for some additional conditions (the first word in a line is
more than 80 characters long, or it is the last line) and remove the
extra space on each line.
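
One possible way to add those checks, as a rough sketch rather than a finished solution. The names wrapAt, width and breakPos are hypothetical. It assumes a single paragraph without embedded newlines, and it cuts a word that is longer than the margin at the margin:

```haskell
import Data.List (dropWhileEnd, findIndices)

-- | Wrap one paragraph so that no line is longer than the margin.
-- A word longer than the margin is cut at the margin.
wrapAt :: Int -> String -> [String]
wrapAt margin = go . dropWhile (== ' ')
  where
    width = max 1 margin   -- a width below 1 would never make progress

    go [] = []
    go xs
      | null (drop width xs) = [xs]   -- last line: the rest fits as is
      | otherwise =
          dropWhileEnd (== ' ') line : go (dropWhile (== ' ') rest)
      where
        (line, rest) = splitAt (breakPos xs) xs

    -- Index of the last space at or before the margin. If there is
    -- none, the first word is too long and is broken at the margin.
    breakPos xs =
      case findIndices (== ' ') (take (width + 1) xs) of
        [] -> width
        is -> last is

-- Same entry point name as in the quoted message, with an 80-char margin.
wrapString :: String -> String
wrapString = unlines . wrapAt 80
```

For example, wrapAt 10 "the quick brown fox jumps" gives ["the quick","brown fox","jumps"]. Dropping the spaces around each break replaces the separate rmFirstSpace pass.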

btw, have you seen http://haskell.org/haskellwiki/Simple_unix_tools ?



-- 
Best regards,
 Bulat                            mailto:Bulat.Ziganshin at gmail.com


