[Haskell-cafe] Telling Cassava to ignore lines

Johan Tibell johan.tibell at gmail.com
Wed Sep 18 04:03:59 CEST 2013


Hi,

It depends on what you mean by "doesn't parse". From your message I assume
the CSV is valid, but some of the actual values fail to convert (using
FromField). There are a couple of things you could try:

 1. Define a wrapper type for your field that calls runParser using e.g. the
Int parser and, if that fails, returns some other value. I should probably
add an Either instance that covers this case, but there's none there now.

data MaybeInt = JustI !Int | ParseFailed

instance FromField MaybeInt where
    -- Run the ordinary Int parser; turn a conversion failure into a value.
    parseField s = case runParser (parseField s) of
        Left _err -> pure ParseFailed
        Right n   -> pure (JustI n)

(This is from memory, so I might have gotten some of the details wrong.)
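
For what it's worth, here is a minimal self-contained sketch of how such a
field type might be used end to end. The sample input and the names `sample`
and `rows` are made up for illustration; `decode`, `NoHeader`, `runParser` and
`parseField` come from Data.Csv.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import qualified Data.ByteString.Lazy.Char8 as BL
import Data.Csv (FromField (parseField), HasHeader (NoHeader), decode, runParser)
import qualified Data.Vector as V

-- A field that is either a successfully converted Int or a marker
-- saying the conversion failed.
data MaybeInt = JustI !Int | ParseFailed
    deriving (Eq, Show)

instance FromField MaybeInt where
    -- Run the ordinary Int parser and map a failure to ParseFailed
    -- instead of failing the whole decode.
    parseField s = case runParser (parseField s) of
        Left _err -> pure ParseFailed
        Right n   -> pure (JustI n)

-- Hypothetical input: the second row has a non-numeric second column.
sample :: BL.ByteString
sample = "alice,30\nbob,unknown\n"

-- Both rows are kept; the bad field shows up as ParseFailed.
rows :: Either String (V.Vector (String, MaybeInt))
rows = decode NoHeader sample
```

The point of the design is that the record as a whole still converts, so the
caller decides later what to do with the fields that did not.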

 2. Use the Streaming module, which lets you skip whole records that fail
to parse (see the docs for the Cons constructor).
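
A rough sketch of the streaming approach, assuming the Records type from
Data.Csv.Streaming with its Cons and Nil constructors. The helper `keepValid`
and the sample input are hypothetical names used only for this example.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import qualified Data.ByteString.Lazy.Char8 as BL
import Data.Csv (HasHeader (NoHeader))
import Data.Csv.Streaming (Records (Cons, Nil), decode)

-- Walk the stream, keeping records that converted and dropping the
-- ones that did not. A Nil ends the stream; its first field carries an
-- error message if the CSV itself was malformed, which this sketch
-- simply ignores.
keepValid :: Records a -> [a]
keepValid (Cons (Right r) rest) = r : keepValid rest
keepValid (Cons (Left _) rest)  = keepValid rest
keepValid (Nil _ _)             = []

-- Hypothetical input: the second row has a non-numeric second column.
sample :: BL.ByteString
sample = "alice,30\nbob,unknown\ncarol,41\n"

-- Only the rows whose second column converts to Int survive.
people :: [(String, Int)]
people = keepValid (decode NoHeader sample)
```

In real code you would probably want to log the Left messages rather than
discard them silently, so that skipped lines can be inspected afterwards.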

-- Johan



On Tue, Sep 17, 2013 at 6:43 PM, Andrew Cowie <
andrew at operationaldynamics.com> wrote:

> I'm happily using Cassava to parse CSV, only to discover that
> non-conforming lines in the input data are causing the parser to error
> out.
>
>     let e = decodeByName y' :: Either String (Header, Vector Person)
>
> chugs along fine until line 461 of the input when
>
>         "parse error (endOfInput) at ..."
>
> Ironically when my Person (ha) data type was all fields of :: Text it
> just worked, but now that I've specified one or two of the fields as Int
> or Float or whatever, it's mis-parsing.
>
> Is there a way to tell it to just ignore lines that don't parse, rather
> than it killing the whole run? Cassava understands skipping the *header*
> line (and indeed using it to do the -by-name field mapping).
>
> Otherwise the only thing I can see is going back to all the fields
> being :: Text, and then running over that as an intermediate structure
> and validating whether or not things parse to i.e. float.
>
> AfC
> Sydney