Text.Regex

Chris Kuklewicz haskell at list.mightyreason.com
Thu Oct 12 07:36:05 EDT 2006


John wrote:
> > I am converting programs to use ghc 6.6, I can't seem to find the
> > routine to make a regular expression that returns whether the string
> > passed in is a valid regular expression or not. The one provided just
> > seems to bottom out when passed an odd regex.
> >
> >         John

Hi,

  The Text.Regex compatibility module does not provide such error handling since
the old API does not.  The new API's match/matchM (aka =~ and =~~) are not error
friendly, but lower level functions are. The API has 3+ levels:

class RegexContext with match/matchM is the highest level and handles errors poorly.

That is built as a combination of the middle level:

class RegexMaker which also handles errors poorly.
and
class RegexLike which also handles errors poorly.

To get to the next lower level that has decent error handling you should import
the backend module you want and look at the 'compile' function.  An excerpt:

> > module Text.Regex.Posix.String(...blah...)
> >
> > unwrap :: (Show e) => Either e v -> IO v
> > unwrap x = case x of Left err -> fail ("Text.Regex.Posix.String died: "++ show err)
> >                      Right v -> return v
> >
> > instance RegexMaker Regex CompOption ExecOption String where
> >   makeRegexOpts c e pattern =  unsafePerformIO $
> >     (compile c e pattern >>= unwrap)
> >
> > -- compile
> > compile  :: CompOption -- ^ Flags (summed together)
> >          -> ExecOption -- ^ Flags (summed together)
> >          -> String     -- ^ The regular expression to compile (ASCII only, no null bytes)
> >          -> IO (Either WrapError Regex) -- ^ Returns: the compiled regular expression
> > compile flags e pattern =  withCAString pattern (wrapCompile flags e)

For this backend both wrapCompile and the error type WrapError are from the
Wrap.hsc file:

> > -- | The return code will be retOk when it is the Haskell wrapper and
> > -- not the underlying library generating the error message.
> > type WrapError = (ReturnCode,String)
> >
> > newtype ReturnCode = ReturnCode CInt deriving (Eq,Show)

This advice to "go look at the 'compile' function" should work for all of the
backends that I have created.  The actual type signature will vary, as some do
not need to be in the IO monad (it depends on whether a C library is involved),
and the type returned in the event of an error will differ.  The 'compile'
functions should never call error or fail (unless the underlying library runs
out of memory and does so, or you pass in a null pointer as a ByteString or some
such nonsense).
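
As a minimal sketch of the original question (is this string a valid regular
expression?), the following assumes the regex-posix backend and the 'compile'
signature quoted above.  The names checkRegex and isValidRegex are made up for
this example, and compExtended/execBlank are simply one plausible choice of
flags:

```haskell
import Text.Regex.Posix.String (Regex, compile, compExtended, execBlank)

-- | Try to compile a POSIX extended regex.  A bad pattern comes back as
-- a Left carrying the error message instead of raising an exception.
-- (checkRegex is a hypothetical helper, not part of the library.)
checkRegex :: String -> IO (Either String Regex)
checkRegex pat = do
  result <- compile compExtended execBlank pat
  return $ case result of
    Left (_returnCode, msg) -> Left msg   -- WrapError = (ReturnCode, String)
    Right re                -> Right re

-- | True when the pattern compiles, False otherwise.
-- (isValidRegex is likewise a hypothetical helper.)
isValidRegex :: String -> IO Bool
isValidRegex pat = fmap (either (const False) (const True)) (checkRegex pat)
```

With this, isValidRegex "a(b" should yield False rather than bottoming out,
since the unbalanced parenthesis is reported through the Left branch.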

In the same vein, there are 'regexec' and 'execute' functions exposed by the
backend modules that offer similar error handling capabilities.
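
A rough sketch of compiling and matching with every failure reported as a
value, again assuming the regex-posix backend, whose Text.Regex.Posix.String
module exports 'compile', 'regexec' and 'execute'.  The helper name safeMatch
is invented for illustration, and other backends may use different result
types:

```haskell
import Text.Regex.Posix.String (compile, regexec, compExtended, execBlank)

-- | Compile a pattern and run it against the input.  Both the compile
-- step and the match step report failure as a Left String; a successful
-- run gives Nothing for "no match" or Just the matched text.
-- (safeMatch is a hypothetical helper, not part of the library.)
safeMatch :: String -> String -> IO (Either String (Maybe String))
safeMatch pat input = do
  compiled <- compile compExtended execBlank pat
  case compiled of
    Left (_code, msg) -> return (Left ("compile failed: " ++ msg))
    Right re -> do
      result <- regexec re input
      return $ case result of
        Left (_code, msg) -> Left ("match failed: " ++ msg)
        Right Nothing     -> Right Nothing
        -- regexec yields (text before, matched text, text after, subgroups)
        Right (Just (_before, matched, _after, _groups)) -> Right (Just matched)
```

Keeping the result in Either means the caller decides how to report a bad
pattern, instead of the program dying inside unsafePerformIO as it does with
makeRegexOpts.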

Going below this third level is very backend dependent.  In the above case you
would be using the wrapCompile function, which needs a pointer to an ASCII C
string.  In other backends there may or may not be a sane lower level.
