Matching word boundaries in Text.Regexp

Chris Kuklewicz haskell at list.mightyreason.com
Tue Jan 16 06:28:23 EST 2007


Bernd Holzmüller wrote:
> I would like to match word boundaries in a regular expression but this
> doesn't seem to work with Text.Regex in GHC 6.4.2.
> 
> The regular expression looks something like: "\\b(send|receive)\\b" to
> match either the keyword send or the keyword receive but not the word
> sending. Nor do \< and \> work for matching the beginning and end of a
> word.
> 
> Thanks for any help,
> Bernd
> 

Word-boundary assertions such as \b are not part of POSIX regular expression
syntax.

What you are asking for is Perl syntax, as provided by PCRE
(Perl-Compatible Regular Expressions).

http://perldoc.perl.org/perlre.html

PCRE is available from Haskell.  You will first need to ensure you have the PCRE
library installed.  You may already have libpcre; if not, you can get it from
http://www.pcre.org/ where it is developed.

I have the newest wrapper for calling this from Haskell:
http://haskell.org/haskellwiki/Libraries_and_tools/Data_structures#Regular_expressions

You will need regex-base and regex-pcre packages from:
http://darcs.haskell.org/packages/
darcs get --partial http://darcs.haskell.org/packages/regex-base/
darcs get --partial http://darcs.haskell.org/packages/regex-pcre/

The wrapper works on both String and Data.ByteString (great performance), and
with a bit of .cabal file editing (to point to libpcre) it should compile and
run with GHC 6.4.2 (which I have done on OS X).  To easily compile the packages
you will also need the Data.ByteString module, which is provided by Don's fps
package:
http://www.cse.unsw.edu.au/~dons/fps.html
darcs get --partial http://www.cse.unsw.edu.au/~dons/code/fps

The Text.Regex module is the old POSIX API.  You don't want that.  You want the
new API exported by Text.Regex.Base and Text.Regex.PCRE, which uses the
operators (=~) and (=~~) and the classes RegexOptions, RegexMaker, RegexLike,
and RegexContext.
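
As a rough illustration of the new API, here is a minimal sketch that applies
the pattern from the question using (=~).  It assumes the regex-pcre package
and libpcre are installed; the helper name hasKeyword is hypothetical, chosen
only for this example.

```haskell
-- Minimal sketch; assumes regex-pcre (and libpcre) are installed.
import Text.Regex.PCRE ((=~))

-- Hypothetical helper: True if the line contains "send" or "receive"
-- as a whole word.  The result type (Bool) selects the kind of match.
hasKeyword :: String -> Bool
hasKeyword line = line =~ "\\b(send|receive)\\b"

main :: IO ()
main = do
  print (hasKeyword "please send the file")  -- True: "send" is a whole word
  print (hasKeyword "sending the file")      -- False: no boundary after "send"
  -- Asking for a String instead returns the first matched text.
  putStrLn ("now receive it" =~ "\\b(send|receive)\\b" :: String)
```

The same expression can be given other result types (for example Bool or
String, as above); the RegexContext class decides what (=~) returns for each.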

If you upgrade to GHC 6.6, then regex-base and Data.ByteString are already
installed, and you only need regex-pcre.

Please continue to ask for help on this mailing list or haskell-cafe.

(This was based on the older JRegex libpcre wrapper, which for reference is at
http://repetae.net/john/computer/haskell/JRegex/ )
	


More information about the Glasgow-haskell-users mailing list