[Haskell-beginners] remove XML tags using Text.Regex.Posix

Robert Ziemba rziemba at gmail.com
Tue Sep 29 15:25:07 EDT 2009


I have been working with the regular expression package (Text.Regex.Posix).
My hope was to find a simple way to remove a pair of XML tags from a short
string.

I have something like this "<tag>Data</tag>" and would like to extract
'Data'.  There is only one tag pair, no nesting, and I know exactly what the
tag is.

My first attempt was this:

  "<tag>123</tag>" =~ "[^<tag>].+[^</tag>]"::String

result:  "123"

On further experimentation I realized that it works only when 'Data' has
more than 2 digits.  It occurred to me that my understanding of how this
regular expression works was not correct - but I don't understand why it
works at all for 3 or more digits.
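
For reference, the behaviour described above can be reproduced with a small
program.  This is only a sketch: the name firstAttempt is made up for
illustration, and the pattern is the one from my first attempt, unchanged.

```haskell
import Text.Regex.Posix ((=~))

-- Hypothetical helper (not part of Text.Regex.Posix): applies the pattern
-- from the first attempt and returns the matched text, or "" if nothing
-- matches.
firstAttempt :: String -> String
firstAttempt s = s =~ "[^<tag>].+[^</tag>]"

main :: IO ()
main = do
  print (firstAttempt "<tag>123</tag>")  -- prints "123"
  print (firstAttempt "<tag>12</tag>")   -- prints ""
```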

Can anyone help me understand this result and perhaps suggest another
strategy?  Thank you.
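
One possible direction, given here only as a hedged sketch: in a POSIX
regular expression a bracket expression such as [^<tag>] matches a single
character that is not one of '<', 't', 'a', 'g' or '>', rather than the
literal text "<tag>".  So the first attempt needs one character outside the
first set, at least one arbitrary character, and one character outside the
second set, which is at least three characters in total.  Writing the tags
literally and capturing what lies between them avoids this.  The name
extractTag is hypothetical, and the sketch assumes exactly one
<tag>...</tag> pair with no nesting.

```haskell
import Text.Regex.Posix ((=~))

-- Hypothetical helper (not part of Text.Regex.Posix): returns the text
-- between a literal <tag> and </tag>, or Nothing if the pattern does not
-- match.
extractTag :: String -> Maybe String
extractTag s =
  -- The 4-tuple result is (text before, match, text after, subgroups).
  case s =~ "<tag>(.*)</tag>" :: (String, String, String, [String]) of
    (_, _, _, [inner]) -> Just inner
    _                  -> Nothing

main :: IO ()
main = do
  print (extractTag "<tag>Data</tag>")  -- prints Just "Data"
  print (extractTag "<tag>12</tag>")    -- prints Just "12"
  print (extractTag "no tags here")     -- prints Nothing
```

The parenthesised group (.*) is the only subgroup in the pattern, so a
successful match yields a one-element list holding the tag contents.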