Robert,<br><br>On Tue, Sep 29, 2009 at 3:25 PM, Robert Ziemba <span dir="ltr"><<a href="mailto:rziemba@gmail.com">rziemba@gmail.com</a>></span> wrote:<br><div class="gmail_quote"><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">
<div>I have been working with the regular expression package (Text.Regex.Posix). My hope was to find a simple way to remove a pair of XML tags from a short string. </div><div><br></div><div>I have something like this "<tag>Data</tag>" and would like to extract 'Data'. There is only one tag pair, no nesting, and I know exactly what the tag is. </div>
<div><br></div><div>My first attempt was this: </div><div><br></div><div> "<tag>123</tag>" =~ "[^<tag>].+[^</tag>]"::String</div><div><br></div><div>result: "123"</div>
<div><br></div><div>Upon further experimenting I realized that it only works with more than 2 digits in 'Data'. It occurred to me that my thinking on how this regular expression works was not correct - but I don't understand why it works at all for 3 or more digits. </div>
<div><br></div><div>Can anyone help me understand this result and perhaps suggest another strategy? Thank you.</div></blockquote><div><br>The regex you are using here can be read as follows:<br><br>"Match one character not in the set '<,t,a,g,>', followed by one or more of any character, followed by one character not in the set '<,/,t,a,g,>'."<br><br>The square brackets form a bracket expression, i.e. a set of single characters, so [^<tag>] does not mean "anything other than the string <tag>".<br>
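<br>In effect, it cannot match when your data has fewer than 3 characters: each bracket expression consumes exactly one character and .+ needs at least one more. "123" matches only because '1' and '3' satisfy the two bracket expressions and '2' satisfies .+. It is probably not the right regex for this job either, e.g. it would also match "x123x". What you need is regex capturing, but I don't know if that is available in that regex library (I'm not an expert Haskeller).<br>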
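<br>For what it's worth, here is a minimal sketch of what capturing could look like. It assumes the regex-posix package is installed and relies on the 4-tuple result type of =~ (text before the match, the match, text after it, and the list of submatches). The name captureTag is made up for this example, and the sketch assumes the tag name contains no regex metacharacters.<br>

```haskell
import Text.Regex.Posix ((=~))

-- | Return the text between <tag> and </tag>, if the pair is present.
-- captureTag is a hypothetical helper name, not part of regex-posix.
-- Assumption: the tag name contains no regex metacharacters.
captureTag :: String -> String -> Maybe String
captureTag tag s =
  -- The 4-tuple result is (before, match, after, submatches);
  -- the single parenthesised group lands in the submatch list.
  case (s =~ pat :: (String, String, String, [String])) of
    (_, _, _, [inner]) -> Just inner
    _                  -> Nothing  -- no match yields an empty submatch list
  where
    pat = "<" ++ tag ++ ">(.*)</" ++ tag ++ ">"
```

With this loaded in GHCi, captureTag "tag" "<tag>123</tag>" evaluates to Just "123", and a string without the tag pair gives Nothing. Because the group is (.*), data of any length is accepted, including an empty string between the tags.<br>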
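<br>If you really need a regex to locate the tag, you could use a function like this to extract the data:<br><br>import Text.Regex.Posix ((=~))<br><br>getTagData :: String -> String -> Maybe String<br>getTagData tag s =<br>
  let match   = s =~ ("<" ++ tag ++ ">.*</" ++ tag ++ ">") :: String<br>
      -- drop the opening "<tag>" from the match (not from s), so leading text in s is harmless<br>
      dropTag = drop (length tag + 2) match<br>
      -- what remains after removing both tags: 2 * length tag + 5 characters of markup<br>
      getData = take (length match - (2 * length tag + 5)) dropTag<br>
  in if null match<br>
       then Nothing<br>
       else Just getData<br><br>*Main> getTagData "tag" "<tag>123</tag>"<br>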
Just "123"<br><br><br>Patrick<br><br><br></div><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">
<br></blockquote></div><br><br clear="all"><br>-- <br>=====================<br>Patrick LeBoutillier<br>Rosemère, Québec, Canada<br><br>