[Haskell-beginners] subRegex https? with anchor href tags

Daniel Fischer daniel.is.fischer at googlemail.com
Sat Nov 12 15:50:38 CET 2011


On Saturday 12 November 2011, 14:23:30, Shakthi Kannan wrote:
> Hi,
> 
> --- On Sat, Nov 12, 2011 at 5:44 PM, Daniel Fischer <daniel.is.fischer at googlemail.com> wrote:
> | Maybe the backreferences numbering starts at 0?

Not backreferences, but who cares?

> | Worth a try.
> 
> --
> 
> \0 represents the entire string match:

The entire *match*, that is, the part of the input matched by the regexp.
The other entries correspond to parts matched by certain subregexen in the 
match.
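
To make the numbering concrete, here is a minimal sketch, assuming the
regex-compat package is installed. The function name swapPair and the
"key=value" pattern are made up for illustration. \0 is the whole match,
\1 and \2 are the parts matched by the first and second parenthesised
subexpression.

```haskell
import Text.Regex (mkRegex, subRegex)

-- Hypothetical helper: swap the two sides of a "key=value" pair and
-- append the original text.
--   \0 = the whole match
--   \1 = what the first  (...) group matched
--   \2 = what the second (...) group matched
swapPair :: String -> String
swapPair s = subRegex (mkRegex "([a-z]+)=([0-9]+)") s "\\2=\\1 (was \\0)"
```

With that definition, swapPair "x=1" should give "1=x (was x=1)".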

> 
>   http://cvs.haskell.org/Hugs/pages/libraries/base/Text-Regex.html

May I suggest using the docs on Hackage instead? Hugs hasn't had a release
since 2006, and I don't think its docs are up to date. And if you're
actually using Hugs, I suggest switching to GHC.

http://hackage.haskell.org/package/regex-compat

> 
> Prelude Text.Regex> subRegex (mkRegex "e") "hello" "\\0"
> "hello"

Heh, I didn't see it immediately either ;)

Prelude Text.Regex> subRegex (mkRegex "e") "hello" ">\\0<"
"h>e<llo"

Of course, if you replace a substring with itself, it doesn't change 
anything.



Prelude Text.Regex> subRegex (mkRegex "https?[^\\s\n\r]+") "The best is http://haskell.org" "<a href=\"\\0\">there</a>"
"The best is <a href=\"http://ha\">there</a>skell.org"

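Not what you want: inside a bracket expression a POSIX regex treats the
backslash as an ordinary character, so [^\s...] also excludes the letter
's' and the match stops at "http://ha". Character classes don't work that
way,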

Prelude Text.Regex> subRegex (mkRegex "https?[^[:space:]]+") "The best is http://haskell.org\n" "<a href=\"\\0\">there</a>"
"The best is <a href=\"http://haskell.org\">there</a>\n"

but that.
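
To see what a pattern actually matched without doing any substitution,
matchRegexAll from the same module can help. A small sketch; the helper
name matchedPart is invented here and is not part of the library:

```haskell
import Text.Regex (matchRegexAll, mkRegex)

-- Hypothetical helper: return only the part of the input that the
-- pattern matched, or Nothing if there was no match.
matchedPart :: String -> String -> Maybe String
matchedPart pat input =
  case matchRegexAll (mkRegex pat) input of
    -- matchRegexAll yields (text before, match, text after, subgroups)
    Just (_before, matched, _after, _groups) -> Just matched
    Nothing                                  -> Nothing
```

Going by the two sessions above, matchedPart "https?[^\\s\n\r]+" on
"The best is http://haskell.org" should give Just "http://ha", while
matchedPart "https?[^[:space:]]+" on the same input should give
Just "http://haskell.org".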

However,

Prelude Text.Regex> subRegex (mkRegex "https?[^[:space:]]+") "The best is http://haskell.org." "<a href=\"\\0\">there</a>"
"The best is <a href=\"http://haskell.org.\">there</a>"

please make an effort not to include final punctuation in the href; it's
rather annoying how many 404s I get from that.
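
One possible way to do that, as a sketch rather than a complete URL
matcher: require the last character of the match to be neither whitespace
nor one of a few common sentence-ending punctuation marks. The name
linkify and the exact punctuation set are assumptions for illustration,
so adjust them to your input.

```haskell
import Text.Regex (mkRegex, subRegex)

-- Hypothetical helper: wrap URLs in anchor tags, leaving a trailing
-- '.', ',', ';', ':', '!' or '?' outside the href.
linkify :: String -> String
linkify s = subRegex urlRegex s "<a href=\"\\0\">there</a>"
  where
    -- Any run of non-space characters after "http"/"https", but the
    -- final character must not be one of the listed punctuation marks.
    urlRegex = mkRegex "https?[^[:space:]]*[^[:space:].,;:!?]"
```

With that, linkify "The best is http://haskell.org." should leave the
final dot after the closing tag:
"The best is <a href=\"http://haskell.org\">there</a>."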




