gettext = (many1 $ noneOf "><") >>= (return . Body)<div><br></div><div>works for your case.</div><div><br></div><div><br><br><div class="gmail_quote">On Thu, Jul 19, 2012 at 6:37 PM, Christian Maeder <span dir="ltr"><<a href="mailto:Christian.Maeder@dfki.de" target="_blank">Christian.Maeder@dfki.de</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">On 19.07.2012 14:53, C K Kashyap wrote:<div class="im"><br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
Dear gentle Haskellers,<br>
<br>
I was trying to whet my Haskell today by using Parsec to<br>
parse XML. Here's the code I came up with -<br>
<br>
I wanted some help with the "gettext" parser that I've written. I had to<br>
do a dummy "char ' '" in there just to satisfy the "many" used in the<br>
xml parser. I'd appreciate it very much if someone could give me some<br>
feedback.<br>
</blockquote>
<br></div>
You don't want empty bodies! So use many1 in gettext.<br>
<br>
gettext = fmap Body $ many1 $ letter <|> digit<br>
<br>
If you have spaces in your bodies, skip them or allow them with<br>
noneOf "<".<br>
<br>
HTH Christian<br>
<br>
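To illustrate the advice above: Parsec's "many" raises a runtime error when its argument succeeds without consuming any input, so a body parser built on "many (letter <|> digit)" cannot be repeated with "many" directly.<br>
A minimal sketch of the many1 variant with noneOf "<" follows. It assumes the Parsec 3 module names (Text.Parsec, Text.Parsec.String); the XML type is copied from the original post, with an Eq instance added only for this sketch.<br>

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

-- Same type as in the original post; Eq is added for this sketch only.
data XML = Node String [XML]
         | Body String
         deriving (Show, Eq)

-- One or more characters up to (not including) the next '<'.
-- many1 fails on empty input instead of succeeding with "",
-- so this parser is safe to repeat with 'many'.
gettext :: Parser XML
gettext = fmap Body (many1 (noneOf "<"))

main :: IO ()
main = do
  print (parse gettext "" "some text<b>")  -- Right (Body "some text")
  print (parse (many gettext) "" "<b>")    -- Right []
```

The second call shows that "many gettext" simply stops when it reaches a tag, because gettext fails there without consuming input.<br>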
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div><div class="h5">
<br>
<br>
data XML = Node String [XML]<br>
&nbsp;&nbsp;| Body String deriving Show<br>
<br>
gettext = do<br>
&nbsp;&nbsp;x <- many (letter <|> digit)<br>
&nbsp;&nbsp;if (length x) > 0 then<br>
&nbsp;&nbsp;&nbsp;&nbsp;return (Body x)<br>
&nbsp;&nbsp;else (char ' ' >> (return $ Body ""))<br>
<br>
xml :: Parser XML<br>
xml = do {<br>
name <- openTag<br>
; innerXML <- many innerXML<br>
; endTag name<br>
; return (Node name innerXML)<br>
}<br>
<br>
innerXML = do<br>
&nbsp;&nbsp;x <- (try xml <|> gettext)<br>
&nbsp;&nbsp;return x<br>
<br>
openTag :: Parser String<br>
openTag = do<br>
&nbsp;&nbsp;char '<'<br>
&nbsp;&nbsp;content <- many (noneOf ">")<br>
&nbsp;&nbsp;char '>'<br>
&nbsp;&nbsp;return content<br>
<br>
endTag :: String -> Parser String<br>
endTag str = do<br>
&nbsp;&nbsp;char '<'<br>
&nbsp;&nbsp;char '/'<br>
&nbsp;&nbsp;string str<br>
&nbsp;&nbsp;char '>'<br>
&nbsp;&nbsp;return str<br>
<br>
h1 = parse xml "" "<a>A</a>"<br>
h2 = parse xml "" "<a><b>A</b></a>"<br>
h3 = parse xml "" "<a><b><c></c></b></a>"<br>
h4 = parse xml "" "<a><b></b><c></c></a>"<br>
<br>
Regards,<br>
Kashyap<br>
<br>
<br></div></div>
_______________________________________________<br>
Haskell-Cafe mailing list<br>
<a href="mailto:Haskell-Cafe@haskell.org" target="_blank">Haskell-Cafe@haskell.org</a><br>
<a href="http://www.haskell.org/mailman/listinfo/haskell-cafe" target="_blank">http://www.haskell.org/mailman/listinfo/haskell-cafe</a><br>
<br>
</blockquote>
<br>
<br>
</blockquote></div><br><br clear="all"><div><br></div>-- <br>I drink I am thunk.<br>
</div>
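<div>Putting the suggestions from this thread together, the whole parser might look like the sketch below. Several things here are assumptions and do not come from the thread: the Parsec 3 module names, the Eq instance, the helper name parseDoc (hypothetical), and the tightening of openTag so that it rejects '/' and empty tag names.</div>

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

data XML = Node String [XML]
         | Body String
         deriving (Show, Eq)

-- Body text: at least one character that is not part of a tag.
-- Written with <$>; equivalent to (many1 $ noneOf "><") >>= (return . Body).
gettext :: Parser XML
gettext = Body <$> many1 (noneOf "<>")

-- Opening tag such as <a>. Rejecting '/' in the name is an assumption of
-- this sketch; it keeps "</a>" from being read as an opening tag.
openTag :: Parser String
openTag = char '<' *> many1 (noneOf "/>") <* char '>'

-- Closing tag that must match the given name.
endTag :: String -> Parser String
endTag name = string "</" *> string name <* char '>'

xml :: Parser XML
xml = do
  name     <- openTag
  children <- many innerXML
  _        <- endTag name
  return (Node name children)

-- 'try' lets a failed element attempt backtrack so gettext (or the
-- enclosing 'many') sees the input unchanged.
innerXML :: Parser XML
innerXML = try xml <|> gettext

-- Hypothetical helper: parse one element and require end of input.
parseDoc :: String -> Either ParseError XML
parseDoc = parse (xml <* eof) ""

main :: IO ()
main = mapM_ (print . parseDoc)
  [ "<a>A</a>"                -- Right (Node "a" [Body "A"])
  , "<a><b>A</b></a>"         -- Right (Node "a" [Node "b" [Body "A"]])
  , "<a><b><c></c></b></a>"   -- Right (Node "a" [Node "b" [Node "c" []]])
  , "<a><b></b><c></c></a>"   -- Right (Node "a" [Node "b" [],Node "c" []])
  ]
```

<div>Because neither branch of innerXML can succeed without consuming input, the dummy "char ' '" from the original gettext is no longer needed to keep "many" happy.</div>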