[Haskell-cafe] Haskell XML Parsers

Neil Mitchell ndmitchell at gmail.com
Wed May 5 17:35:52 EDT 2010


Hi,

You might want to take a look at TagSoup
(http://community.haskell.org/~ndm/tagsoup). It parses XML/HTML
lazily, returning a flat stream of tags. It doesn't reconstruct the
nesting into a tree, but it does have good memory usage.
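
As a rough sketch of what that looks like (the element name "item" and
taking the file path from the command line are assumptions for
illustration, not something from your program), counting elements
straight off the tag stream could be done like this:

```haskell
import System.Environment (getArgs)
import Text.HTML.TagSoup (isTagOpenName, parseTags)

-- | Count the opening tags with the given name in an XML/HTML string.
-- parseTags yields a flat list of tags, so no document tree is built.
countOpenTags :: String -> String -> Int
countOpenTags name = length . filter (isTagOpenName name) . parseTags

main :: IO ()
main = do
  [path] <- getArgs                      -- hypothetical usage: ./count file.xml
  contents <- readFile path              -- lazy IO: the file is read on demand
  print (countOpenTags "item" contents)  -- "item" is a placeholder element name
```

Because the tags are consumed as they are produced, this should not need
to hold the whole document in memory, provided nothing else keeps a
reference to the input string.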

Thanks, Neil

On Fri, Apr 30, 2010 at 11:35 AM, R Senington <sc06r2s at leeds.ac.uk> wrote:
> Dear all,
>
> I have been looking at using XML for a little program I have been writing. The file I am currently trying to load is about 9 MB, and I have now tried to use
> HaXml and HXT. Without any of my own code, just a simple call to the basic parsers, they both use a huge amount of memory.
> HXT is the worst, at about 7 GB and climbing. HaXml uses 1.3 GB.
>
> The code I am using for HXT is
> xml <- readFile file_name_here
> k <- runX (parseXmlDocument True) xml
> print k
>
> and for HaXml
> x <- readFile file_name_here
> let (Document _ _ e _) = xmlParse "t" x
> let t = myFilter $ CElem e
> print $ length t
>
>
> I have seen in previous posts to the cafe that other people have run into this problem with HXT. Is this a general problem with XML in Haskell? (I know that XML parsing is a slow and bulky process, but this seems excessive.) Is there a known solution? Does anyone have any advice?
>
> Cheers
>
> RS