[Haskell-beginners] Data.Binary.Get for large files

Kyle Murphy orclev at gmail.com
Fri Apr 30 17:57:48 EDT 2010


Check out the Real World Haskell chapter on profiling; it should have
everything you need to track down where the thunks are sneaking in:

http://book.realworldhaskell.org/read/profiling-and-optimization.html

It's particularly great in this case because the problem being diagnosed in
that chapter is most likely the same sort of problem you're seeing.
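
For illustration, here is a minimal sketch of the usual fix once a heap
profile confirms a thunk build-up. It is not the original poster's code.
It assumes a version of the binary package recent enough to export
getDoublele from Data.Binary.Get, used here in place of getFloat64le so
that no extra package is needed; getDoubles and sampleInput are made-up
names. Cost centres only show up for modules that were compiled with them,
which is likely why the profile stops at your own file.

```haskell
{-# LANGUAGE BangPatterns #-}
-- Sketch only. To profile it (flags for recent GHCs; older ones used
-- -auto-all instead of -fprof-auto):
--   ghc -O2 -prof -fprof-auto -rtsopts Strict.hs
--   ./Strict +RTS -p -hc -RTS     -- time profile plus heap profile
--   hp2ps -c Strict.hp            -- render the heap profile
module Main (main) where

import qualified Data.ByteString.Lazy as BL
import Data.Binary.Get (Get, getDoublele, isEmpty, runGet)
import Data.Binary.Put (putDoublele, runPut)
import Data.List (foldl')

-- | Decode every little-endian Double in the input. The bang pattern
-- forces each value before it is consed onto the accumulator, so the
-- resulting list holds evaluated Doubles rather than thunks.
getDoubles :: Get [Double]
getDoubles = go []
  where
    go acc = do
      done <- isEmpty
      if done
        then return (reverse acc)
        else do
          !x <- getDoublele
          go (x : acc)

-- | Hypothetical stand-in for the real file: three encoded Doubles.
sampleInput :: BL.ByteString
sampleInput = runPut (mapM_ putDoublele [1.5, 2.5, 4.0])

main :: IO ()
main = do
  let xs = runGet getDoubles sampleInput
  print (length xs)
  -- foldl' keeps the running total evaluated as well.
  print (foldl' (+) 0 xs)
```

For a real file you would replace sampleInput with the result of
Data.ByteString.Lazy.readFile. If the heap profile still climbs after a
change like this, the thunks are being created somewhere else.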

-R. Kyle Murphy
--
Curiosity was framed, Ignorance killed the cat.


On Fri, Apr 30, 2010 at 17:06, Philip Scott <haskell-beginners at foo.me.uk> wrote:

> Hi Daniel
>
>
>> Replace getFloat64le with e.g. getWord64le to confirm.
>> The reading of IEEE754 floating point numbers seems rather complicated.
>> Maybe doing it differently could speed it up, maybe not.
>
> That speeds things up by a factor of about 100 :)
>
> I think there must be some efficiency to be extracted from there
> somewhere.. Either the IEEE module or the Data.Binary.Get.
>
> Is it possible to get the profiler to look deeper than the top level
> module? With all the options I could find, it only ever tells me about
> things in the file I am dealing with..
>
>> Hm, 200MB file => ~25 million Doubles, such a list needs at least 400MB.
>> Still a long way to 2GB. I suspect you construct a list of thunks, not
>> Doubles.
>
> I think you are almost certainly right. Is there an easy way to see
> if/how/where this is happening?
>
> Thanks once again,
>
> Philip