Containers and strictness

Edward Kmett ekmett at gmail.com
Thu Jun 24 09:28:15 EDT 2010


On Thu, Jun 24, 2010 at 8:27 AM, Johan Tibell <johan.tibell at gmail.com>wrote:

> On Thu, Jun 24, 2010 at 1:59 PM, Edward Kmett <ekmett at gmail.com> wrote:
>
>>
>>
>> 2010/6/24 Milan Straka <fox at ucw.cz>
>>
>> > On 24 June 2010 11:14, Milan Straka <fox at ucw.cz> wrote:
>>> >
>>> > > I need some opinion:
>>> > >
>>> > > - Do you think methods like insert/lookup/delete/etc should be strict
>>> in
>>> > >  key/element?
>>> > >
>>> > >  As Claus wrote, right now it is undocumented and inconsistent (both
>>> in
>>> > >  the methods of one container and also in the same methods of
>>> different
>>> > >  container).
>>> >
>>> > Just as it is sometimes important to be able to do strict inserts, it
>>> > is important sometimes that we have maps that are lazy in the
>>> > elements. There are important use cases both ways.
>>> >
>>> > So yes we should have some kind of consistent convention. We could do
>>> > worse than the naming convention where the strict versions use a
>>> > trailing prime ' character.
>>>
>>> I thought we are talking only about keys/elements. I would leave the
>>> values untouched.
>>>
>>> Personally I vote for:
>>> - keys in Maps and elements in Sets are strict
>>> - vales in Maps are left untouched (lazy)
>>>
>>
>> +1 from me
>>
>> Great work so far.
>>
>
> The space overhead per key/value pair is 6 words (48 bytes on a 64-bit
> architecture) when using lazy values but only 4 words (32 bytes) per
> key/value pair when using strict (unpacked) values, a 50% difference. This
> really starts to matter with big enough data sets (as seen in the recent
> Twitter analysis thread). When work with Big Data it's often desirable to
> fit as much data in RAM as possible as the result of many algorithms (think
> machine learning or search ranking) differs with the amount of data you can
> hold in memory.
>
> Something to consider.
>

I definitely agree that unboxing can help a great deal with performance and
space utilization.

However, as containers does not currently require any exotic extensions, I
think that perhaps a type family -based generic map would belong in another
'unboxed-containers' or 'adaptive-containers' package (both of which
currently exist on hackage), as it dramatically extends the language
extension footprint of containers, taking it from something that easily runs
across a wide array of Haskell implementations to something very
ghc-specific.

-Edward Kmett
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://www.haskell.org/pipermail/libraries/attachments/20100624/de037f10/attachment.html


More information about the Libraries mailing list