[Haskell-cafe] IORef vs TVar performance: 6 seconds versus 4 minutes

Mon Dec 29 01:49:37 EST 2008

Thanks, that's good to know. 

I tried incrementally loading the graph one node per transaction.  It's 
faster: about 38 seconds instead of 4 minutes, but I think I'll stick 
with IORefs and one thread for the present.

-jim

Ryan Ingram wrote:
> Both readTVar and writeTVar are worse than O(1); they have to look up
> the TVar in the transaction log to see if you have made local changes
> to it.
>
> Right now it looks like that operation is O(n) where n is the number
> of TVars accessed by a transaction, so your big transaction which is
> just accessing a ton of TVars is likely O(n^2).
>
> From ghc/HEAD/rts/STM.c:
>
> static TRecEntry *get_entry_for(StgTRecHeader *trec, StgTVar *tvar,
> StgTRecHeader **in) {
>   TRecEntry *result = NULL;
>
>   TRACE("%p : get_entry_for TVar %p", trec, tvar);
>   ASSERT(trec != NO_TREC);
>
>   do {
>     FOR_EACH_ENTRY(trec, e, {
>       if (e -> tvar == tvar) {
>         result = e;
>         if (in != NULL) {
>           *in = trec;
>         }
>         BREAK_FOR_EACH;
>       }
>     });
>     trec = trec -> enclosing_trec;
>   } while (result == NULL && trec != NO_TREC);
>
>   return result;
> }
>
> STM performance is not really geared towards "big" transactions right
> now; in large part because big transactions are likely to starve under
> any real workload anyways.  If you have a single-threaded startup bit
> to populate your data followed by concurrent small mutations, you can
> put the startup in IO using small transactions to populate the data.
>
>   -- ryan
>
>