[Haskell-cafe] IORef vs TVar performance: 6 seconds versus 4 minutes

Evan Laforge qdunkan at gmail.com
Mon Dec 29 02:08:03 EST 2008


On Mon, Dec 29, 2008 at 1:15 PM, Ryan Ingram <ryani.spam at gmail.com> wrote:
> Both readTVar and writeTVar are worse than O(1); they have to look up
> the TVar in the transaction log to see if you have made local changes
> to it.
>
> Right now it looks like that operation is O(n) where n is the number
> of TVars accessed by a transaction, so your big transaction which is
> just accessing a ton of TVars is likely O(n^2).
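(If I understand that right, a transaction shaped like the one below
is where the O(n^2) comes from -- a rough sketch with made-up names,
not anyone's actual code.  Each readTVar/writeTVar inside the single
big atomically block searches the transaction log, while the IORef
version stays O(1) per operation:)

import Control.Concurrent.STM (atomically, newTVarIO, readTVar, writeTVar)
import Control.Monad (forM, forM_)
import Data.IORef (newIORef, readIORef, writeIORef)

-- One big transaction over n TVars: each readTVar/writeTVar has to
-- search the transaction log, so the whole block is roughly O(n^2).
bumpAllSTM :: Int -> IO ()
bumpAllSTM n = do
  tvars <- forM [1..n] newTVarIO
  atomically $ forM_ tvars $ \tv -> do
    x <- readTVar tv
    writeTVar tv (x + 1)

-- The IORef version does the same n updates at O(1) per operation.
bumpAllIORef :: Int -> IO ()
bumpAllIORef n = do
  refs <- forM [1..n] newIORef
  forM_ refs $ \r -> do
    x <- readIORef r
    writeIORef r (x + 1)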

So this actually brings up a tangential question I've wondered about
for a while.

The lock-free data structure paper at
http://research.microsoft.com/users/Cambridge/simonpj/papers/stm/lock-free-flops06.pdf
shows locks vs. STM with very similar performance on a single
processor, with STM quickly winning on multiprocessors.  I understand the basic
idea as being that the STM versions are optimistic in that they only
pay a cost on a transaction collision, while locks have to go frob
some bit of memory on every read or write.  I'm guessing that STM
would also show less contention on a read-heavy load since you don't
need to block anyone if you are just reading (though I guess you can get
the same effect with shared-read locks in a locking style?).
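(Here's the kind of contrast I have in mind -- a hedged sketch, not
anything from the paper; the Table type and the function names are
invented for illustration:)

import Control.Concurrent.MVar (MVar, takeMVar, putMVar)
import Control.Concurrent.STM (TVar, atomically, readTVar)
import qualified Data.Map as Map

type Table = Map.Map String Int

-- Lock-based read: every reader grabs the single MVar, so readers
-- block each other even when nobody is writing.
lookupLocked :: MVar Table -> String -> IO (Maybe Int)
lookupLocked lock k = do
  table <- takeMVar lock
  let result = Map.lookup k table
  putMVar lock table
  return result

-- STM read: a read-only transaction just validates at commit time, so
-- concurrent readers don't block each other.
lookupSTM :: TVar Table -> String -> IO (Maybe Int)
lookupSTM tv k = atomically $ do
  table <- readTVar tv
  return (Map.lookup k table)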

But every STM operation has to modify a transaction log, which seems
like it should be even more expensive than frobbing a lock bit.  So if
the per-operation STM overhead is higher and blocking contention is
the same (assuming the lock implementation uses shared locks for
reads), I don't see how the STM implementation could be faster.
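(If I wanted to see that per-operation cost directly I'd try something
like this -- a made-up micro-benchmark, one tiny transaction per
increment, so it measures overhead rather than contention:)

import Control.Concurrent.STM (atomically, newTVarIO, readTVar, writeTVar)
import Data.IORef (newIORef, readIORef, writeIORef)
import Data.Time.Clock (diffUTCTime, getCurrentTime)

-- Time n increments through an IORef and through a TVar to compare
-- the per-operation overhead of the two.
main :: IO ()
main = do
  let n = 1000000 :: Int
  ref <- newIORef (0 :: Int)
  tv  <- newTVarIO (0 :: Int)

  t0 <- getCurrentTime
  mapM_ (\_ -> do x <- readIORef ref; writeIORef ref $! x + 1) [1..n]
  t1 <- getCurrentTime
  mapM_ (\_ -> atomically $ do x <- readTVar tv; writeTVar tv $! x + 1) [1..n]
  t2 <- getCurrentTime

  putStrLn $ "IORef: " ++ show (diffUTCTime t1 t0)
  putStrLn $ "TVar:  " ++ show (diffUTCTime t2 t1)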

I'm sure it's all more complicated than this... but where exactly is
the STM performance in those papers coming from?  Also, they disclaim
in the testing section that the STM implementation was immature, but
now that there's parallel GC, and perhaps STM doesn't do as much
dynamic allocation (?), shouldn't the STM numbers be even better?

