[Haskell-beginners] Fwd: Implementing a spellchecker - problem with Data.HashTable performance

Lorenzo Bolla lbolla at gmail.com
Fri Apr 20 18:58:46 CEST 2012


Maybe you could also time it with datasets of increasing size (up to 1M)
and see whether the execution time grows like O(n^2). If it does, I'd say
it's a hashing problem, e.g. many keys colliding in the same bucket...
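
A rough sketch of such a measurement, under a few assumptions: the words
are synthetic (sampleWords is a hypothetical generator, not the original
data file), the sizes are arbitrary, and Data.Map.Strict from containers
only stands in for the table under test, since Data.HashTable is not
available in current versions of base. To measure the real program,
buildTable would be replaced with the actual insertion loop.

```haskell
import Control.Exception (evaluate)
import Control.Monad (forM)
import Data.List (foldl')
import qualified Data.Map.Strict as M
import System.CPUTime (getCPUTime)

-- Hypothetical stand-in for the real word list: n distinct strings.
sampleWords :: Int -> [String]
sampleWords n = ["word" ++ show i | i <- [1 .. n]]

-- Stand-in for the table under test (an assumption, not Data.HashTable).
-- Replace this with the real insertion loop when measuring the program.
buildTable :: [String] -> M.Map String Int
buildTable = foldl' (\m w -> M.insert w (length w) m) M.empty

-- CPU seconds spent building a table from n words.
timeBuild :: Int -> IO Double
timeBuild n = do
  let ws = sampleWords n
  -- Force the input first so that generating it is not part of the timing.
  _ <- evaluate (sum (map length ws))
  start <- getCPUTime
  _ <- evaluate (M.size (buildTable ws))
  end <- getCPUTime
  -- getCPUTime reports picoseconds.
  return (fromIntegral (end - start) / 1e12)

-- Ratio of each timing to the previous one. With sizes that double at
-- every step, ratios near 2 suggest roughly linear growth, while ratios
-- near 4 suggest quadratic growth.
growthRatios :: [Double] -> [Double]
growthRatios ts = zipWith (/) (drop 1 ts) ts

main :: IO ()
main = do
  let sizes = [125000, 250000, 500000, 1000000]
  times <- forM sizes timeBuild
  mapM_ print (zip sizes times)
  print (growthRatios times)
```

Very short timings are noisy, so the ratios are only indicative. It is
probably worth repeating each run a few times before drawing conclusions.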


On Fri, Apr 20, 2012 at 06:03:19PM +0200, Radosław Szymczyszyn wrote:
> Thanks for your suggestions. Alas, they don't solve the problem.
> 
> As I was at work without the original data file, I repeated the test
> suggested by Karol Samborski with a file of 1 400 000 repetitions of
> "żyźniejszymi". It took about 3.5s, so I thought my problem had been
> solved. However, repeating it with -O2 makes a difference of ~2-3s, and
> I don't believe the laptop I used at home is *that much slower* than my
> Mac at work, or that running without optimization would make such a
> great difference.
> 
> Now, I've just rerun the test with the original data file (still
> at work, so comparison with 3.5s is appropriate) at 17:26 and it's
> still running -- so the problem lies in the data set being hashed. I
> don't know why, but it seems to make a difference:
> - either whether one specific word or many different words are hashed,
> - or whether just one slot or many slots of the HashTable are updated
> (but as I'm using newHint the space should be preallocated).
> 
> Either way, I would be grateful if you, Karol, or somebody else could
> rerun the test with the original data. It's available at:
> http://ernie.icslab.agh.edu.pl/~lavrin/formy.utf8.gz
> 
> Thanks for your time!
> 
> Regards,
> Radek Szymczyszyn

-- 
Lorenzo Bolla
http://lbolla.info
