[Haskell-cafe] Announce: hlcm 0.2.2 - Parallel closed frequent itemsets mining

Alexandre Termier Alexandre.Termier at imag.fr
Wed Jun 16 09:11:40 EDT 2010


Dear all,

I'm pleased to announce the release of hlcm on Hackage:

http://hackage.haskell.org/package/hlcm-0.2.2

hlcm is a data mining tool for computing closed frequent itemsets.
This problem is well known as "market basket analysis":

   - given a list of transactions:
     [["bread","butter","chocolate","tomato"]
     ,["bread","butter"]
     ,["bread","pencil","butter","chocolate"]
     ,["bread","butter","book"]]

   - and a minimal frequency threshold in [1..4], let's say 2: we want
     items that are sold together in at least 2 transactions

hlcm will tell you that ["bread","butter","chocolate"] appears in 2 
transactions and that ["bread","butter"] appears in 4 transactions.
You can imagine many more entertaining applications with your own data, 
for example log analysis, mining words in web pages, etc.
You can see details on getting started with the program here: 
http://membres-liglab.imag.fr/termier/HLCM/hlcm/hlcm/Main.html
The library documentation is here: 
http://membres-liglab.imag.fr/termier/HLCM/hlcm/HLCM.html
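
To make the market basket example above concrete, here is a minimal 
brute-force sketch in plain Haskell. It is neither the LCM algorithm nor 
the hlcm API: every name in it (support, closure, closedFrequent, ...) is 
hypothetical and chosen only for illustration. It assumes the usual 
definition that an itemset is closed when it equals the intersection of 
all transactions that contain it, and it enumerates every subset of 
items, so it is only usable on toy data.

```haskell
import Data.List (intersect, nub, sort, subsequences)

type Item = String
type Itemset = [Item]      -- kept sorted and duplicate-free
type Transaction = [Item]

-- The transactions from the example (illustrative data only).
transactions :: [Transaction]
transactions =
  [ ["bread", "butter", "chocolate", "tomato"]
  , ["bread", "butter"]
  , ["bread", "pencil", "butter", "chocolate"]
  , ["bread", "butter", "book"]
  ]

-- Transactions that contain every item of the itemset.
covering :: [Transaction] -> Itemset -> [Transaction]
covering db iset = filter (\t -> all (`elem` t) iset) db

-- Number of transactions that contain the itemset.
support :: [Transaction] -> Itemset -> Int
support db = length . covering db

-- Closure: the items shared by all transactions containing the itemset.
closure :: [Transaction] -> Itemset -> Itemset
closure db iset = case covering db iset of
  [] -> iset
  ts -> sort (foldr1 intersect ts)

-- Brute force (NOT LCM): try every non-empty itemset and keep those
-- that are frequent and equal to their own closure.
closedFrequent :: Int -> [Transaction] -> [(Itemset, Int)]
closedFrequent minSup db =
  [ (iset, s)
  | iset <- subsequences items
  , not (null iset)
  , let s = support db iset
  , s >= minSup
  , closure db iset == iset
  ]
  where
    items = sort (nub (concat db))
```

Loading this in GHCi and evaluating closedFrequent 2 transactions should 
return the two itemsets mentioned above, each paired with its support.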

hlcm is based on the most efficient algorithm for closed frequent 
itemset mining, LCM, which is much, much faster than the well-known 
Apriori algorithm (more details when following the pointers from the 
homepage: http://membres-liglab.imag.fr/termier/HLCM/hlcm.html).

hlcm can also exploit parallelism through Strategies, with promising 
speedups. We still have more work to do in order to beat existing C/C++ 
implementations, but you can have a look at the paper that we submitted 
to the Haskell Symposium this year for a detailed experimental study:
http://membres-liglab.imag.fr/termier/HLCM/hlcm.pdf
Don't miss the section about the influence of RTS parameters on 
parallel performance.
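
For readers who have not used Strategies, here is a minimal sketch of 
the general idea, using parMap and rdeepseq from 
Control.Parallel.Strategies (package "parallel"). It is not taken from 
hlcm and says nothing about how hlcm actually splits its work: the 
functions countSupport and supportsPar are hypothetical and only show 
how independent support counts could be evaluated in parallel.

```haskell
import Control.Parallel.Strategies (parMap, rdeepseq)

-- Number of transactions that contain every item of the candidate.
countSupport :: [[String]] -> [String] -> Int
countSupport db candidate =
  length (filter (\t -> all (`elem` t) candidate) db)

-- Hypothetical helper: evaluate the support of each candidate in
-- parallel, fully forcing each result (rdeepseq).
supportsPar :: [[String]] -> [[String]] -> [Int]
supportsPar db = parMap rdeepseq (countSupport db)
```

To actually use several cores, such a program has to be compiled with 
ghc -threaded -rtsopts and run with an RTS option such as +RTS -N4, 
where the number after -N is the number of cores to use.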

Feel free to send me an e-mail if you have any questions about hlcm.

Alexandre

-- 
_____________________________________________________________
Alexandre Termier
LIG (Laboratoire d'Informatique de Grenoble)
Université Joseph Fourier
681 rue de la Passerelle
B.P. 72, 38402 Saint Martin d'Hères (FRANCE)
Phone: +33 4 76 82 72 07
Fax: +33 4 76 82 72 87
http://membres-liglab.imag.fr/termier/
_____________________________________________________________


