<p dir="ltr">Have you taken a look at the hlearn and statistics packages? HLearn is even easy to parallelize on a cluster because its training results are designed to be composable: you can create two models, train them separately, and finally combine them. You can also use another database that has Haskell bindings, such as Redis or Cassandra, as a backend. For parallelizing on clusters, HdpH is also good.</p>
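<p dir="ltr">To make the "train separately, then combine" idea concrete, here is a minimal sketch that uses only base and is not HLearn's actual API. The names Moments, train and variance are made up for illustration. The "model" is just a running count, mean and sum of squared deviations, and combining two trained models is the Monoid operation, so each shard could in principle be trained on a different node.</p>

<pre><code>module Main where

import Data.List (foldl')

-- Hypothetical "model": sufficient statistics for mean and variance.
-- Fields: sample count, running mean, sum of squared deviations (M2).
data Moments = Moments !Int !Double !Double
  deriving Show

-- Combining two trained models with the pairwise (parallel) update
-- for mean and M2. An empty model is the identity.
instance Semigroup Moments where
  Moments 0 _ _ &lt;&gt; b = b
  a &lt;&gt; Moments 0 _ _ = a
  Moments n1 m1 s1 &lt;&gt; Moments n2 m2 s2 = Moments n mean s
    where
      n     = n1 + n2
      delta = m2 - m1
      mean  = m1 + delta * fromIntegral n2 / fromIntegral n
      s     = s1 + s2
            + delta * delta * fromIntegral n1 * fromIntegral n2 / fromIntegral n

instance Monoid Moments where
  mempty = Moments 0 0 0

-- "Training" folds single-point models together.
train :: [Double] -&gt; Moments
train = foldl' (\acc x -&gt; acc &lt;&gt; Moments 1 x 0) mempty

-- Sample variance; defined as 0 when there are fewer than two points.
variance :: Moments -&gt; Double
variance (Moments n _ s)
  | n &lt; 2     = 0
  | otherwise = s / fromIntegral (n - 1)

main :: IO ()
main = do
  let shardA = [1, 2, 3, 4]          -- data seen by one worker
      shardB = [5, 6, 7, 8, 9, 10]   -- data seen by another worker
      merged = train shardA &lt;&gt; train shardB
      whole  = train (shardA ++ shardB)
  -- The two results should agree up to floating-point rounding.
  print merged
  print whole
  print (variance merged, variance whole)
</code></pre>

<p dir="ltr">Because the combine step is associative, the shards can be merged in any grouping, which is what makes this style of training easy to distribute.</p>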


<p dir="ltr">I personally prefer Python for data science because it has much more mature packages and is more interactive and more effective than Haskell and Scala. (Not kidding: you can compile your core data structures and algorithms to C with the Python-like Cython and call them from Python, and you can exploit GPUs for acceleration with Theano.) Spark also has an unfinished Python binding.</p>


<p dir="ltr">On 2013/12/18 at 3:41 PM, "jean-christophe mincke" &lt;<a href="mailto:jeanchristophe.mincke@gmail.com">jeanchristophe.mincke@gmail.com</a>&gt; wrote:<br>

><br>

> Hello Cafe,<br>

>  <br>

> Big Data is a bit trendy these days.<br>

>  <br>

> Does anybody know about plans to develop an Haskell eco-system in that domain?<br>

> I.e tools such as Storm or Spark (possibly on top of Cloud Haskell) or, at least, bindings to tools which exist in other languages.<br>

>  <br>

> Thank you<br>

>  <br>

> Regards<br>

>  <br>

> J-C<br>

><br>


</p>