[Haskell-cafe] Re: Greetings

Seth Gordon sethg at ropine.com
Sun Oct 1 00:05:05 EDT 2006


Paul Johnson wrote:
> I've done some stuff with maybe 50k rows at a time.  A few bits and pieces:
> 
> 1: I've used HSQL 
> (http://sourceforge.net/project/showfiles.php?group_id=65248) to talk to 
> ODBC databases.  Works fine, but possibly a bit slowly.  I'm not sure 
> where the delay is: it might just be the network I was running it over.  
> One gotcha: the field function takes a field name, but it's not random 
> access.  Access the fields in query order or it crashes.

Thanks; that's certainly the sort of thing I like knowing in advance.
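For the archives, the sequential-access pattern I understand from the HSQL
docs would look roughly like this -- just a sketch, since it needs a live
ODBC data source, and the DSN, credentials, table, and column names below
are all made up:

```haskell
import Database.HSQL
import Database.HSQL.ODBC (connect)

main :: IO ()
main = do
  conn <- connect "myDSN" "user" "password"   -- hypothetical DSN/credentials
  stmt <- query conn "SELECT name, amount FROM payments"
  let loop = do
        more <- fetch stmt
        if more
          then do
            -- read the fields in query order: name before amount
            name   <- getFieldValue stmt "name"   :: IO String
            amount <- getFieldValue stmt "amount" :: IO Int
            putStrLn (name ++ " " ++ show amount)
            loop
          else return ()
  loop
  closeStatement stmt
  disconnect conn
```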

> 2: For large data sets laziness is your friend.  When reading files 
> "getContents" presents an entire file as a list, but it's really 
> evaluated lazily.  This is implemented using unsafeInterleaveIO.  I've 
> never used this, but in theory you should be able to set up a query that 
> returns the entire database as a list and then step through it using 
> lazy evaluation in the same way.

I assume that the collectRows function in HSQL can produce this kind of 
lazy list... right?
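For my own notes, here's a self-contained sketch of the unsafeInterleaveIO
trick, with an IORef standing in for the database cursor (lazyRows and the
cursor are my own invented names, not anything in HSQL):

```haskell
import Data.IORef
import System.IO.Unsafe (unsafeInterleaveIO)

-- Simulate a database cursor: each fetch pops one row off the IORef.
-- unsafeInterleaveIO delays the fetch until the list cell is demanded.
lazyRows :: IORef [String] -> IO [String]
lazyRows cursor = unsafeInterleaveIO $ do
  rows <- readIORef cursor
  case rows of
    []     -> return []
    (r:rs) -> do
      writeIORef cursor rs
      rest <- lazyRows cursor   -- a delayed thunk, not an immediate fetch
      return (r : rest)

main :: IO ()
main = do
  cursor <- newIORef ["row1", "row2", "row3"]
  rows   <- lazyRows cursor
  -- only the rows actually demanded are fetched from the "cursor"
  print (take 2 rows)           -- prints ["row1","row2"]
```

The same shape should apply to a real query: wrap each fetch in
unsafeInterleaveIO and consume the result as an ordinary list.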

> 3: You don't say whether these algorithms are just row-by-row algorithms 
> or whether there is something more sophisticated going on.  Either way, 
> try to make things into lists and then apply map, fold and filter 
> operations.  It's much more declarative and high level when you do it 
> that way.

I'm going to need to do some mapping, folding, partitioning...
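For what it's worth, here's the sort of pipeline I have in mind, with
made-up row data (the Row type and summarise function are just my own
illustration):

```haskell
import Data.List (foldl', partition)

-- A row from a hypothetical query: (name, amount)
type Row = (String, Int)

-- Partition rows into large and small amounts, then total each side
-- with a strict left fold.
summarise :: [Row] -> (Int, Int)
summarise rows =
  let (large, small) = partition ((>= 100) . snd) rows
      total = foldl' (+) 0 . map snd
  in (total large, total small)

main :: IO ()
main = print (summarise [("alice", 150), ("bob", 50), ("carol", 200)])
-- prints (350,50)
```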

> 
> Let us know how you get on.

I certainly will.

