# [Haskell-cafe] Newbie seeking advice regarding data structure for a tricky algorithm

Toby Hutton toby.hutton at gmail.com
Tue Apr 24 01:59:55 EDT 2007

```Hi,

I'm trying to implement a fast kd-tree in Haskell.
http://en.wikipedia.org/wiki/Kd-tree  It's a binary tree representing space
partitions.

One of the fastest ways to build kd-trees is outlined in this paper:
http://www.cgg.cvut.cz/~havran/ARTICLES/ingo06rtKdtree.pdf  (I'm trying to
implement Algorithm 5 on page 5 and the steps outlined in section 4.3.1.)

I'll try to outline a simplified analogous algorithm to demonstrate my
problem.  It's all about classifying which children should go in a left or
right branch using heuristics.

Say I want to put the words 'foo', 'bar' and 'baz' into a binary tree.  The
heuristic requires I split the words into letters and sort them:
'aabbfoorz'.  The heuristic then may decide, based on the sorted letters,
that 'bar' and 'foo' should go in the left child and 'baz' goes in the
right.  Typically we'd then simply recurse and for example, the left child's
words would be re-sorted into 'abfoor' and the heuristic is reapplied.

If we assume that sorting is relatively expensive, we can avoid the re-sort
for the children by unmerging the parent's sorted list of letters.  Two
sublists of a sorted list should already be sorted.  If we know which word
each letter belongs to it would be more efficient to tag the letters with
'left' or 'right' as the words are classified.  Then we can iterate down the
sorted letter list and  produce new sorted sublists rather simply.

So it's not actually that complicated, and I can imagine exactly how it
could be done in C but I really don't know how to approach this in Haskell.
The problem I'm having is how to keep a map between the words and its
letters (which in the real problem is a map between a list of vectors and 6
tuples containing Doubles and enumerations) while keeping in mind you can't
just map any letter to a word, but specific letters.  e.g., the 'b's in the
example must remember specifically whether they belong to 'bar' or 'baz'.

Thanks for any insights,
Toby.
-------------- next part --------------
An HTML attachment was scrubbed...