[Haskell-cafe] None brute-force BWT algorithm

larry.liuxinyu liuxinyu95 at gmail.com
Fri Jun 24 04:51:27 CEST 2011


I think that version is still a brute-force solution. The only difference is 
that it uses EOF (sentinel) so that it can sort the suffixes instead of 
However, the big-O complexity is the same.

Let's take the rbwt for example:

> rbwt xs = let
>     res = sortBy (\(a:as) (b:bs) -> a `compare` b) (zipWith' (:) xs res)
>   in
>     tail . map snd . zip xs $ head res

Here we can see that, although the infinity res is lazy-evaluated, it 
actually sorts the matrix N times, adding one column per evaluation.
each time there are N elements to be sorted and the length of every element 
grows proportion to N, 
so the time is O( N * (N^2) * \lg (N^2) ) = O(N^3 \lg N)

While my brute-force program provided in previous post is as same as O(N^3 
\lg N).

However, if the random access time is O(1) (on array) but not O(N) (on 
list), my 2nd program is much faster:
Here is the modified version with Array. (import Data.Array)

ibwt'' :: (Ord a) => ([a], Int) -> [a]
ibwt'' (r, i) =  fst $ iterate f ([], i) !! n where
    t = listArray (0, n-1) $ map snd $ sort $ zip r [0..]
    ra = listArray (0, n-1) r
    f (s, j) = let j' = t ! j in (s++[ra ! j'], j')
    n = length r

This version only sort the input data 1 time, (proportion to O(N * \lg N), 
after that the transform vector t is generated.
Then it iterates N times on t to get the result. so the total time is O(N * 
\lg N) + O(N) = O(N * \lg N)

This should be much better than the brute force one. the idea is that, we 
can get the result without reconstructing the complete matrix,
Only two columns (L & F) are enough.

But how to turn the random access idea of transform-vector into purely 
functional settings? here I mean what if Haskell
doesn't provide constant access time array? One idea is to to turn the 
transform vector T into a function, just like what we've done in KMP 
implementation in FP way. Does such solution exist?

Larry, LIU Xinyu

On Friday, June 24, 2011 6:50:46 AM UTC+8, Henning Thielemann wrote:
> On Wed, 22 Jun 2011, larry.liuxinyu wrote:
> > Hi,
> > 
> > I read a previous thread about BWT implementation in Haskell:
> > http://www.mail-archive.com/haskell-cafe@haskell.org/msg25609.html
> > and
> > 
> http://sambangu.blogspot.com/2007/01/burrows-wheeler-transform-in-haskell
> > 
> > They are all in a `brute-force' way, that is implement based on 
> Burrows-Wheeler's definition like below:
> I thought that the knot-tying solution by Bertram Felgenhauer in the same 
> thread was both elegant and efficient:
>    http://www.mail-archive.com/haskell-cafe%40haskell.org/msg25692.html
> _______________________________________________
> Haskell-Cafe mailing list
> Haskel... at haskell.org
> http://www.haskell.org/mailman/listinfo/haskell-cafe
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://www.haskell.org/pipermail/haskell-cafe/attachments/20110623/aabf5030/attachment.htm>

More information about the Haskell-Cafe mailing list