<div>One observation is that you&#39;re doing a lot of redundant Bool -&gt; Int conversion.</div>

<div>&nbsp;</div>

<div>As you iterate across the array in fillArray, the rule you are using for the next cell is almost entirely determined by the rule you are using for the current cell; lop off the top bit, shift left by one, and add the new bit.&nbsp; Instead, you&#39;re recalculating the entire rule at that point.

</div>

<div>&nbsp;</div>

<div>Sadly, (at least as of GHC 6.6.1) the performance of the operations in Data.Bits was horrendous, and any time I wanted to use them for performance,&nbsp;I ended up going through the FFI and writing C code.&nbsp; I haven&#39;t had a chance to test this on GHC 

6.8.</div>

<div>&nbsp;</div>

<div>In this case, that might not be so bad, however.&nbsp; You can probably write a 10-20 line C function that implements &quot;fill&quot; and call it via the FFI.</div>

<div>&nbsp;</div>

<div>Alternatively, you could create a new rule table which maps a rule and a bool into a new rule index.&nbsp; This would only be twice the size of the rule table, and you can index that way.</div>

<div>&nbsp;</div>

<div>Also, what stops getRule from going off the end of the array?&nbsp; I didn&#39;t see anything that prevented that in the code, and you&#39;re using unsafeAt, which seems like a potential bug.</div>

<div>&nbsp;</div>

<div>&nbsp; -- ryan<br><br>&nbsp;</div>

<div><span class="gmail_quote">On 11/13/07, <b class="gmail_sendername">Justin Bailey</b> &lt;<a href="mailto:jgbailey@gmail.com">jgbailey@gmail.com</a>&gt; wrote:</span>

<blockquote class="gmail_quote" style="PADDING-LEFT: 1ex; MARGIN: 0px 0px 0px 0.8ex; BORDER-LEFT: #ccc 1px solid">I&#39;ve been working on a program over the last few days to evolve<br>cellular automata rules using a genetic algorithm. Luckily, this email

<br>has nothing to do with CAs but everything to do with Haskell<br>performance.<br><br>For those who don&#39;t know, a CA is represented as a row of cells, where<br>each can be either black (on/1) or white (off/0). A CA is &quot;run&quot; by

<br>generating a new row from the previous row according to some rule.<br>Each cell is examined in turn and based on the state of it&#39;s neighbors<br>and itself, a new cell in the next row is generated that is either<br>

black or white.<br><br>The function below is my &quot;step&quot; function that generates this new row.<br>It&#39;s at the heart of my program and where all&nbsp;&nbsp;the execution time is<br>spent. In this scenario it&#39;s executed around 800 million times. On my

<br>relatively modest desktop using GHC 6.8.1, the program takes about 30<br>seconds to run. Here&#39;s the function, with some of the type<br>declarations:<br><br>data Rule = Rule { ruleWidth :: Int, rules :: UArray Int Bool }

<br>data Ring = Ring { currIdx :: !Int, vals :: (UArray Int Bool), lower,<br>upper, size:: !Int }<br>type CA = Ring<br><br>currR :: Int -&gt; Ring -&gt; Bool<br>currR amt r@(Ring curr arr l u s) = unsafeAt arr ((curr + amt) `mod` s)

<br><br>stepWithUArray :: Int -&gt; Rule -&gt; CA -&gt; CA<br>stepWithUArray rowLen rule@(Rule width rules) row =<br>let upper = rowLen - 1<br>&nbsp;&nbsp;&nbsp;&nbsp; getRule currIdx = pattern&#39; start 0<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; where<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; start = currIdx - width

<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; end = currIdx + width<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; pattern&#39; cnt !result<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; | cnt &gt; end = result<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; | otherwise = if (currR cnt row)<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; then pattern&#39; (cnt + 1) (result * 2 + 1)

<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; else pattern&#39; (cnt + 1) (result * 2)<br>&nbsp;&nbsp;&nbsp;&nbsp; makeNewRow :: ST s (ST.STUArray s Int Bool)<br>&nbsp;&nbsp;&nbsp;&nbsp; makeNewRow =<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; do<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; arr &lt;- ST.newArray_ (0, upper)<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; let fill idx

<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; | idx &gt; upper = return ()<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; | otherwise = do<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; unsafeWrite arr idx (unsafeAt rules (getRule idx))<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; fill (idx + 1)<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; fill 0<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; return $! arr

<br>in fromUArray (ST.runSTUArray makeNewRow)<br><br>fromUArray produces a new Ring (i.e. CA) from an array. &#39;makeNewRow&#39;<br>iterates over every cell in the current row using getRule to get the<br>new value for each cell and returns the new row as an array. getRule

<br>essentially treats the neighbors of the current cell as bits, with the<br>most significant to the left. An index into the rules array is<br>constructed based on the values around the cell being examined (which<br>wrap on the ends, thus the Ring construct). That index is used to get

<br>the value of the new cell from the rules array.<br><br>Profiling shows that the following lines take up the most execution<br>and allocation:<br><br>makeNewRow = ... -- 20.5% execution,&nbsp;&nbsp;26.7% allocation<br>if (currR cnt row) ... -- 

14.7% execution, 26.6% allocation,&nbsp;&nbsp;in pattern&#39;<br>currR ... -- 14.7% execution, 0% allocation<br><br>Any suggestions for improving this code? Thanks in advance!<br><br>Justin<br><br>p.s. The entire program is attached. Compile with ghc -O2

<br>-funbox-strict-fields -fbang-patterns --make GA-CA.hs. It runs 25<br>rules on each of 25 initial CAs for 2 generations.<br>p.p.s. On the bright side, this program has excellent memory<br>performance. It uses a constant 2 - 7 MB depending on the initial

<br>parameters for the entire run. Beautiful!<br><br>_______________________________________________<br>Haskell-Cafe mailing list<br><a href="mailto:Haskell-Cafe@haskell.org">Haskell-Cafe@haskell.org</a><br><a href="http://www.haskell.org/mailman/listinfo/haskell-cafe">

http://www.haskell.org/mailman/listinfo/haskell-cafe</a><br><br><br></blockquote></div><br>