[Haskell-beginners] Question about data structures

Wed Nov 24 20:47:34 EST 2010

Here's a simple example that arises in the KenKen problem.

Let's assume that instead of keeping for each cage the Operation and target
number we pre-compute all the combinations of values that will satisfy the
cage.  For example, assume that in a 4x4 game, we have a cage that refers to
cells C1 and C2 and that the cage requires those two cells to produce a
value of 2 using division, Pre-computing the possibilities, we store [(1,
2), (2, 1), (2, 4), (4, 2)] in this cage.

The cage points to the two cells it constrains, and each cell points to the
cage that constrains it.

When a cell (say C1) gets a value of (say) 2, we want to change the cage to
be [(2, 1), (2, 4)]. But when that happens, it's necessary to change cell C2
to point to the new cage -- and of course since the cell was changed the
construct that held that cell has to be changed also.

It's a bit of a pain to write the code to do all that.  This isn't to say
that the code will take that long to run or that it will use excessive
memory -- only that it feels like one is being forced to write code that one
shouldn't have to write.
*
-- Russ *

On Wed, Nov 24, 2010 at 3:11 PM, Russ Abbott <russ.abbott at gmail.com> wrote:

> Thanks, I'm still struggling with it. But here is what it looks like now.
>
> I'm working on a KenKen solver. One issue I'm having is that I'm not sure
> how many types to declare -- where by types I mean a "type X = ..."
> declaration.  The more there are the better documented the code, but the
> more there are the harder it seems to be to remember what they all are.
>  I've pasted the current version of my declarations below.  By way of
> explanation, here's the representation strategy.
>
> The puzzle board is called a Grid. It consists primarily of a Map CellRef
> Cells. I'm using a Map instead of an array (list of lists) as a way to avoid
> rebuilding the list of lists of cells when a Cell gets a value.  A CellRef
> is (Int, Int) for row and column. A Cell contains its own CellRef along with
> a reference to the Cage that refers to it. It also has a place to put its
> actual value, an Int (called type CellValue).  (In the current version a
> Cell has one value. In a subsequent version it will be more like a
> FiniteDomain variable with a Set of still possible values.)
>
> A Grid also contains two other Maps. Each is of the type Map Int ValueSet.
> One of them represents the unused values for the rows; the other the unused
> values for the columns. A ValueSet is a Set of allowable values. If the
> puzzle is (size x size), its ValueSet will be Set.fromList [1 .. size].
>
> The other main structure is the puzzle presentation, which is called a
> KenKen. It is a list of Cages, where a Cage is: a target value (what the
> operation on the cells has to come out to), an Op (the Operation to be
> performed), and a list of CellRef (which are (row, col) references to
> Cells).
>
> When I give a Cell a value, I am forced to rebuild: the Map of row
> ValueSets, the Map of column ValueSets, the Map of Cells themselves, and the
> Grid that contains all these pieces. In this case, the pieces do change.
> (The cell changes since it gets a value; the candidates still left for the
> row and column also change since that value is no longer available.) But
> still it's a pain rebuilding it all.
>
> So perhaps the first question is whether there is a cleaner and simpler way
> to represent all this.
>
>
>
> type CellValueList = [CellValue]
>
> type CellRef   = (Int, Int)
>
> type CellValue = Int
>
> data Grid      = Grid { rowCandidatesSets :: CandidatesSets
>                              , colCandidatesSets :: CandidatesSets
>                              , gridCells :: (Map CellRef Cell)
>                              }
>
> type KenKen    = [Cage]
>
> type ValueSet  = Set CellValue
>
> type CandidatesSets = Map Int ValueSet
>
> data Cell      = Cell { cellRef :: CellRef
>                            , cage:: Cage
>                            , value:: CellValue
>                            } deriving (Eq, Show)
>
> row cell = fst (cellRef cell)
> col cell = snd (cellRef cell)
>
>
> data Cage      = Cage {target :: CellValue
>                                , op :: Op
>                                , cageCells :: [CellRef]
>                                } deriving (Eq, Show)
>
> data Op        = Add | Subtract | Multiply | Divide deriving (Show, Eq)
>
>
>
> *
> -- Russ *
>
>
>
> On Wed, Nov 24, 2010 at 2:19 PM, matthew coolbeth <mac01021 at engr.uconn.edu
> > wrote:
>
>> May we see a code sample of what you're describing?
>>
>> Generally, when you're writing haskell programs, you will not approach
>> the problem in the same way that you would appoach it in a language
>> with mutable objects.
>>
>> If you can provide a reasonably simple example of a computing task
>> that someone would want to perform,  I'm sure someone will be able to
>> show you how it would be best approached in haskell, and how you need
>> not write too many lines of code, provided that you use the proper,
>> idiomatic style.
>>
>> On 2010-11-24, Russ Abbott <russ.abbott at gmail.com> wrote:
>> > OK. So putting a Map at Leaf nodes doesn't solve the problem.
>> (Apparently
>> > I haven't been able to communicate what I see as the problem.)
>> >
>> > The problem that I'm trying to get to is the need to write excessive
>> code
>> > for something that would require a lot less code in an OO world.  It's
>> not a
>> > matter of execution time or space. It's a matter of the amount of code
>> one
>> > is required to write.
>> > *
>> > -- Russ *
>> >
>> >
>> >
>> > On Wed, Nov 24, 2010 at 1:52 PM, Daniel Fischer
>> > <daniel.is.fischer at web.de>wrote:
>> >
>> >> On Wednesday 24 November 2010 22:12:37, Russ Abbott wrote:
>> >> > Cool. I wasn't aware of that notation.  It doesn't quite get to the
>> >> > issue though.
>> >> >
>> >> > The problem I'm concerned about is the need to define y in the first
>> >> > place. If one is chasing through a data structure and finds a need to
>> >> > change something buried within it, it seems necessary to rebuild
>> >> > everything that includes the changed thing.
>> >>
>> >> In general, values are immutable, so you can't "change something buried
>> >> within it". You have to build a new value containing some of the old
>> stuff
>> >> and a new part. Building the new value usually consists mostly of
>> copying
>> >> a
>> >> couple of pointers (plus building the new part of course), so isn't too
>> >> expensive normally.
>> >>
>> >> You can have mutable values in the IO or (ST s) monads, if you need
>> them.
>> >>
>> >> > That is, I can't change a
>> >> > component of somethingNew without creating y. The point is there's
>> >> > nothing about x that changed,
>> >>
>> >> The thing with the changed component is not x anymore.
>> >>
>> >> > and there may be nothing about (var1 x)
>> >> > that changed, and there may be nothing about var11 . var1 $ x that
>> >> > changed, etc. Yet one is apparently forced to keep track of and
>> >> > reconstruct all those elements.
>> >>
>> >> The compiler takes care of that.
>> >>
>> >> >
>> >> > Another example is to imagine a Tree in which the leaves contain
>> >> > "objects." If I want to change a property of one of those leaf
>> objects,
>> >>
>> >> You can't in general, the thing with a different property is a
>> different
>> >> object.
>> >>
>> >> > I am forced to rebuild all the ancestor nodes of that leaf down to
>> >> > rebuilding the root.
>> >>
>> >> Yes (well, not you, the compiler does it), except if your tree contains
>> >> mutable objects (IORefs/STRefs for example).
>> >>
>> >> >
>> >> > One way to avoid that is for the leaves to refer to their objects
>> >> > through a Map. Then changing a leaf object requires only that the
>> value
>> >> > associated with key represented by the leaf be (re-)inserted into the
>> >> > Map.  The Tree itself need not change at all.
>> >>
>> >> Oh, it will. If you change a Map, you get a new one, thus you get a new
>> >> tree containing the new Map.
>> >>
>> >> >
>> >> > But that trick isn't always available.  In the example we are talking
>> >> > about we can't make a Map where they keys are the instance variable
>> >> > names and the values are their values.  That would seem to do the
>> job,
>> >> > but since the values are of different types, we can't create such a
>> Map.
>> >> >
>> >> > So now what?
>> >>
>> >> Well, what's the problem with the compiler copying some nodes?
>> >> Normally, that doesn't cost very much performance, if it does in your
>> >> case,
>> >> we'd need to know more to suggest the best way to go.
>> >>
>> >> > *
>> >> > -- Russ *
>> >> >
>> >>
>> >
>>
>>
>> --
>> mac
>>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://www.haskell.org/pipermail/beginners/attachments/20101124/b27af1fb/attachment.html