[Haskell-cafe] I just don't get it (data structures and OO)

Sun Jun 3 07:58:27 EDT 2007

hi there,

> Let's say i have a deep nested data structure.
> Universe containing galaxies, containing solar systems, containing 
> planets, containing countries, containing inhabitants, containing 
> ...whatever.
> 
> Using the OO paradigm, once i get a reference to an inhabitant, i can 
> update it quite easily (say by changing it's age), and that's the end of it.

but that approach limits you to about one choice for modelling your 
problem: a big ball of interlinked updateable references. every time
you update anything, you update everything, because that is all there is.
so you've reached "the end of it" for your modelling process before
you've even started to think about how best to model your problem.

haskell does have updateable references, so you could replicate that
approach, though i'd suspect there'd be very few cases where this
could actually be recommended. so here are two alternative, related
approaches, to give you an idea of what is possible:

data Universe = U [Galaxy] deriving Show
data Galaxy = G [SolarSystem] deriving Show
data SolarSystem = S [Planet] deriving Show
type Planet = String

u = U [G [S ["mercury","venus","earth","mars"]]] -- it's a small world

a) work inside the outer structures, by moving the work inwards, 
    to the target

traverseU g traverse (U gs) = U (here++(traverse this:there))
    where (here,this:there) = splitAt g gs
traverseG s traverse (G ss) = G (here++(traverse this:there))
    where (here,this:there) = splitAt s ss
traverseS p traverse (S ps) = S (here++(traverse this:there))
    where (here,this:there) = splitAt p ps

so, to work on a planet, you'd visit its solar system, then do the
work there:

workOnPlanet [g,s,p] work = traverseU g (traverseG s (traverseS p work))

b) work through the outer structures, by lifting target accessors 
outwards

liftUget g get (U gs) = get (gs!!g)
liftGget s get (G ss) = get (ss!!s)
liftSget p get (S ps) = get (ps!!p)

liftUset g set (U gs) =  U (here++(set this:there))
    where (here,this:there) = splitAt g gs
liftGset s set (G ss) =  G (here++(set this:there))
    where (here,this:there) = splitAt s ss
liftSset p set (S ps) =  S (here++(set this:there))
    where (here,this:there) = splitAt p ps

getPlanet get [g,s,p] = liftUget g (liftGget s (liftSget p get))
setPlanet set [g,s,p] = liftUset g (liftGset s (liftSset p set))

now, if 'get'/'set' work on a planet, then 'getPlanet get [g,s,p]'/
'setPlanet set [g,s,p]' work on a specific planet in a universe.

main = do
  print $ workOnPlanet [0,0,2] ("HERE: "++) u
  print $ getPlanet id [0,0,2] u
  print $ setPlanet (const "EARTH") [0,0,2] u

while (b) may be closer to the 'reference into everything' you're used
to, (a) is usually nicer and more efficient, because it nests larger chunks
of work instead of nesting each and every small step. both styles are in 
common use in haskell, though get&set are often merged into one.

there are more interesting aspects to this, eg, how exactly to model
the data structures (lists are a useable default, but not always the best
choice), or how to share substructures not affected by updates, or 
how to make traversal more flexible, so that we can change traversal
directions on the fly, or how to extract the commonalities in the data 
structure interfaces, so that we only have to write that traverse or 
lifting code once, and perhaps generically over the different types
addressed by different paths, etc. but since all of those will no 
doubt be addressed in other replies, i thought a simple start might 
be best;-)

hth,
claus