[Haskell-beginners] first open source haskell project and a mystery to boot

Brent Yorgey byorgey at seas.upenn.edu
Thu Oct 13 18:18:40 CEST 2011


On Wed, Oct 12, 2011 at 11:59:30AM -0700, Alia wrote:

> --------------------------------------------------------------------
> -- Testing Area
> --------------------------------------------------------------------
> outlook s
>     | s == "sunny"    = 1
>     | s == "overcast" = 2
>     | s == "rain"     = 3
> 
> temp :: (Real a, Fractional n) => a -> n
> temp i = (realToFrac i) / (realToFrac 100)
> 
> humidity :: (Real a, Fractional n) => a -> n
> humidity i = (realToFrac i) / (realToFrac 100)
> 
> 
> windy x
>     | x == False = 0
>     | x == True  = 1
> 
> -- attributes
> a1 = Discrete outlook
> a2 = Continuous temp
> a3 = Continuous humidity
> a4 = Discrete windy
> 
> outlookData  = ["sunny","sunny","overcast","rain","rain","rain","overcast","sunny","sunny","rain","sunny","overcast","overcast","rain"]
> tempData     = [85, 80, 83, 70, 68, 65, 64, 72, 69, 75, 75, 72, 81, 71]
> humidityData = [85, 90, 78, 96, 80, 70, 65, 95, 70, 80, 70, 90, 75, 80]
> windyData    = [False, True, False, False, False, True, True, False, False, False, True, True, False, True]
> outcomes     = [0,0,1,1,1,0,1,0,1,1,1,1,1,0]
> 
> d1 = zip outlookData outcomes
> d2 = zip tempData outcomes
> d3 = zip humidityData outcomes
> d4 = zip windyData outcomes
> 
> t1 = id3 [a1] d1
> t2 = id3 [a2] d2
> t3 = id3 [a3] d3
> t4 = id3 [a4] d4
> 
> --t5 = id3 [a1,a2,a3,a4] [d1,d2,d3,d4] 
> -- doesn't work because you can't mix strings and numbers in a list
> -- 

Even if you could mix strings and numbers in a list, this still
wouldn't work, because [d1,d2,d3,d4] has the wrong type: d1, d2, etc.
are each lists of pairs, so [d1,d2,d3,d4] is a list of lists of pairs,
whereas your other calls (e.g. id3 [a1] d1) pass a plain list of pairs.
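To make the shape problem concrete, here is a small sketch using only the first two rows of your data. The type signatures are my assumption (I picked Double and Int for the numeric literals); nothing here depends on id3 itself.

```haskell
-- First two rows of tempData / humidityData, paired with their outcomes.
-- The concrete types Double and Int are assumptions for illustration.
d2 :: [(Double, Int)]
d2 = zip [85, 80] [0, 0]

d3 :: [(Double, Int)]
d3 = zip [85, 90] [0, 0]

-- d2 and d3 have the same type, so they can share a list, but the
-- result is a list of lists of pairs, not a list of pairs.
nested :: [[(Double, Int)]]
nested = [d2, d3]

-- First two rows of outlookData, paired with the same outcomes.
d1 :: [(String, Int)]
d1 = zip ["sunny", "sunny"] [0, 0]

-- mixed = [d1, d2]   -- rejected by the type checker: String vs Double
```

So there are two separate obstacles: the element types differ between d1 and d2, and even where they agree the outer list adds an extra level of nesting.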

I think what you really want is to combine all the data for each
observation into a single structure.  Something like this:



import Data.List (zipWith4)  -- zipWith4 is not in the Prelude

data Item = Item String Double Double Bool

outlook (Item "sunny" _ _ _) = 1
outlook (Item "overcast" _ _ _) = 2
outlook (Item "rain" _ _ _) = 3

temp (Item _ i _ _) = (realToFrac i) / (realToFrac 100)

humidity (Item _ _ i _) = (realToFrac i) / (realToFrac 100)

windy (Item _ _ _ False) = 0
windy (Item _ _ _ True)  = 1

-- attributes
a1 = Discrete outlook
a2 = Continuous temp
a3 = Continuous humidity
a4 = Discrete windy

outlookData  = ["sunny","sunny","overcast","rain","rain","rain","overcast","sunny","sunny","rain","sunny","overcast","overcast","rain"]
tempData     = [85, 80, 83, 70, 68, 65, 64, 72, 69, 75, 75, 72, 81, 71]
humidityData = [85, 90, 78, 96, 80, 70, 65, 95, 70, 80, 70, 90, 75, 80]
windyData    = [False, True, False, False, False, True, True, False, False, False, True, True, False, True]
outcomes     = [0,0,1,1,1,0,1,0,1,1,1,1,1,0]

d = zip (zipWith4 Item outlookData tempData humidityData windyData) outcomes

t1 = id3 [a1] d
t2 = id3 [a2] d
t3 = id3 [a3] d
t4 = id3 [a4] d

t5 = id3 [a1,a2,a3,a4] d


Now t5 works just fine.
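If you want to see just the data-combining step in isolation, here is a minimal standalone sketch that does not need your id3 module. It uses the first three rows of your data. The record field names and the name `observations` are hypothetical, chosen only for illustration, and the Int label type is an assumption.

```haskell
import Data.List (zipWith4)

-- Record version of Item; the field names are illustrative, not from
-- the original code.
data Item = Item
  { itemOutlook  :: String
  , itemTemp     :: Double
  , itemHumidity :: Double
  , itemWindy    :: Bool
  } deriving (Show, Eq)

-- One labelled observation per row, built from the per-column lists.
observations :: [(Item, Int)]
observations = zip (zipWith4 Item outlooks temps humidities windies) labels
  where
    outlooks   = ["sunny", "sunny", "overcast"]
    temps      = [85, 80, 83]
    humidities = [85, 90, 78]
    windies    = [False, True, False]
    labels     = [0, 0, 1]
```

One thing to watch: zip and zipWith4 both stop at the shortest input list, so if one column is missing a value the extra rows are silently dropped rather than reported as an error.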

-Brent


