HAppS tutorial

From HaskellWiki
Revision as of 22:04, 6 August 2007 by Tphyahoo (talk | contribs)
Jump to navigation Jump to search


HAppS is a framework for developing Internet services quickly, deploying them easily, scaling them massively, and managing them effortlessly. Web, persistence, mail, DNS and database servers are all built-in so you can focus on app development rather than integrating and babysitting lots of different servers/services (the Haskell type system keeps everything consistent).

Installing

To install HAppS the following packages are needed:

The following additional packages are required for the current darcs version:


The quick way to see what's missing is to get the darcs repository, change into that directory, and run runghc Setup.hs configure. If you don't get an error, try runghc Setup.hs build and then as root runghc Setup.hs install.

Note: If you're building the darcs version of HAppS you'll need to use Cabal 1.1.7+.

Overview

The application model in HAppS is to help separate state, application logic, wire formats, protocols, and presentation layer:

State

State is just a haskell data type you define (deriving (Read, Show, Typeable)). If you have several pieces of state, you'll probably want to define the state as a Haskell record of these pieces.

ACID

Atomicity: Guarantees that every single one of the operations is successfully performed or none of them are. This prevents unfinished operations in the system.

Consistency: Guarantees that the system is in a valid state before and after every operation. This is enforced by Haskell's type system.

Isolation: Guarantees that all operation happen isolated from the other operations in the system. This means that outsider operations will never see operational values while they're still being processed.

Durability: Guarantees that after an operation has succesfully exited, it's value will remain in the system as long as needed (it will not dissappear in the case of system failure). This is handled by MACID write-ahead logging and check-pointing.

Application

Incoming events are gathered in individual haskell threads and then pushed onto a single application queue for processing. The queue model gives you ACID Atomicity and Isolation and lets your app be simply a set of functions with types like:

SomeInputType -> MACID SomeOutputType


The MACID monad lets you update your state and *schedule* side-effects. To be clear, MACID is not in the IO monad so you cannot execute side effects, you can only schedule them. The framework takes care of making sure they are executed at-least-once (if they can be completed by a deadline you specify).

Wire formats

Since your app consists of a set of functions with various haskell input and output types, somewhere you need a place to convert between those internal haskell types and external protocol event types; e.g. from URL Encoded HTTP requests to SomeInputType and from SomeOutputType to XML encoded HTTP responses.

Protocols

HAppS currently provides support for HTTP Requests/Responses and SMTP Envelopes. To be clear HAppS provides ACID Atomicity at the protocol event level. So if you write a protocol with SMTP envelopes being the arriving event type then your app will have atomicity in processing incoming SMTP envelopes. If you write a protocol with SMTP commands being the arriving event type, then your app will have atomicity at the level of individual smtp commands.

Presentation

If your application outputs XML as its wire format, HAppS provides a lot of support for using XSLT to transform it for presentation purposes. For example, you can send XML mail and HAppS will take care of applying the relevant XSLT stylesheet before it is delivered. If you output XML HTTP responses, HAppS takes care of applying the XSLT stylesheet server side for user-agents that don't support doing so on the client. The value here is that you can have designer types who know XSLT modify presentation stuff without touching your application code.

First-step examples

This chapter will run you through some first simple programs written in HAppS. For other programs have a look at the directory named 'examples'.

First of all, default HAppS applications run their own webserver on port 8000, so you probably want to try out these examples at http://localhost:8000/

If you'd rather access these applications on some other port, use ./myapp --default-port=8001 obviously substituting the name of your binary for myapp.

How to build these examples

Cut'n'paste this into a file named Hello.hs and run ghc --make Hello.hs -o hello to compile and then ./hello to execute the resulting binary.

Simple stateless examples

Hello World

import HAppS
import HAppS.Protocols.SimpleHTTP2

helloWorld () () = respond "Hello World"

main = stdHTTP
       [
        debugFilter -- we want to be able to see debug messages in the console
       ,noState -- our application has no state
       ,h () GET $ ok helloWorld -- GET / returns "HTTP/1.0 200 OK\nContent-Type: text/html; charset=utf-8\n\nHello World"
       ]

Handlers are functions that produce either a request or a response. stdHTTP runs forward through the list of handlers transforming requests into requests until it hits a handler that produces a response. It then runs backward up the list transforming responses into responses.

debugFilter actually consists of two handlers, one that prints the request to console and then returns it and another that prints the response to console and then returns it. It is defined in HAppS.Protocols.SimpleHTTP2 as

debugFilter = multi [Handle (\req -> (debugM $ show req) >> debugM "\n" >> request req)
                    ,ModResp (\res -> return (debugM "\n" >> res >>= debugM . show >> debugM "\n=======\n" >> res))]

h is a wrapper around Handle that simplifies matching on uris and methods and structuring responses. It only executes the handler if the URI matches the regex in its first argument and the method specification in its second. A "^" is automatically added to the URI because that is the 99% case.

Notice in this example that any request other than GET / will produce an error!

Add "val" for simplicity

The concept of just returning a value is so common that we defined a function "val" so you don't have to define a function just to return a simple value.

import HAppS
import HAppS.Protocols.SimpleHTTP2

main = stdHTTP
       [
        debugFilter -- we want to see debug messages in the console
       ,noState -- our application has no state
       ,h () GET $ ok $ val "Hello world" -- any request will return "Hello world"
       ]

Methods and paths

URI arguments to h are by default regular expression strings to which ^ is prepended and (.*) is appended.

Method arguments can be individual methods, lists of methods or () to mean all methods.

import HAppS
main = stdHTTP $ debugFilter : -- we want to see debug messages in the console
                  noState : -- our application has no state
       [
         h [""] GET $ ok $ val "Hello World"
        ,h ["getPost"] [GET,POST] $ ok $ val "either GET or POST will result in this response"
        ,h (Prefix ["dir"]) () $ ok $ \lst () -> respond (unwords lst) -- any method to /dir/sub/dir  will return "sub dir"
        ,h ["methods"] () $ ok $ val "Hello" -- any method to /methods will return "Hello"
        ,h () () $ ok $ val "default" -- any method and any reqURI not matched above gets this

       --these two are automatically added by stdHTTP so you don't have to unless you want to override
       --notice that the responses are not "ok" they are notFound and notImplemented!
        ,h () [GET,POST] $ notFound $ val "not found"
        ,h () () $ notImplemented $ val "not implemented"
       ]

Simple file serving

import HAppS -- 0.8.4
import HAppS.Protocols.SimpleHTTP2 -- 0.8.8

main = stdHTTP
       [ debugFilter -- we want to see debug messages in the console
         , noState  -- our application has no state
         , h [""]   GET $ ok $ val "GETting root hello"
         --, h (Prefix ["s"]) GET $ respIO $ fileServe staticPath -- 0.8.4
         , hs (Prefix ["s"]) GET $ basicFileServe staticPath -- 0.8.8
       ]

Note that to try this out with some static files you should create a directory named "static" in the directory where you are running the tutorial code, and put any files you wish to serve in there.

Block dot files

But observe that we don't want to serve all paths in the filesystem. So we want to preempt certain requests that reach the fileServe line:

Now we observe that we actually want to block dot files as well so we do. (There's probably a nicer way to do this using regex). Notice that the fileServe code actually does IO. So you can write responses that do IO. Conceptually you can serve content out of an external database or a proxy server.

import HAppS
main = stdHTTP $ debugFilter : -- we want to see debug messages in the console
                  noState : -- our application has no state
       [
        h [""] GET $ ok $ val "GETting root hello"
       ,h (Prefix ["s"]) GET  $ forbidden $ \path () -> if isDot path then respond "Dot files not allowed" else pass
       , h (Prefix ["s"]) GET $ respIO $ fileServe staticPath
       ]

isDot name = (head name) == '.'

hs let us consolidate these. SimpleHTTP2 defines basicFileServe as (NOTE: currently basicFileServe is not available):

   basicFileServe staticPath path meth= multi
    [
     ,h path      meth $ forbidden $ \path req -> if isDot path then respond "Dot files not allowed" else request req
    ,h path       meth $ fileServe2 mimeTypes staticPath
    ]

We can then use this in our application using "hs" to call a function that produces a list of handlers:

  main = stdHTTP $ debugFilter : -- we want to see debug messages in the console
                  noState : -- our application has no state
    [
     h  [""]            GET $ ok $ val "GETting root hello"
     hs (Prefix ["s"])  GET $ basicFileServe staticPath
   ]

Saved state examples

Getting the URL itself

Now lets add some state and a function that does something with state. Notice that we now get rid of the noState directive. In this example, we write an instance for FromReqURI that tries to read the next part after the url as an value of type Int.

{-# OPTIONS -fglasgow-exts -fth #-}
import HAppS
import HAppS.Protocols.SimpleHTTP2

import Data.Monoid
import Data.Typeable
import Control.Monad.State (get, put)

data MyState = MySt { appVal :: Int } deriving (Read, Show, Typeable)

instance Serialize MyState where
  encodeStringM = defaultEncodeStringM
  decodeStringM = defaultDecodeStringM

instance Monoid MyState where
  mempty = MySt 0
  mappend (MySt x) (MySt y) = MySt (x+y)

$(inferStartState ''MyState) -- boilerplate that will eventually be SYB

instance FromReqURI [String] MyState where
  fromReqURI expr uri = do
    [val] <- fromReqURI (Prefix expr) uri
    fmap MySt $ mbRead val

main = stdHTTP
       [debugFilter -- we want to see debug messages in the console
       ,h [""] GET $ ok $ val "GETting root hello"
       -- /val shows us the current value
       ,h ["val"] GET $ ok $ \() () -> do (MySt val) <- get; respond (show val)
       -- /set/56 would set the value to 56
       ,h ["set"] GET $ ok $ \newVal () -> do put newVal; respond ("New value is " ++ show newVal)
       ]

Getting from a POST'd value

{-# OPTIONS -fglasgow-exts -fth #-}
import HAppS
import HAppS.Protocols.SimpleHTTP2

import Data.Monoid
import Data.Typeable
import Control.Monad.State (get, put)

data MyState = MySt { appVal :: Int } deriving (Read, Show, Typeable)

instance Serialize MyState where
  encodeStringM = defaultEncodeStringM
  decodeStringM = defaultDecodeStringM

instance Monoid MyState where
  mempty = MySt 0
  mappend (MySt x) (MySt y) = MySt (x+y)

$(inferStartState ''MyState) -- boilerplate that will eventually be SYB

instance FromMessage MyState where
  fromMessageM m = do
    val <- maybeM $ lookMbRead m "val"
    return $ MySt val

main :: IO ()
main = stdHTTP
       [debugFilter -- we want to see debug messages in the console
       ,h [""] GET $ ok $ val "GETting root hello"
       ,h (Prefix ["s"]) GET $ respIO $ fileServe staticPath
       -- /val shows us the current value
       ,h ["val"] GET $ ok $ \() () -> do (MySt val) <- get; respond (show val)
       -- /set with the POST data "val"=56 would set the value to 56
       ,h ["set"] POST $ ok $ \() newVal -> do put newVal; respond ("New value is " ++ show newVal)
       -- The first one is FromReqURI and the second one is FromMessage
       ]

This example reads POST data, such as from the following HTML form:

<html>
<head><title>HAppS POST example</title></head>
<body>
<form method="POST" action="http://localhost:8000/set">
<input name="val">
<input name="Submit" type="submit">
</form>
</body>
</html>

Haskell to XML with ToElement, XML to Haskell with FromMessage

HAppS supports turning Haskell values into XML values with the ToElement typeclass.

(this example needs to be trimmed)

{-# OPTIONS -fglasgow-exts -fth #-}
module Main where 

import HAppS
import Data.Typeable
import Control.Monad.State

data Something = Thing { appVal :: String } deriving (Read, Show, Typeable)
$(inferStartState ''Something)

-- to return Something values as xml, implement ToElement
instance ToElement Something where toElement = textElem "something" [] . show . appVal
-- to turn 
instance FromMessage Something where fromMessageM m = maybeM $ lookMbRead m "something" >>= return . Thing

exampleGetVal x () = do (Thing y) <- get; respond $ Thing (x++y)
examplePostVal () (Thing x) = modify (\ (Thing y) -> Thing (y++x)) >> get >>= respond

-- Ambiguous type variable `st' in the constraints: `StartStateEx st st' ... `Serialize st'
-- Means your code isn't specific enough for GHC to infer a type
-- You need to either specify a type, or have other cases that require more specific types.
examplePostVal' () (Thing x) = respond $ show x

exampleHello () () = respond "Hello world"


main = stdHTTP $ debugFilter : -- we want to see debug messages in the console
       [
        hs clientPath  GET $ basicFileServe staticPath
       ,h "/val" POST $ ok plain_xml examplePostVal -- $ val "the request ends with plain"
       ,h "/status" GET $ ok plain_xml examplePostVal'
       ,h () () $ ok plain $ val "fallthrough" -- any request will return "Hello"
       ]

XML with style

We can format the status output of the example above with XSLT. Save the following xsl source as style.xsl in a 'static' subdirectory.

<?xml version="1.0"?>
<!DOCTYPE xsl:stylesheet PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "libxslt/xslt2.dtd">

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0"
xmlns:html='http://www.w3.org/TR/REC-html40'>
  <xsl:template match="/">
    <html>
      <head>
      </head>
      <body>
      <!-- template goes here -->
      </body>
    </html>
  </xsl:template>
</xsl:stylesheet>

Where the comment 'template goes here' is located:

       1. view source to see server output.
       2. make templates for each of your output types.
       3.use the xsl lib that handles all sorts of standard template issues to make it all nice!
       (The value of the element <something> is <xsl:value-of select="something"/>)


And then change one line and add one line for the example code above.


        ,h "/status" GET $ ok plain_xml examplePostVal'

to

        ,h "/status" GET $ ok xml exampleDumpVal

and add this to get the state and dump it to the browser:

exampleDumpVal () () = get >>= respond

Redirection

Next we'll demonstrate redirection by creating a static error page and redirection to it in the fallthrough case.

You can use this shockingly complex error page. <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"> <html>

 <head>
   <title>Error!</title>
 </head>
 <body>
   There's been an ERROR!
 </body>

</html> and then change one line in the example code above. ,h () () $ ok plain $ val "fallthrough" becomes

       ,h () () $ seeOther plain $ val ("/s/error.html","hmm") -- when all else fails, complain

Send email

Sending email is straightforward. Create the message value, then hand it to send, which first tries to send it via SMTP_RELAY then directly. Alternatives are the more specific autoSend for direct delivery, envSend for relay via SMTP_RELAY, and sendHost host port for relay via a given server. Often in fear of spam, the recipient will accept your message only if you relay it via your ISP's outgoing SMTP server.

import HAppS
myenvelope = Envelope {
                       relay = "localhost", -- your name at HELO, not the recipient's!
                       sender = Address "tutorial-reader" "happs.org",
                       recipients = [Address "shae.erisson" "gmail.com"],
                       contents = "\r\nHello shapr!"
                      }
main = send myenvelope

The contents field of the Envelope is the actual message, which should conform to the RFC 2822 SMTP standard. That is, the message body is preceded by header lines and a blank line (lines separated by "\r\n"). According to the standard, Date and From lines are obligatory. Other lines that probably should be in the messages are To and Subject. For attachments, and messages that are not plain US-ASCII text, see the RFC 2046 MIME standard.

If the SMTP server is temporarily unavailable or uses graylisting, the message should be saved for a retry later. Sending the message using queueMessage (uses send) achieves this.

Get and set cookies

At some point you'll want to get and set cookies in the browser.

To set a cookie, modify the response as in setsomecookie. To get a cookie, look in the request as in showallcookies.

To see output from this example, first go to /setcookie, and then to /showcookie.

{-# OPTIONS -fglasgow-exts -fth #-}
module Main where 

import HAppS
import Data.Typeable
import Data.Maybe
import Control.Monad.State

main = stdHTTP $ debugFilter :
       noState :
       [
        hs clientPath  GET $ basicFileServe staticPath
       ,h ["hello"] GET $ ok $ val "Hello World"
       ,h ["setcookie"] GET $ setsomecookie -- sets a cookie
       ,h ["showcookie"] GET $ ok showallcookies
       ,h () () $ ok $ val "fallthrough"
       ]

exampleHello () () = respond "Hello World"

setsomecookie () () 
    = do resp <- ok (val "cookie now set") () ()
         return $ liftM (testCookie =<<) resp

showallcookies () req
    = respond $ allcookies req

testCookie :: Monad m => Result -> m Result
testCookie = setCookieEx maxBound $ Cookie "1" "/" "" "cookieName" "cookieValue"

allcookies    :: Request -> String
allcookies rq = unlines $ map show $ concat (getCookies rq)

Sessions

Sessions are much like

autoexpire state, and hook that expire event.

something can happen when state expiresxp

Blocking IO

There's a way to do blocking IO within HAppS, ...

How to structure your handlers?

As we have discussed before, HAppS programs are structured as lists of server parts that handle events. The type system stops you from making some mistakes, but the types involved are somewhat complex to understand at first. You can get a long way by re-using the example code, but here are some tips for when it gets more elaborate. We've already seen examples of nesting with 'multi' and 'hs'.

ServerPart

data ServerPart m req m' res
    = Handle (req -> m (Either req (m' res)))
    | ModResp (m (res -> m' res))
    | Multi [ServerPart m req m' res]

stdHTTP :: (MonadIO m, StartStateEx st st, Serialize st) 
           => [ServerPart (Ev st Request) Request IO Result] -> m ()

The list you give to stdHTTP is a list of ServerParts. Each ServerPart is a Handle which takes a Request, a ModResp which post-processes a further Result, or a Multi which nests more ServerParts. Of course the ServerParts can be constructed by functions so you don't have to write them manually.

The normal way to construct a Handle is 'h' which makes the handler list useful by choosing which requests are handled by which handlers. For ModResp, there's a corresponding 'hOut'. 'multi' isn't very useful, it just lets you define a part as a list of parts. 'multiIf' works more like 'h', letting you filter the requests that reach the nested list.

h

h :: (Monad m, Monad m', FromMessage req_msg, FromReqURI uri_match uri_msg, MatchMethod req_match)
     => uri_match -> req_match
        -> (uri_msg -> req_msg -> m (Either Request (m' res)))
        -> ServerPart m Request m' res

When you use 'h' to construct a ServerPart, you give it two expressions: one for matching the URI paths and another for matching the HTTP methods that you want the handler to handle. The third parameter is the handler itself: it gets one argument based on the URI and another based on the query parameters. The type classes FromMessage, FromReqURI and MatchMethod ensure that there are various types uri_match, req_match, uri_msg and req_msg can take, and you can define more.

hs

hs :: path -> method -> (path -> method -> res) -> res

basicFileServe :: (FromReqURI urimatcher [String],
 HAppS.Protocols.SimpleHTTP2.MatchMethod matcher,
 Monad m) 
 => String -> urimatcher -> matcher -> ServerPart m Request IO Result

As can be seen from the extremely generic type, 'hs' doesn't really do anything. From the earlier examples we have learned that it lets us include in our part list a part which does its own filtering and message decoding, such as basicFileServe, so that it looks like a call to 'h'. It doesn't do any actual request filtering, it just passes the filtering expressions to the part.

multiIf

multiIf :: (MatchMethod req_match, FromReqURI uri_match (), Monad m, Monad m')
        => uri_match -> req_match
        -> [ServerPart m Request m' res]
        -> ServerPart m Request m' res

An interesting variation of 'h', 'multiIf' takes URI path and HTTP method matching expressions, and excludes non-matching requests from a nested list of server parts.

For example, HPaste includes a ServerPart called adminSubsystem in its source code. It implements the URI paths that start with /admin and doesn't affect other URIs:

adminSubsystem accounts
  = multiIf (Prefix ["admin"]) [GET,POST]
      [hs ()                ()   $ basicAuth "Admins Only" accounts
      ,h ["admin","delete"] GET  $ ok       handleGetDelete
      ,h ["admin","delete"] POST $ seeOther handlePostDelete
      ,h ()                 ()   $ ok       handleAdminIndex
      ]

Here the effect of multiIf is that 'basicAuth' and 'handleAdminIndex' aren't applied to all requests. Notice how "admin" still needs to be included in the delete handlers, since 'multiIf' only filters.

runServerParts

runServerParts :: (Monad m', Monad m) 
                  => [ServerPart m request m' response]
                     -> request
                     -> m (Either request (m' response))

Lists of ServerParts are normally run with 'runServerParts'. It can be used to process a nested list of parts in a custom handler.

For example, here's a handler that sets the implicit parameters ?root and ?time to be available within all of handlers in 'myApp':

h (Prefix ()) () $ \(path::[String]) req -> do
    time <- getTime
    let root = concatMap (const "../") $ drop 1 path
    let ?time = time; ?root = root in runServerParts myApp req

(Implicit parameters are useful when you don't want to pass the parameters explicitly in a deep hierarchy of function calls. 'getTime' needs to be run in the monad but ?time is available in pure functions. ?root provides the right amount of '..' needed in the beginning of link URI references to get to the root directory of myApp even when the HTTP server hosts more than one application, perhaps even behind an Apache reverse proxy.)

How to deploy your application?

You need a server which has enough RAM to hold the state of your HAppS application and while they are transmitted the requests and responses. Of course you need to be able to run HAppS as a continuous background process on the server. Typically this is achieved by starting 'screen' and starting HAppS within that.

If you compile on the server, you need to have the compiler and all the required libraries installed, and the compiling and linking process easily takes a hundred megabytes of extra RAM. If you have a work station computer which is similar to the server, you can compile on the work station and upload the resulting binary to the server. Note that you don't necessarily need root (administration) privileges on the server, as HAppS can use unprivileged ports, such as the port 8000 by default. Similarily, you can install the compiling environment to your home directory by setting prefix as that directory.

At least if you want to run more than one HAppS process on one server, you need to change the default port numbers. This is done with for example './myApp --http-port=8001'.

Backgrounding

If you want to run the application in the background without 'screen', you still need a way to provide it with standard input as it waits for an 'e' there before it exits. One way to accomplish this is to background a shell pipeline which sleeps (practically) forever before outputting the 'e':

bash -c '((sleep 1000000000;echo e) | ./myApp) &' </dev/null &>/dev/null

Now to shut down the app, you can simply kill the sleep process, which causes the 'e' to be input to the app.

Apache reverse proxy

If you want more behind port 80 on a server than a single HAppS process, you can run Apache as the HTTP server on port 80, and reverse proxy the HAppS process(es).

Allow public reverse proxying of the real location of the HAppS process:

<IfModule mod_proxy.c>
  <Proxy http://localhost:8000/>
    Allow from all
  </Proxy>
</IfModule>

Configure a path prefix to correspond to the HAppS process:

<Location /myhappsapp/>
        <IfModule mod_proxy.c>
                ProxyPass http://localhost:8000/
                ProxyPassReverse http://localhost:8000/
        </IfModule>
</Location>

In your HAppS app, you need to take care of the links in HTML being interpreted correctly by web browsers. The links either need to be all relative, or the absolute paths need to include "/myhappsapp/". However, you needn't worry about URIs in HTTP headers such as a redirect Location, as ProxyPassReverse takes care of those.