Difference between revisions of "High-level option handling with GetOpt"

From HaskellWiki
Jump to navigation Jump to search
 
m
(10 intermediate revisions by 7 users not shown)
Line 1: Line 1:
 
== Introduction ==
  +
 
I've used Haskell to create various command-line utitities for unix-like
 
I've used Haskell to create various command-line utitities for unix-like
 
systems. In the process I developed a simple yet powerful and flexible
 
systems. In the process I developed a simple yet powerful and flexible
 
technique for processing program options.
 
technique for processing program options.
 
What you can read below is my unfinished attempt at writing an article
 
about it. The current form is probably far from good, but I decided I'll
 
rather release it as it is than waste my effort.
 
 
I will highly appreciate your opinions and criticism for both technical and
 
literary side of this text.
 
 
== High-level technique for program options handling ==
 
 
=== Introduction ===
 
   
 
Sven Panne's GetOpt module does a fine job at parsing command line options. It
 
Sven Panne's GetOpt module does a fine job at parsing command line options. It
Line 21: Line 12:
   
 
(Documentation and example code using this module can be found at
 
(Documentation and example code using this module can be found at
http://www.haskell.org/ghc/docs/latest/html/libraries/base/System.Console.GetOpt.html)
+
http://www.haskell.org/ghc/docs/latest/html/libraries/base/System-Console-GetOpt.html)
   
 
However, I checked over half a dozen serious programs written in Haskell and
 
However, I checked over half a dozen serious programs written in Haskell and
Line 47: Line 38:
 
would be a pointer for interested users.
 
would be a pointer for interested users.
   
=== About typical use of GetOpt ===
+
== About typical use of GetOpt ==
   
 
In a typical use of GetOpt module (like in the example) there is a definition
 
In a typical use of GetOpt module (like in the example) there is a definition
of a sum datatype with data constructors corresponding to individual
+
of a sum [[type|datatype]] with data [[constructor]]s corresponding to individual command-line options. Options with no arguments are represented by nullary constructors, and options that must or can have arguments - by unary
command-line options. Options with no arguments are represented by nullary
 
constructors, and options that must or can have arguments - by unary
 
 
constructors. For example, below is a slightly modified example from GetOpt's
 
constructors. For example, below is a slightly modified example from GetOpt's
 
documentation:
 
documentation:
Line 84: Line 73:
 
list, some options have to be combined, etc, etc.
 
list, some options have to be combined, etc, etc.
   
=== Advertised technique ===
+
== Advertised technique ==
  +
===Imports for the full program===
 
( Because I wanted it to be a fully functional Literate Haskell program, here
+
( Because I wanted it to be a fully functional Literate Haskell program, here come the required imports )
come the required imports )
 
   
 
<haskell>
 
<haskell>
Line 99: Line 87:
 
import Char
 
import Char
 
</haskell>
 
</haskell>
  +
===Analysis of options typing===
 
 
Let's look at program options not from the command-line side, but rather from
 
Let's look at program options not from the command-line side, but rather from
 
program(mer)'s side. How to encode them? What would be the best way to present
 
program(mer)'s side. How to encode them? What would be the best way to present
Line 110: Line 98:
   
 
We need some kind of a fixed-size polytypic dictionary of option values - a
 
We need some kind of a fixed-size polytypic dictionary of option values - a
perfect application for Haskell's records.
+
perfect application for Haskell's [[record]]s.
   
 
<haskell>
 
<haskell>
Line 132: Line 120:
 
</haskell>
 
</haskell>
   
  +
===Threading the options through the program===
 
But in the end we would like to get the record reflecting the options given to
 
But in the end we would like to get the record reflecting the options given to
 
the program. We do this by threading the Options record through functions
 
the program. We do this by threading the Options record through functions
Line 140: Line 129:
 
I won't use a pure function with type (Options -> Options), but rather an
 
I won't use a pure function with type (Options -> Options), but rather an
 
effectful function in the IO Monad, because I want to easily perform side
 
effectful function in the IO Monad, because I want to easily perform side
effects during option parsing (here only in 'verbose' and 'help' options, but I
+
effects during option parsing (here only in 'version' and 'help' options, but I
 
could also check validity of input and output files during option processing).
 
could also check validity of input and output files during option processing).
 
You may prefer to use a pure function, a State+Error monad, a State+IO monad,
 
You may prefer to use a pure function, a State+Error monad, a State+IO monad,
Line 165: Line 154:
 
"FILE")
 
"FILE")
 
"Input string"
 
"Input string"
  +
  +
, Option "v" ["verbose"]
  +
(NoArg
  +
(\opt -> return opt { optVerbose = True }))
  +
"Enable verbose messages"
   
 
, Option "V" ["version"]
 
, Option "V" ["version"]
Line 176: Line 170:
 
(NoArg
 
(NoArg
 
(\_ -> do
 
(\_ -> do
prg <- getProgName
+
prg <- getProgName
 
hPutStrLn stderr (usageInfo prg options)
 
hPutStrLn stderr (usageInfo prg options)
 
exitWith ExitSuccess))
 
exitWith ExitSuccess))
Line 183: Line 177:
 
</haskell>
 
</haskell>
   
  +
===The <hask>main</hask> function===
Now we combine all this pieces:
 
   
 
<haskell>
 
<haskell>
Line 205: Line 199:
   
 
Voila! As you can see most of work is done in option descriptions.
 
Voila! As you can see most of work is done in option descriptions.
  +
  +
[[Category:Idioms]]
  +
  +
== See also ==
  +
  +
* [[Command line option parsers]]
  +
* [[GetOpt]]

Revision as of 17:37, 23 December 2012

Introduction

I've used Haskell to create various command-line utitities for unix-like systems. In the process I developed a simple yet powerful and flexible technique for processing program options.

Sven Panne's GetOpt module does a fine job at parsing command line options. It has a simple, easy to use interface, handles short and long options, abbreviated options and most needed option argument modes (no / optional / required argument). It can even produce a nice usage info from option descriptions.

(Documentation and example code using this module can be found at http://www.haskell.org/ghc/docs/latest/html/libraries/base/System-Console-GetOpt.html)

However, I checked over half a dozen serious programs written in Haskell and saw that in most of them the code responsible for option handling is quite ugly, repetitious and error prone (of course there were exceptions, see below). Saying 'option handling' I don't mean the work of analyzing the command line, but rather how supplied options influence program's behaviour, how error situations are handled, etc.

(The exception is Wolfgang Thaller's VOP program. Wolfgang used a nice, but a little lengthy technique there. There are probably other ,,exceptional programs out there).

I think there are two major reasons of the current situation. The first reason is that option handling is rarely of primary concern to the programmer. Often there are not that much options to handle in the beginning and it seems that the simplest solution will do. Alas, if the program is evolving and its users ask for new functionality, new options appear, and the initial design starts to be an obstacle.

The other reason is lack of good examples. The example delivered with GetOpt shows how to use the library, but it doesn't show that there are other, better ways to use it. I don't propose to introduce more advanced techniques to this example, because it would make it rather heavy. But it would be nice if there would be a pointer for interested users.

About typical use of GetOpt

In a typical use of GetOpt module (like in the example) there is a definition of a sum datatype with data constructors corresponding to individual command-line options. Options with no arguments are represented by nullary constructors, and options that must or can have arguments - by unary constructors. For example, below is a slightly modified example from GetOpt's documentation:

  data Flag = Verbose               -- this option has no arguments
            | Version               -- no arguments
            | Input (Maybe String)  -- optional argument
            | Output String         -- mandatory argument
            | LibDir String         -- mandatory argument

GetOpt uses 'ArgDescr a' datatype to specify argument modes of options:

  data ArgDescr a = NoArg                   a
                  | ReqArg (String ->       a) String
                  | OptArg (Maybe String -> a) String

Because every constructor of Flag datatype has one of types Flag, (String -> Flag) or (Maybe String -> Flag), they can be given verbatim as first arguments to appropriate constructors of 'ArgDescr Flag' datatype.

Used in this way, getOpt parses a list of strings to a list of Flag values. It also separates options from non option arguments, signals unrecoginized, ambiguous or improperly used options.

But after all, it is not that big step. There is still much processing of this Flag list to do. One has to check if specific options are there in the list, some options have to be combined, etc, etc.

Advertised technique

Imports for the full program

( Because I wanted it to be a fully functional Literate Haskell program, here come the required imports )

module Main (main) where

import System.Console.GetOpt
import System
import Control.Monad
import IO
import List
import Char

Analysis of options typing

Let's look at program options not from the command-line side, but rather from program(mer)'s side. How to encode them? What would be the best way to present them to the programmer. How should they influence program's behaviour?

'Verbose' could be an easily accessible Bool value, False by default. 'Version', if supplied, could just print the program's version and exit. 'Input' could just yield a String, either from stdin or from a specified file. 'Output' could just take a String and do something with it.

We need some kind of a fixed-size polytypic dictionary of option values - a perfect application for Haskell's records.

data Options = Options  { optVerbose    :: Bool
                        , optInput      :: IO String
                        , optOutput     :: String -> IO ()
                        }

Note that I didn't place optVersion in the record. We will handle this option differently.

There is one special combination of options, the start (or default) options:

startOptions :: Options
startOptions = Options  { optVerbose    = False
                        , optInput      = getContents
                        , optOutput     = putStr
                        }

Threading the options through the program

But in the end we would like to get the record reflecting the options given to the program. We do this by threading the Options record through functions processing single options. Each such function can change this record. Why not just put such functions in ArgDescr and OptDescr datatypes? Here we benefit from first-class citizenship of functions.

I won't use a pure function with type (Options -> Options), but rather an effectful function in the IO Monad, because I want to easily perform side effects during option parsing (here only in 'version' and 'help' options, but I could also check validity of input and output files during option processing). You may prefer to use a pure function, a State+Error monad, a State+IO monad, or something different...

options :: [ OptDescr (Options -> IO Options) ]
options =
    [ Option "i" ["input"]
        (ReqArg
            (\arg opt -> return opt { optInput = readFile arg })
            "FILE")
        "Input file"

    , Option "o" ["output"]
        (ReqArg
            (\arg opt -> return opt { optOutput = writeFile arg })
            "FILE")
        "Output file"

    , Option "s" ["string"]
        (ReqArg
            (\arg opt -> return opt { optInput = return arg })
            "FILE")
        "Input string"

    , Option "v" ["verbose"]
        (NoArg
            (\opt -> return opt { optVerbose = True }))
        "Enable verbose messages"

    , Option "V" ["version"]
        (NoArg
            (\_ -> do
                hPutStrLn stderr "Version 0.01"
                exitWith ExitSuccess))
        "Print version"

    , Option "h" ["help"]
        (NoArg
            (\_ -> do
    	        prg <- getProgName
                hPutStrLn stderr (usageInfo prg options)
                exitWith ExitSuccess))
        "Show help"
    ]

The main function

main = do
    args <- getArgs

    -- Parse options, getting a list of option actions
    let (actions, nonOptions, errors) = getOpt RequireOrder options args

    -- Here we thread startOptions through all supplied option actions
    opts <- foldl (>>=) (return startOptions) actions

    let Options { optVerbose = verbose
                , optInput = input
                , optOutput = output   } = opts

    when verbose (hPutStrLn stderr "Hello!")

    input >>= output

Voila! As you can see most of work is done in option descriptions.

See also