restarting the discussion

Henrik Nilsson nilsson@cs.yale.edu
Thu, 08 Feb 2001 20:13:38 -0500


Hi Jan, and Hi HaskellDocers,

Jan Skibinski wrote:
> 
> On Thu, 8 Feb 2001 malcolm-hs@cs.york.ac.uk wrote:
> ....
> > My immediate aim is to have a tool that produces browsable
> > documentation to be read by the user of a module, not necessarily
> > an author or developer (i.e. external docs, not internal docs).
> 
>         A good tool can produce either thing equally well. See below.
> ....
> > But the programmer will write many kinds of comment - not all are
> > intended for users - many describe the implementation, which is of
> > no interest to the mere user.  So we need to distinguish different
> > styles of comments.  This can be done lexically or positionally,
> > and we need to agree a standard.
> 
>         My opinion is that there should be only one style of comment,
>         the good and important comments. They can, of course belong
>         to different entities, such as functions, classes, methods,
>         datatypes, but all of them should be Good. If an entity
>         is implementation specific then do not export it. But if
>         you wish to make some special implementation notes in
>         the exported entity then make them somewhere in the body
>         of your code, not around the entity header, so the comment
>         extractor will never touch it. But if you do write the
>         comment around the header then be prepared to see it
>         in documentation. The pressure is on you to do it right.
>         You should be able to guarantee readability of the resulting
>         interface.

As outlined in my previous mail, I think this might be a little
too limiting, so my current prefernce is for some kind of comment
tagging. But I might be swayed, and I agree that the conventions Jan
outline are pleasantly lightweight.

> > I have another big requirement.  The source code must remain readable
> > as source code.  I absolutely loathe so-called "literate programming"
> > style, because it breaks this rule horribly.  From my point of view,
> > any new documentation standard must be as non-intrusive as possible.
> > This almost immediately rules out XML-style tags I'm afraid.  Other
> > (less heavy) lexical conventions might be OK though, provided there
> > are only a small number of them to be learnt.
> 
>         So you, Henrik and I agree on this point. I have been
>         advocating it quite strongly right here.
>         (BTW, I do write a lot of HTML-based "literate" tutorials but
>         they are not meant as libraries. Libraries should be readable
>         and easy to maintain).
> 

Yes, at least more or less! ;-)

> > I am intrigued by the use of positional cues (e.g. a comment just
> > before or just after a type signature) as a clever way of associating
> > documentation with code, whilst avoiding extra syntax.
> >
> > For those who have not yet looked at Armin's HDoc, can I encourage
> > you
> > to do so, as a concrete example of how some of these ideas have been
> > put into practice.  He develops a "special" kind of comment,
> > introduced
> > by {--- rather than {-, and has some small lexical conventions that
> > help to generate nice hyperlinked HTML.
> >
> > Also, look at Jan Skibinski's tool which captures comments by their
> > position with respect to a signature.  Both are linked from the
> > haskell.org page on libraries and tools.
> 
>         I have been working very hard to provide some working
>         examples of interfaces that have all of the above
>         features outlined by you. I was hoping that they will help
>         with this discussion. However, few cared to check
>         them out, which disappoints me greatly.

Sorry. I just had other things to attend to too.

>         In addition, those who have tried must have misunderstood
>         my links since the have not found any of the samples
>         - which is shown in the logs of my website. So here
>         is the careful explanation of those links, or rather the
>         links to new and improved interface samples.
> 
>         ------------------------------------------------------
>         They are in a temporary location and they may
>         be removed one day. I do not wish them to be indexed
>         from this email message, therefore I provide the information
>         in two stages:
>         1. The directory is
>                 http://www.numeric-quest.com/haskell/
>         This directory is already indexed, which should be.
>         2. Append to the above one of the following filenames:
>                 Extractor.short.html
>                 Extractor.long.html
>                 InterfacePrinter.short.html
>                 InterfacePrinter.long.html
>         That's all.

Thanks, I've now found them.

>         All the above samples are HTML-"preformatted".
> 
>         I have defined three types of interfaces so far:
>         Short - listing exported features only ("external" iface)
>         Long - listing all the features ("internal" iface)
>         Coded - Long, but with source code included (prettyfied
>                 source code)

Jan, one question: what do you do/intend to do with *re*exported
entities?

As explained in my previous mail, I think external documentation
should include everything a module exports, including the related
documentation, regardless of where the various entities were
originally defined.
 
>         The modules are still under development, and they do not do
>         yet everything I want them to do. There are some unfinished
>         portions of the code; for example, the printer does not
>         handle the classes properly yet, there are some formatting
>         issues, export/import should be improved, etc. You should
>         easily notice those shortcomings, because the samples
>         I provide are not edited - this is what is _really_ produced
>         by the tool.
> 
>         The good news is that all of those have been produced
>         without XML or HTML formatting information - straight
>         from the readable source code. But I use two lightweight
>         helpers: single quoted words within comments become
>         italic and the special banner --: separates groups of
>         functions.
> 
>         I will stop explaining right now. You should be able
>         to understand the tool solely from the above interfaces. If
>         not, then it would mean that I have been wrong all along
>         and that it is time for me to shut up.

I have to say that this already looks like a quite useful tool,
and it does show that one can go a long way with very lightweight
conventions. A few points, tough.

I'd again like to take the opportunity to push for the idea of also
defining and intermediate format. The documentation samples Jan
have presented are all very good examples of the kind of documentation
renderings one might want. But on the other hand, I don't think
they cover the entire spectrum of reasonable presentations.
(I believe I have some support from at least Jan on this one.)

This brings me my main concern: is the *style* of conventions
Jan proposes (not necessarily exactly those conventions he currently
use) flexible enough to:

   1. support other reasonable renderings,
   2. avoid creating a feeling among users that the conventions are
      "dictatorial" ;-)

Regarding 2, I'm not entirely serious: by necessity, the conventions
must be fairly strict. But after browsing through the Eiffel
documentation Jan sent a pointer too, I personally felt that the
required conventions were a bit too much. Now, I certainly do not
wish to start a discussion about the Eiffel conventions. I'm just
mildly concerned that too strict requirements might get in the way
of getting wide acceptance for a standard. But maybe this is a
non-issue.

Regarding 1, below are a few examples of things that I think are
reasonable.

Please keep in mind that I too favour a fairly lightweight
convention, but I am willing to allow a little extra noise for the
sake of flexibility. To be concrete, I like the convention 'xxx'
to mark a piece of code (such as a variable name) in running text
which Jan proposes, I think one or two more such conventions
would be useful (e.g. _xxx_ for emphasis). On the other hand,
I don't necessarily see anything wrong with HDoc/JavaDoc-style tags
for marking large things: the extra piece of information provided
could be quite valuable.

-----------------------------------------------------------------------
Example 1.

I sometimes write functions which take tuples among their arguments
and/or return tuples, and I sometimes find it useful to document
each field separately.

Returning a tuple immediately implies that it might be difficult
to start each documentation block with a sentence that describes
what the function returns (which seems to be your convention)?

Here's an example of the convention I've used for such cases up
until now (from the module LambdaLift in my compiler):

-- Lambda lifting.
-- m .......... The name of the module.
--              locally bound functions to a triple of the expression
--              to be substituted for the function names, a list of the
--              top-level identifiers in the expression (just one), and
a
--              list of the free variables in the expression in
question.
-- lvdos ...... List of new (i.e. lifted) top-level definitions with
--              associated occurrences of relevant top-level identifiers
--              from previous expressions.
-- e .......... The expression on which to perform lambda lifting.
--
-- Returns a five tuple:
-- #1 ......... List of new top-level definitions.
-- #2 ......... List of top-level identifiers which occurs in the
--              residual expression and in lrvs or which has been
--              introduced.
-- #3 ......... List of the free variables (not syntactically bound to a
--              lambda abstraction) in the expression prior to inner
--              lifting. 
-- #4 ......... List of the free functions (local variables bound to
--              lambda abstractions) in the expression prior to inner
--              lifting. 
-- #5 ......... Residual expression.

ll :: PackedString -> [Id] -> Env Id -> Env (Exp a, [Id], [Id])
      -> [(ValDef a, [Id])] -> Exp a 
      -> NS ([(ValDef a, [Id])], [Id], [Id], [Id], Exp a)

I guess I could be accused of writing too complicated functions
(in particular if one is to believe what the Eiffel book said :-),
but that's beside the point. This is real code, and I think it is
reasonable to require that a documentation format should be flexible
enough to document it and render the documentation in some nice way.

Here's another example, showing a tuple argument:

-- Pattern matching compiler.
-- p .......... Compilation parameters.
-- us ......... Variable vector ("SFIR" variables).
-- qs ......... Pattern/RHS matrix. Four tuple:
--              #1 .... Clause number.
--              #2 .... Substitution for RHS of this clause.
--                      Extended as pattern matching compilation
--                      proceeds.
--              #3 .... Pattern vector for this clause.
--              #4 .... RHS of clause. Three tuple. A boolean indicates
--                      whether the RHS can fail.
-- de ......... Default expression. Result in case of pattern matching
--              failure.
--
-- Returns a 3-tuple:
-- #1 ......... A list of substitutions, one for each RHS.
--              Note: This is FIR to SFIR substitutions, so this implies
--              a circular definition.
-- #2 ......... A fail flag that indicates whether the pattern matching
--              may fail so that the value of the default expression is
--              used.
-- #3 ......... The resulting expression.
--
-- Note: Newtypes are handled as any other algebraic type. This results
-- in case constructs of the form (case <exp1> of $id <v> -> <exp2>)
-- (no default clause). After type checking, these can be replaced by
-- (let <v> = <exp1> in <exp2>). 

------------------------------------------------------------------------
Example 2:

The lambda lifter in the previous example illustrates another problem:
how to deal with moands? NS happens to be a monad, so it is not
really correct to claim that the function "returns a five tuple".

In Armin's HDoc, there is a separte tag for marking monadic
return values. (But I think they still renders as "Returns ..."?)

In your case you seem to adopt the convention that functions
with return type (IO a) should be described in imperative terms.
This makes sense for IO and many other monads, but not necessarily for
all monads.

Do we need more structure here, or are monads just such a general
concept that there is not much one can do about it?

------------------------------------------------------------------------
Example 3:

This is really a variation on 2. Consider combinator libraries.
Sometimes it makes sense to adopt special commenting conventions.
Again, the Fudget GUI combinator library can serve as a useful example.

The central abstraction in the Fudget GUI is the Fudget, which can
be understood as a component with one user-visible input and one
user-visible output. Such a component may or may not have a graphical
rendering in the form of some GUI widget. Each fudget also has an
invisible, low-level input/output pair which connects it to the outside
world. An example could be a text input field. On the input it
would accept strings which are shown as default values to the user.
On the output, a user-entered string would appear as soon as the
user presses return.

The documentation for a function like

   foo :: a -> b -> F c d

where (F c d) is a fudget with input of type d(!) and output of type c,
would be something along the following lines:

   DESCRIPTION
       A foo fudget is a button that ...
       INPUT: True or False to switch the button on or off under
              program control.
       OUTPUT: True when the button is on, False when the button is off.
   ARGUMENTS:
       xxx :: a, ...
       yyy :: b, ...

The point here is that describing foo as a function returning a Fudget
isn't very helpful. Instead special conventions approporiate for
the domainin question were adopted.

I'm currently involved in the development of a framework for
domain-specific languages called FRP (Functional Reactive Programming).
One of its incarnations is as a combinator library. A central
abstraction is (Behavior a b) which represents a "transformer" of
signals of type a to signals of type b. I.e. there are some similarities
to a fudget (F b a). Again, when we document a behaviour, it would be
nice tobe able to talk about its inputs and outputs (which furthermore
often happen to carry elements of tuple types).

So, can we find conventions which are flexible enough to support
documentation of this type of code?

BTW, the Fudget library manual is available on line. It might be worth
checking it out. For example:

http://www.cs.chalmers.se/Cs/Research/Functional/Fudgets/Manual/current/small.html

A concrete example illustrating the points above is:

http://www.cs.chalmers.se/Cs/Research/Functional/Fudgets/Manual/current/toggleButtonF.html

-----------------------------------------------------------------------
Example 4

Well, not an example actually, more of a "laundry list". I touched
on most of thses in my previous mail:

* A possibility to include pictures does not seem entirely unreasonable.
* A convenient way of including a piece of code, e.g. a useage example.
  <code> </code> tags?
* Explicit cross references Cf. the man-page "see also" style. Quite
  useful to be able to refer the reader to closely related functions
  in a large library, for instance.
* Hints for generating indices at various level of detail
  (beginner's index, programmer's index, ...)

-----------------------------------------------------------------------

Maybe there are other Haskell-isms we need to take into account.
Compared to most imperative langauges, langauges like Haskell
are extremely flexible, and pepole tend to use this flexibility in all
kinds of innovative ways. Thus, documentation conventions that work
very for an imperative language, might be too rigid for our porposes.
I think Armin already has done a quite good job of identifying
some of the issues in his HDoc tool through the selection of markup
tags.

Best regards,

/Henrik

-- 
Henrik Nilsson
Yale University
Department of Computer Science
nilsson@cs.yale.edu