[Haskell-cafe] Re: What do _you_ want to see in FGL?

Wed May 12 06:08:57 EDT 2010

Heinrich Apfelmus <apfelmus at quantentunnel.de> writes:

> Ivan Lazar Miljenovic wrote:
>> As I said, we're considering using an Associated Type to let users
>> choose what type they want to use (probably with a default Map instance
>> for this).  However, we'd recommend/push the Int-based one.
>
> For my  make  example, I prefer a plain parameter because it would be
> too verbose to define a new class that has  FilePaths  as node types.
> I'd use the  Int  instead, but this defeats the point: why offer
> flexible node types when it's too much of a burden to use them?

The point is so that you can define new graph types with that node type.

>> An explicit type parameter for the vertex type is not appropriate for
>> this reason: you don't want to change it.
>
> It's true that I don't want to change the node type, but I want to curry
> it. If I don't feel like writing out the node type every time, I can use
> a type synonym:
>
>    type MyGraph a b = Graph FilePath a b
>
>
> Graphs with different node types don't behave differently; graphs are
> parametric with respect to the node type, just like lists don't behave
> differently on different element types.

There will be a Map-based graph available that will have the node type
parameter, but the variant that's currently called PatriciaTree will
most likely be the preferred default one (as it will have better
performance due to its use of IntMap).

We can't require the class to have the vertex type as a type parameter
for when we want a graph (such as PatriciaTree) _with_ a fixed vertex
type.

>>> Also, hard-coding  Node = Int  feels like the wrong kind of flexibility:
>>> the user is able to do anything with it, just in case the library forgot
>>> some important functionality. Which is exactly what happened in my case
>>> when I used  Map.findIndex . I prefer the library to get it right.
>> 
>> What do you mean by "the library forgot some important functionality"?
>
> Ah, I'm comparing  Node = Int  to a  Node  type that is entirely
> abstract. After all, conceptually,  Node  is not an integer, it's a
> unique identifier.
>
> If  Node  is abstract, then you will probably miss things like the [1..]
>  that you mentioned.
>
> Nonetheless, I would like to see  Node  to become abstract. This means
> that the graph library should include a library that deals with unique
> identifiers  Node . The [1..] pattern would correspond to a function
>
>     freshNodes :: () -> [Node]

Well, I don't use [1..] to get new nodes, I use them to map over nodes
(depending on how I construct the graph) or to create a new graph
completely from scratch.

Also, that type signature doesn't make sense; something like "freshNodes
:: (Graph g) => g -> [Vertex g]" might, but the problem with a generic
node type is that its not really possible to do such a thing in general;
AFAICT the full type signature will need to be:

    freshNodes :: ( Graph g, Enum (Vertex g), Bounded (Vertext g)
                  , Ord (Vertex g)) => g -> [Vertex g]

where the Bounded is needed for empty graphs (i.e. use minBound); if the
graph isn't empty, then take the maximum (hence Ord).  This of course
will be tricky to implement for the general case (if we have vertices
[1,2,4], should it return 3?  What happens about overflows?).

If we define such a function, it will most likely not be part of the
class definition since we wouldn't want to put these restrictions on the
vertex type (since I might want Integer as my vertex type, which isn't
an instance of Bounded; maybe having an unsafe variant of this that
assumes the graph is non-empty makes sense).

>> Really, performance aside, this is my biggest possible problem with
>> generic label types is that it may make it harder to define various
>> algorithms on graphs because you can no longer guarantee what you can do
>> with the vertex types; as such people may resort to requring the vertex
>> type to be Int or something to use a specific algorithm.
>
> Ah, you mean algorithms that create new nodes on the fly? I don't think
> that  Node = Int  works for them either, because the user might have his
> own ideas about which  Ints  can appear as graph vertexes. For instance,
> he might only use even numbers to denote vertexes and will be surprised
> by a library algorithm that suddenly creates odd vertexes. In short,
> Node  needs to be abstract for that.

Actually, I've looked through my code and it appears that (apart from
verboseness), there won't be too much of a problem with removing the
assumption of vertex type == Int.

However, I can't see any reason why someone would only want to use even
Int values.  As I think I've said before (I've been making these
arguments in various threads and discussions, so I'm not sure if I've
said it here): the vertex type is just an _index_ to ensure consistency,
etc; it is _not_ IMHO meant to represent the actual data: that's what
the labels are for.

> Other than that, I don't see much of a difference between custom vertex
> types and  Int  . Internally, you can always use  Int  to reference
> nodes and keep the association between the custom vertex type and  Int
> in a separate map, like this
>
>    data Graph node a b =
>        Graph { internal :: Gr a b , nodes :: Map node a }

Custom vertex types will work; it's just that using Int will probably
prove to be in general more efficient and easier to use.  I haven't said
we'll disallow custom vertex types, but I don't plan on going on the
extreme of having optional labels, or of making the vertex type a type
parameter for all graphs (since as I've said, you don't/can't always
want/assume that).

-- 
Ivan Lazar Miljenovic
Ivan.Miljenovic at gmail.com
IvanMiljenovic.wordpress.com