[Haskell-cafe] Re: (flawed?) benchmark : sort

Adrian Hey ahey at iee.org
Wed Mar 12 15:28:53 EDT 2008


Jules Bean wrote:
> Adrian Hey wrote:
>> This might be a reasonable thing to say about *sortBy*, but not sort
>> as the ordering of equal elements should not be observable (for any
>> correct instance of Ord). It should be impossible to implement a
>> function which can discriminate between [a,a],[a,b],[b,a],[b,b] if
>> compare a b = EQ.
> 
> The fact that you can't implement a function to differentiation does not 
> meant the difference is not important.
> 
> "[b,a]" might cause 500G of file IO which "[a,b]" will not cause.

I can't imagine why, unless this is some weird side effect of lazy IO
(which I thought was generally agreed to be a "bad thing").

What if it's the "[a,b]" ordering that causes this but the "[b,a]"
ordering that doesn't? The choice is arbitrary (for sort), but neither
is obviously correct.

> This is not observable within haskell, but is very observable to the user.

Yes, there are plenty of things like space and time behaviour that are
not "observable" in the semantic sense, but have real obvervable
consequenses in the practical sense of real executables running on real
machines. But constraints like this and Data.Set/Map "left biasing"
often mean that implementations have to be made unnecessarily time and
space *inefficient* for no good semantic reason.

> Stability is a nice property. I don't understand why you are arguing 
> against this so aggressiviely.

I'm not arguing against it for sortBy. I wouldn't even object to the
existance of an overloaded..
  stableSort = sortBy compare
by definition.

I am arguing against it for the default sort that is applied to all
types because for many types there will be more efficient alternatives
which are perfectly correct in the semantic sense, but don't respect
the (semantically meaningless IMO for Ord instances) stability property.
Of course the proper place for this hypothetical sort (and several
other variations) is probably as an Ord class method, not a single
overloaded function in Data.List.

I would also regard any use of stableSort (in preference to the
hypothetical "unstable" overloaded sort) with about the same degree of
suspicion as any use of unsafePerformIO.

Regards
--
Adrian Hey



More information about the Haskell-Cafe mailing list