[Haskell-cafe] Re: Allowing hyphens in identifiers

Sun Dec 13 23:11:59 EST 2009

On Monday, 14 December 2009, 01:44:16, Richard O'Keefe wrote:
> On Dec 13, 2009, at 3:44 AM, Daniel Fischer wrote:
> > On Friday, 11 December 2009, 01:20:55, Richard O'Keefe wrote:
> >> On Dec 11, 2009, at 3:00 AM, Daniel Fischer wrote:
> >>> On Wednesday, 09 December 2009, 23:54:22, Richard O'Keefe wrote:
> >>>> Given the amazinglyUglyAndUnreadably baStudlyCaps namingStyle that
> >>>> went into Haskell forNoApparentReasonThatIHaveEverHeardOf,
> >>>
> >>> mb_t's_bcs the ndrscr_stl is considered far uglier and less readable
> >>> by others
> >>
> >> Come ON.  Argue honestly!
> >
> > Thanks, but I have to return that compliment.
>
> Not a bit of it.   "mb_t's_bcs the ndrscr_stl" was NOT playing
> fair and you knew it.

1. I wasn't playing in the under_score vs. camelCase game, just proposing a possible 
reason why the camelCase may have been chosen for Haskell's standard libraries. For that, 
I deliberately used an unrealistically exaggerated example of unreadable underscores. Even 
more unrealistic than your camelCase parody. The very next thing, I said that 
full_word_underscores is *not* unreadable, making clear (apparently not to you, sorry 
about that) that the example is not to be taken at face value.
2. Below, you say that it was an honest misunderstanding, so it's okay then. I hadn't 
expected that to be the case; I had suspected intentional misinterpretation.

>
> > I claim that the part not using insanely abbreviated words
> > (isConsidered, farUglier, byOthers) *is* readable.
> > Also that moderately abbreviated words are readable in camelCase as
> > well as under_score.
>
> I guess we must mean different things by "readable" then.

By readable, I mean (broadly) "easy to read".
Of course, some find is_considered far_uglier by_others easier to read, others harder, and 
yet others find both equally easy to read.
I have no difficulty believing that some find either of the styles hard to read, but I 
don't expect experienced programmers to be in that group.

>
> I've approached my head of department about running an experiment.
>
> >>> (granted,
> >>> underscore-style with nonabbreviated words is not unreadable, but
> >>> still extremely ugly)?
> >>
> >> Who "grants" that underscore separation with fully written words is
> >> "still extremely ugly"?  Not me!
> >
> > Is that remark really unclear,
>
> Yes, it is really unclear.  It read as "x (granted that y)".

Aha. English more fragile than thought. Sorry.

> For clarity, it should have been "(still extremely ugly, granted that
> underscore-style with nonabbreviated words is not unreadable)".

No, that is not what I meant.

[Granted,] Underscore-style with nonabbreviated words is not unreadable.
Even with written-out words, underscore-style is still extremely[1] ugly.

[1] This, however, is an exaggeration as far as I'm concerned. I find the underscore-style 
ugly, but not extremely so. It was your "amazinglyUgly" which caused it to appear in that 
sentence.

>
> > Nobody *grants* anything is ugly or not, that is an aesthetic
> > judgment, as such entirely a matter of personal preference - you can
> > agree with it or not.
>
> Where is it written that aesthetic judgements are _entirely_ a
> matter of personal preference?

I think you could find that written in many texts on aesthetic relativism. Doesn't matter, 
though.
Of course, one's personal preferences aren't developed in a vacuum, they are strongly 
influenced by genes and society. So a lot of preferences are shared within a culture and 
even across cultures and aesthetic judgments aren't entirely *individual*.
Nevertheless, I'm convinced that there is no Platonic idea Beauty (or Ugliness) and that 
if A says that van Gogh's Sunflowers is a beautiful picture while B says it's ugly, it's 
not the case that one is objectively right, the other wrong.
Both are judgments based on their respective preferences and nothing else (unless: lying, 
acting, succumbing to social pressure, other reasons to not express one's actual opinion).

>
> As a student I was in the Archaeological society.
> One of the things I learned was this:
>   - many ancient cultures would ritually "kill" grave goods
>   - some grave goods would be smashed up, others would just
>     have a chip out of them
>   - as a rule, the ones that were least damaged were the ones
>     the archaeologists considered to be the most beautiful (after
>     repair).
> If an aesthetic sense about pots can be shared by modern archaeologists
> and people living five thousand years ago, it's hardly "entirely
> personal".

Depends on what you mean by "entirely personal". All I'm saying is that one can find a 
thing beautiful which another finds ugly.

>
> What you may not appreciate is that I have been a Smalltalk programmer
> for a long time.  I've read and written a lot of code like
>
>      printSolutions
>        (self sendSolutionsUsing: Empty and: Empty to: [:p :n |
>          Transcript nextPutAll:   'p = '; print: p;
>                     nextPutAll: ', n = '; print: n; nextPut: $.; cr.
>          true "stop after reporting first solution"]
>        ) ifFalse: [
>          Transcript nextPutAll: 'No solutions.'; cr].
>
> If it were just a matter of experience, then this experience should
> surely have taught me to love baStudlyCaps.

No. It should have taught you to *read* camelCase - unless your aversion is so strong that 
you actively refuse to learn to read it.

>
> >> I have not been able to discover an experimental study of word
> >> separation style effect on readability in programming.  I've been
> >> wondering about running a small one myself, next year.  But there
> >> is enough experimentally determined about reading in general to
> >> be certain that visible gaps between words materially improves
> >> readability, and internal capital letters harm it. Now that
> >> applies to ordinary text, but until there's evidence to show that
> >> it doesn't apply to program sources, it's a reasonable working
> >> assumption that it does.
> >
> > I doubt that.
>
> Until the evidence is in, what other reasonable working
> assumption is there?
>
None.

> > Sourcecode is so different from ordinary text (a line of sourcecode
> > rarely contains more than four or five words),
>
> OK, let's try it.  Shakespeare sonnet number 1,
> first four lines, but split into shorter chunks
>
> With spaces:
>
>  From fairest creatures
>    we desire increase,
> That thereby
>    beauty's rose might never die,
> But as the riper should
>    by time decease,
> His tender heir
>    might bear his memory:
>
> With underscores:
>
>  From fairest_creatures
>    we desire_increase,
> That thereby
>    beauty's_rose might_never_die,
> But as the_riper should
>    by_time decease,
> His tender_heir
>    might_bear his_memory:
>
> With baStudlyCaps:
>
>  From fairestCreatures
>    we desireIncrease,
> That thereby
>    beauty'sRose mightNeverDie,
> But as theRiper should
>    byTime decease,
> His tenderHeir
>    mightBear hisMemory:
>
> baStudlyCaps doesn't read any better with short lines.

I have no trouble reading either version. And that is even though this is not what 
camelCase is intended for (as far as I know, its purpose is to mark word boundaries within 
*one token* [identifier]).

>
> > that I'd be very wary to transfer the findings for
> > one to the other.
>
> It's not uncommon for style guides to explicitly
> recommend that program code should be spaced like text.
> Apple notoriously violate this in Objective C.
> Where Smalltalk would have
>
>     this sendSolutionsUsing: Empty and: Empty to: that
>
> Objective C practice is to write
>
>    [this sendSolutionsUsing:Empty and:Empty to:that]
>

So? Whitespace helps tokenising and thus increases readability (for me, at least).
What's the relation to the question whether camel case and underscore are readable or not?

> > If somebody claimed that of
> >
> > x <- take_while some_condition some_list
> >
> > and
> >
> > x <- takeWhile someCondition someList
> >
> > either was objectively more readable than the other, I wouldn't
> > believe it without lots and lots of hard evidence.
>
> "Persuade a man against his will, he's of the same opinion still."
> How _much_ evidence?

Replicated studies with enough participants from enough different environments/cultures 
showing that more than 99% of the participants find it clearly more readable.
That's due to the *objectively*; for such a strong claim, you need unusually strong 
evidence.
If you claim that a [large] majority find one more readable, it's far easier.

And if it turns out that a large majority finds underscores more readable, that would be a 
sufficient reason to use that [for new languages - mixing camel case and underscores in 
one language I believe to be as bad as usngabbrvtdwrdswocapsorundrscores].

I take the widespread presence of both as an indication that the majority isn't very 
large, so you'd have a little work to do to convince me.

> That's an artificial example.  Haskell code doesn't tend to look
> like that.  It's much more likely to be a choice between
> 	x' <- takeWhile p x
> and	x' <- take_while p x

And in such code *I* see _no_ difference in readability. Both read flawlessly, without any 
hiccough.
It's when more than two several-word identifiers are juxtaposed that readability degrades 
(for both). You should've tried enumFromThenToWithCuteExtraForKicks - such identifiers I 
find actually harder to read in camel case than with underscores. But using such 
identifiers is evil and wrong anyway.

>
> I'll repeat the run length figures I previously posted today,
> but this time as percentages
>
> 	len	pct
> 	 1	67%
> 	 2	22%
> 	 3	 7%
> 	 4	 3%
> 	 5	 1%
>
> 	>5	<1%
> >
> > for me the underscores push the
> > space-separated parts together, so that I have the tendency to
> > bracket it
> >
> > x <- take (while some) (condition some) list
>
> Such lines are rare.
> In my sample (very little of which was my own),
> under 14% of (lines having at least one identifier)
> had two or more internal capitals.
>
> Let's look at the number of internal capitals per run:
>
> 	caps   runs   pct
> 	   0  99324   79%
> 	   1  20761   16%
> 	   2   4100    3%
> 	   3    791   (3..10 all have under 1% each;
> 	   4    221    the sum is 2%)
> 	   5     68
> 	   6     11
> 	   7     14
> 	   8      4
> 	   9      0
> 	  10      1
>
> takeWhile someCondition someList has three internal
> capitals, so it's one of just the 5% of lines with three or more.
> Let's try a more specific question:
> what proportions of runs have the pattern
> 	aB cD eF?
> 44 out of 125295, or 0.00035.
> You're asking me to sacrifice readability everywhere else
> for the sake of one line in every 2850?  (Not that I do
> find that line more readable in baStudlyCaps.)
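
For readers who want to reproduce counts of this kind on their own sources, here is a 
minimal sketch. It is not the program used for the figures quoted above; what exactly a 
"run" is has not been spelled out in this message, so the sketch simply counts internal 
capitals per line, which is an assumption. The names identifiers, internalCapitals and 
histogram are made up for illustration, and the tokenising is crude (it does not skip 
string literals or comments).

```haskell
import Data.Char (isAlpha, isAlphaNum, isLower, isUpper)
import Data.List (group, sort)

-- Split a line into identifier-like tokens: a letter or underscore
-- followed by letters, digits, underscores or primes.
-- (Assumption: string literals and comments are not treated specially.)
identifiers :: String -> [String]
identifiers s =
  case dropWhile (not . isStart) s of
    []   -> []
    rest -> let (tok, rest') = span isBody rest
            in tok : identifiers rest'
  where
    isStart c = isAlpha c || c == '_'
    isBody c  = isAlphaNum c || c == '_' || c == '\''

-- Count capitals that directly follow a lower-case letter,
-- such as the W in takeWhile.
internalCapitals :: String -> Int
internalCapitals ident =
  length [ () | (a, b) <- zip ident (drop 1 ident), isLower a, isUpper b ]

-- Histogram as (internal capitals on a line, number of such lines).
-- Lines without any identifier are skipped.
histogram :: [String] -> [(Int, Int)]
histogram ls = [ (n, length g) | g@(n : _) <- group (sort counts) ]
  where
    counts = [ sum (map internalCapitals ids)
             | l <- ls
             , let ids = identifiers l
             , not (null ids)
             ]
```

In GHCi, something like histogram . lines <$> readFile "Foo.hs" (with a file name of your 
choice) gives the per-line distribution for one module.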

Not at all. What gave you that idea?

Since camel case is the style used in the overwhelming majority of Haskell code and I 
consider mixing the two styles really bad, I ask you to use that in your public code, even 
if grudgingly.

You prefer to read and write code in underscore style. Others prefer camel case.
Without an easy way to convert, at least one group won't be happy.

You have already posted a preprocessor to convert between the two styles, I think. That's 
good.
If I can help improve it and make it more usable, I'd be happy to (there are a couple 
of points where the transformation is not trivial: {-# OPTIONS_GHC #-}, foreign import).
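
To make the easy part of the transformation concrete, here is a rough sketch of the 
renaming of a single identifier in both directions. It is not the preprocessor mentioned 
above, and the function names camelToUnderscore and underscoreToCamel are invented for 
this illustration. It assumes it is handed one identifier at a time; finding the 
identifiers in a source file, and leaving pragmas and the strings in foreign imports 
alone, is exactly the non-trivial part and is not attempted here.

```haskell
import Data.Char (isAlphaNum, isDigit, isLower, isUpper, toLower, toUpper)

-- camelCase -> under_score: insert an underscore before each capital
-- that follows a lower-case letter or digit, and lower-case that capital.
camelToUnderscore :: String -> String
camelToUnderscore (a : b : rest)
  | (isLower a || isDigit a) && isUpper b =
      a : '_' : camelToUnderscore (toLower b : rest)
camelToUnderscore (a : rest) = a : camelToUnderscore rest
camelToUnderscore []         = []

-- under_score -> camelCase: drop an underscore that sits between an
-- alphanumeric character and a lower-case letter, and capitalise that letter.
-- Leading underscores and all-caps names such as OPTIONS_GHC are left alone.
underscoreToCamel :: String -> String
underscoreToCamel (a : '_' : b : rest)
  | isAlphaNum a && isLower b = a : underscoreToCamel (toUpper b : rest)
underscoreToCamel (a : rest) = a : underscoreToCamel rest
underscoreToCamel []         = []
```

With these definitions, camelToUnderscore "takeWhile" gives "take_while" and 
underscoreToCamel "take_while" gives "takeWhile". The two are not inverses in general: a 
name like IOError, or one that deliberately mixes the styles, does not survive a round 
trip, so a real tool would need a table of exceptions.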