Hi Conal,<div><br></div><div>I&#39;m aware of one case that violates semantic referential transparency, but it&#39;s a bug, which pretty much proves your point as I understand it.</div><div><br></div><div>John<br><br><div class="gmail_quote">

On Tue, Aug 24, 2010 at 2:01 PM, Conal Elliott <span dir="ltr">&lt;<a href="mailto:conal@conal.net">conal@conal.net</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">

Hi John,<br><br>Please note that I&#39;m suggesting eliminating chunks from the semantics only -- not from the implementation.<br><br>For precise &amp; simple chunk-less semantics, it&#39;s only important that the iteratees map equivalent input streams to equivalent output streams, where &quot;equivalent&quot; means equal after concatenating all of the chunks.  In other words, the chunk lists denote their concatenations, so semantically equal inputs must lead to semantically equal outputs.  (Assuming I understand the intention of chunking as being an implementation issue only, i.e., present only for optimization.)  We could call this property &quot;semantic referential transparency&quot;.  IIUC, &#39;drop&#39; is semantically RT, since it&#39;s *specified* in terms of elements (and only *implemented* in terms of chunks).<br>


<br>Do you know of any iteratees in use that map (semantically) equal inputs to (semantically) unequal outputs, i.e., that violate semantic RT as I&#39;ve defined it?  In the current APIs, one can easily define such iteratees, but I&#39;m hoping that the programming interfaces can be repaired to eliminate that problem (the &quot;abstraction leaks&quot; I&#39;ve been mentioning).<br>
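<br>A minimal sketch of how this property could be checked on small inputs, assuming an iteratee is modeled simply by its run function.  The names Iter, chunkings, semanticallyRT, dropI and chunkCountI are hypothetical and not taken from any iteratee library:<br>
<pre>
```haskell
-- Hypothetical model: an iteratee is identified with its run function.
type Iter a = [String] -> (a, [String])

-- Every way of splitting a string into non-empty chunks.
chunkings :: String -> [[String]]
chunkings [] = [[]]
chunkings s  = concatMap split [1 .. length s]
  where split n = let (p, q) = splitAt n s in map (p :) (chunkings q)

-- An iteratee is semantically RT on s if every chunking of s yields the
-- same result and the same concatenated leftover.
semanticallyRT :: Eq a => Iter a -> String -> Bool
semanticallyRT it s =
  case map observe (chunkings s) of
    []       -> True
    (o : os) -> all (== o) os
  where observe cs = let (a, rest) = it cs in (a, concat rest)

-- Specified per element, implemented per chunk.
dropI :: Int -> Iter ()
dropI n cs = ((), go n cs)
  where
    go _ [] = []
    go k (c : rest)
      | k >= length c = go (k - length c) rest
      | otherwise     = drop k c : rest

-- Leaky: the result depends on how the input happens to be chunked.
chunkCountI :: Iter Int
chunkCountI cs = (length cs, [])
```
</pre>
Under this model, semanticallyRT (dropI 2) &quot;abcd&quot; holds, while semanticallyRT chunkCountI &quot;abcd&quot; does not.<br>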

<font color="#888888">


<br>   - Conal</font><div><div></div><div class="h5"><br><br><div class="gmail_quote">On Tue, Aug 24, 2010 at 9:32 PM, John Lato <span dir="ltr">&lt;<a href="mailto:jwlato@gmail.com" target="_blank">jwlato@gmail.com</a>&gt;</span> wrote:<br>

<blockquote class="gmail_quote" style="margin:0pt 0pt 0pt 0.8ex;border-left:1px solid rgb(204, 204, 204);padding-left:1ex">


<div class="gmail_quote"><div>I think the big problem with chunking is that many useful iteratees need to be able to inspect the length of the chunk.  The most obvious is &quot;drop&quot;, but there are many others.  Or, if they don&#39;t inspect the length, they need a new function on the stream, &quot;dropReport :: Int -&gt; s -&gt; (s, Int)&quot;, which reports how much was dropped.  Either way, chunking adds a good deal of implementation burden.</div>
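<div>A rough sketch of the dropReport idea, specialised to lists for illustration.  The list instance and the dropChunks helper are assumptions made for this example, not part of any existing stream class:</div>
<pre>
```haskell
-- Hypothetical list instance of the dropReport signature above:
-- returns the remaining input and the number of elements actually dropped.
dropReport :: Int -> [a] -> ([a], Int)
dropReport n xs = (rest, length dropped)
  where (dropped, rest) = splitAt n xs

-- Drop n elements from a chunked stream using only dropReport,
-- never asking a chunk for its length directly.
dropChunks :: Int -> [[a]] -> [[a]]
dropChunks _ [] = []
dropChunks n (c : cs)
  | n > 0 =
      let (rest, k) = dropReport n c
      in if null rest
           then dropChunks (n - k) cs  -- chunk used up, keep dropping
           else rest : cs              -- finished inside this chunk
  | otherwise = c : cs
```
</pre>
<div>The bookkeeping in dropChunks is the kind of implementation burden meant here: a single-element vocabulary would not need it.</div>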


<div><br></div><div>I suspect that the proper vocabulary for iteratees wouldn&#39;t include chunks at all, only single elements.  This discussion has prompted me to consider the implications of such an implementation, as it would be much simpler.  I have one idea that I think will at least maintain performance for many operations, although there will be performance hits too.  If the drawbacks are in areas that aren&#39;t particularly useful, though, it may be acceptable.</div>


<div><br></div><div>John</div><div> </div><blockquote class="gmail_quote" style="margin:0pt 0pt 0pt 0.8ex;border-left:1px solid rgb(204, 204, 204);padding-left:1ex">From: Conal Elliott &lt;<a href="mailto:conal@conal.net" target="_blank">conal@conal.net</a>&gt;<div>


<div></div><div><br>


<br>

Here&#39;s a way I&#39;ve been tinkering with to think about iteratees clearly.<br>

<br>

For simplicity, I&#39;ll stick with pure, error-free iteratees for now, and take<br>

chunks to be strings.  Define a function that runs the iteratee:<br>

<br>

&gt; runIter :: Iteratee a -&gt; [String] -&gt; (a, [String])<br>

<br>

Note that chunking is explicit here.<br>

<br>

Next, a relation that an iteratee implements a given specification, defined<br>

by a state transformer:<br>

<br>

&gt; sat :: Iteratee a -&gt; State String a -&gt; Bool<br>

<br>

Define sat in terms of concatenating chunks:<br>

<br>

&gt; sat it st =<br>

&gt;   second concat . runIter it == runState st . concat<br>

<br>

where the RHS equality is between functions (pointwise/extensionally), and<br>

runState uses the representation of State directly<br>

<br>

&gt; runState :: State s a -&gt; s -&gt; (a,s)<br>

<br>

(I think this sat definition is what Conrad was alluding to.)<br>
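<br>
Function equality cannot be decided in general, but the relation can be tested on a finite list of chunked inputs.  Below is a sketch under some assumptions: Iter stands in for runIter it, Spec stands in for runState st, the specification is composed with plain concat (which is what its String argument requires), and satOn, headI, headSpec and leakyI are hypothetical names:<br>
<pre>
```haskell
import Control.Arrow (second)

type Iter a = [String] -> (a, [String])  -- stands in for runIter it
type Spec a = String -> (a, String)      -- stands in for runState st

-- Compare both sides of the sat equation pointwise on the given inputs.
satOn :: Eq a => [[String]] -> Iter a -> Spec a -> Bool
satOn inputs it st = all agrees inputs
  where agrees cs = second concat (it cs) == st (concat cs)

-- Specification: take the first element of the stream, if any.
headSpec :: Spec (Maybe Char)
headSpec []       = (Nothing, [])
headSpec (c : cs) = (Just c, cs)

-- Well-behaved implementation: skips empty chunks.
headI :: Iter (Maybe Char)
headI []               = (Nothing, [])
headI ([] : rest)      = headI rest
headI ((c : cs) : rest) = (Just c, cs : rest)

-- Leaky implementation: an empty first chunk changes the result.
leakyI :: Iter (Maybe Char)
leakyI []               = (Nothing, [])
leakyI ([] : rest)      = (Nothing, rest)
leakyI ((c : cs) : rest) = (Just c, cs : rest)

samples :: [[String]]
samples = [[], [""], ["ab", "c"], ["", "abc"], ["a", "", "bc"]]
```
</pre>
With these definitions, satOn samples headI headSpec holds, and satOn samples leakyI headSpec fails on the input that starts with an empty chunk.<br>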

<br>

Now use sat to specify and verify operations on iteratees and to<br>

*synthesize* those operations from their specifications.  Some iteratees<br>

might not satisfy *any* (State-based) specification.  For instance, an<br>

iteratee could look at the lengths or number of its chunks and produce<br>

results accordingly.  I think of such iteratees as abstraction leaks.  Can<br>

the iteratee vocabulary be honed to make only well-behaved (specifiable)<br>

iteratees possible to express?  If so, can we preserve performance benefits?<br>

<br>

If indeed the abstraction leaks can be fixed, I expect there will be a<br>

simpler &amp; more conventional semantics than sat above.<br>

<br>

  - Conal<br>

<br>

<br>

On Tue, Aug 24, 2010 at 2:55 PM, Conrad Parker &lt;<a href="mailto:conrad@metadecks.org" target="_blank">conrad@metadecks.org</a>&gt; wrote:<br>

<br>

&gt; On 24 August 2010 14:47, Jason Dagit &lt;<a href="mailto:dagit@codersbase.com" target="_blank">dagit@codersbase.com</a>&gt; wrote:<br>

&gt; &gt;<br>

&gt; &gt;<br>

&gt; &gt; On Mon, Aug 23, 2010 at 10:37 PM, Conrad Parker &lt;<a href="mailto:conrad@metadecks.org" target="_blank">conrad@metadecks.org</a>&gt;<br>

&gt; &gt; wrote:<br>

&gt; &gt;&gt;<br>

&gt; &gt;&gt; On 24 August 2010 14:14, Jason Dagit &lt;<a href="mailto:dagit@codersbase.com" target="_blank">dagit@codersbase.com</a>&gt; wrote:<br>

&gt; &gt;&gt; &gt; I&#39;m not a semanticist, so I apologize right now if I say something<br>

&gt; &gt;&gt; &gt; stupid or<br>

&gt; &gt;&gt; &gt; incorrect.<br>

&gt; &gt;&gt; &gt;<br>

&gt; &gt;&gt; &gt; On Mon, Aug 23, 2010 at 9:57 PM, Conal Elliott &lt;<a href="mailto:conal@conal.net" target="_blank">conal@conal.net</a>&gt;<br>

&gt; wrote:<br>

&gt; &gt;&gt; &gt;&gt;&gt;<br>

&gt; &gt;&gt; &gt;&gt;&gt; So perhaps this could be a reasonable semantics?<br>

&gt; &gt;&gt; &gt;&gt;&gt;<br>

&gt; &gt;&gt; &gt;&gt;&gt; Iteratee a = [Char] -&gt; Maybe (a, [Char])<br>

&gt; &gt;&gt; &gt;&gt;<br>

&gt; &gt;&gt; &gt;&gt; I&#39;ve been tinkering with this model as well.<br>

&gt; &gt;&gt; &gt;&gt;<br>

&gt; &gt;&gt; &gt;&gt; However, it doesn&#39;t really correspond to the iteratee interfaces I&#39;ve<br>

&gt; &gt;&gt; &gt;&gt; seen, since those interfaces allow an iteratee to notice size and<br>

&gt; &gt;&gt; &gt;&gt; number of<br>

&gt; &gt;&gt; &gt;&gt; chunks.  I suspect this ability is an accidental abstraction leak,<br>

&gt; &gt;&gt; &gt;&gt; which<br>

&gt; &gt;&gt; &gt;&gt; raises the question of how to patch the leak.<br>

&gt; &gt;&gt; &gt;<br></div></div><div>

&gt; &gt;&gt; &gt; From a purely practical viewpoint I feel that treating the chunking as<br>

&gt; &gt;&gt; &gt; an<br>

&gt; &gt;&gt; &gt; abstraction leak might be missing the point.  If you said, you wanted<br>

&gt; &gt;&gt; &gt; the<br>

&gt; &gt;&gt; &gt; semantics to acknowledge the chunking but be invariant under the size<br>

&gt; or<br>

&gt; &gt;&gt; &gt; number of the chunks then I would be happier.<br>

&gt; &gt;&gt;<br></div><div>

&gt; &gt;&gt; I think that&#39;s the point, ie. to specify what the invariants should<br>

&gt; &gt;&gt; be. For example (to paraphrase, very poorly, something Conal wrote on<br>

&gt; &gt;&gt; the whiteboard behind me):<br>

&gt; &gt;&gt;<br>

&gt; &gt;&gt; run [concat [chunk]] == run [chunk]<br>

&gt; &gt;&gt;<br>

&gt; &gt;&gt; ie. the (a, [Char]) you maybe get from running an iteratee over any<br>

&gt; &gt;&gt; partitioning of chunks should be the same, ie. the same as from<br>

&gt; &gt;&gt; running it over the concatenation of all chunks, which is the whole<br>

&gt; &gt;&gt; input [Char].<br>

&gt; &gt;<br>

&gt; &gt; I find this notation foreign.  I get [Char], that&#39;s the Haskell String<br>

&gt; &gt; type, but what is [chunk]?  I doubt you mean a list of one element.<br>

&gt;<br>

&gt; sorry, that was just my way of writing &quot;the list of chunks&quot; or perhaps<br>

&gt; &quot;the stream of chunks that represents the input&quot;.<br>

&gt;<br>

&gt; Conrad.<br>

&gt;<br>

&gt; &gt;<br>

&gt; &gt;&gt;<br></div><div>

&gt; &gt;&gt; &gt; I use iteratees when I need to be explicit about chunking and when I<br>

&gt; &gt;&gt; &gt; don&#39;t<br>

&gt; &gt;&gt; &gt; want the resources to &quot;leak outside&quot; of the stream processing.  If you<br>

&gt; &gt;&gt; &gt; took<br>

&gt; &gt;&gt; &gt; those properties away, I wouldn&#39;t want to use it anymore because then<br>

&gt; it<br>

&gt; &gt;&gt; &gt; would just be an inelegant way to do things.<br>

&gt; &gt;&gt;<br></div><div>

&gt; &gt;&gt; Then I suppose the model for Enumerators is different than that for<br>

&gt; &gt;&gt; Iteratees; part of the point of an Enumerator is to control the size<br>

&gt; &gt;&gt; of the chunks, so that needs to be part of the model. An Iteratee, on<br>

&gt; &gt;&gt; the other hand, should not have to know the size of its chunks. So you<br>

&gt; &gt;&gt; don&#39;t want to be able to know the length of a chunk (ie. a part of the<br>

&gt; &gt;&gt; stream), but you do want to be able to, say, fold over it, and to be<br>

&gt; &gt;&gt; able to stop the computation at any time (these being the main point<br>

&gt; &gt;&gt; of iteratees ...).<br>

&gt; &gt;<br>

&gt; &gt; I think I agree with that.<br>

&gt; &gt; Jason<br>

&gt;</div></blockquote></div>

</blockquote></div><br>

</div></div></blockquote></div><br></div>