Here's a way I've been tinkering with to think about iteratees clearly.<br><br>For simplicity, I'll stick with pure, error-free iteratees for now, and take chunks to be strings. Define a function that runs the iteratee:<br>
<br>> runIter :: Iteratee a -> [String] -> (a, [String])<br><br>Note that chunking is explicit here.<br><br>Next, define a relation saying that an iteratee implements a given specification, where the specification is a state transformer:<br>
<br>
> sat :: Iteratee a -> State String a -> Bool<br><br>Define sat in terms of concatenating chunks:<br><br>> sat it st =<br>>   second concat . runIter it == runState st . concat<br><br>where the equality on the right-hand side is between functions (pointwise, i.e. extensional), and runState exposes the representation of State directly:<br>
<br>> runState :: State s a -> s -> (a,s)<br><br>(I think this sat definition is what Conrad was alluding to.)<br><br>Now use sat to specify and verify operations on iteratees and to *synthesize* those operations from their specifications. Some iteratees might not satisfy *any* (State-based) specification. For instance, an iteratee could look at the lengths or number of its chunks and produce results accordingly. I think of such iteratees as abstraction leaks. Can the iteratee vocabulary be honed to make only well-behaved (specifiable) iteratees possible to express? If so, can we preserve performance benefits?<br>
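<br>To make the definitions above concrete, here is a minimal sketch under stated assumptions, not a definitive implementation. The Iteratee representation (constructors Done and Cont), the locally defined State newtype, and the names satOn, sameOnChunkings, sampleInputs, headIter, headSpec and firstChunkLen are all hypothetical and chosen only for illustration. Since equality of functions cannot be decided in general, satOn checks the sat equation only on a finite list of sample inputs.<br>

```haskell
import Data.Bifunctor (second)

-- Hypothetical toy model: an iteratee is either finished, holding a result
-- and the unconsumed rest of the current chunk, or waiting for more input.
-- 'Nothing' signals end of input.
data Iteratee a
  = Done a String
  | Cont (Maybe String -> Iteratee a)

-- Feed chunks one at a time; return the result and the leftover chunks.
runIter :: Iteratee a -> [String] -> (a, [String])
runIter (Done x rest) cs     = (x, rest : cs)
runIter (Cont k)      (c:cs) = runIter (k (Just c)) cs
runIter (Cont k)      []     = case k Nothing of
  Done x rest -> (x, [rest])
  Cont _      -> error "iteratee did not finish at end of input"

-- State defined locally (an assumption) so the sketch needs only base.
newtype State s a = State { runState :: s -> (a, s) }

-- The sat equation, checked pointwise on finitely many chunk lists.
satOn :: Eq a => [[String]] -> Iteratee a -> State String a -> Bool
satOn samples it st =
  all (\cs -> second concat (runIter it cs) == runState st (concat cs)) samples

-- Do two chunkings give the same result once leftovers are concatenated?
sameOnChunkings :: Eq a => Iteratee a -> [String] -> [String] -> Bool
sameOnChunkings it cs1 cs2 =
  second concat (runIter it cs1) == second concat (runIter it cs2)

sampleInputs :: [[String]]
sampleInputs = [[], [""], ["abc"], ["a", "bc"], ["", "ab", "c"]]

-- Well-behaved: read the first character, skipping empty chunks.
headIter :: Iteratee (Maybe Char)
headIter = Cont step
  where
    step Nothing       = Done Nothing ""
    step (Just "")     = Cont step
    step (Just (c:cs)) = Done (Just c) cs

-- Its specification as a state transformer on the whole input.
headSpec :: State String (Maybe Char)
headSpec = State $ \s -> case s of
  []     -> (Nothing, [])
  (c:cs) -> (Just c, cs)

-- Leaky: the result depends on how the input happens to be chunked.
firstChunkLen :: Iteratee Int
firstChunkLen = Cont step
  where
    step Nothing  = Done 0 ""
    step (Just c) = Done (length c) ""
```

In this sketch, satOn sampleInputs headIter headSpec holds, whereas sameOnChunkings firstChunkLen ["ab","c"] ["abc"] is False. Both chunkings concatenate to "abc", so any State-based specification would have to give them the same answer, and therefore no specification can be satisfied by firstChunkLen.<br>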
<br>If indeed the abstraction leaks can be fixed, I expect there will be a simpler & more conventional semantics than sat above.<br><br> - Conal<br><br><br><div class="gmail_quote">On Tue, Aug 24, 2010 at 2:55 PM, Conrad Parker <span dir="ltr"><<a href="mailto:conrad@metadecks.org">conrad@metadecks.org</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin: 0pt 0pt 0pt 0.8ex; border-left: 1px solid rgb(204, 204, 204); padding-left: 1ex;"><div><div></div><div class="h5">On 24 August 2010 14:47, Jason Dagit <<a href="mailto:dagit@codersbase.com">dagit@codersbase.com</a>> wrote:<br>
><br>
><br>
> On Mon, Aug 23, 2010 at 10:37 PM, Conrad Parker <<a href="mailto:conrad@metadecks.org">conrad@metadecks.org</a>><br>
> wrote:<br>
>><br>
>> On 24 August 2010 14:14, Jason Dagit <<a href="mailto:dagit@codersbase.com">dagit@codersbase.com</a>> wrote:<br>
>> > I'm not a semanticist, so I apologize right now if I say something<br>
>> > stupid or<br>
>> > incorrect.<br>
>> ><br>
>> > On Mon, Aug 23, 2010 at 9:57 PM, Conal Elliott <<a href="mailto:conal@conal.net">conal@conal.net</a>> wrote:<br>
>> >>><br>
>> >>> So perhaps this could be a reasonable semantics?<br>
>> >>><br>
>> >>> Iteratee a = [Char] -> Maybe (a, [Char])<br>
>> >><br>
>> >> I've been tinkering with this model as well.<br>
>> >><br>
>> >> However, it doesn't really correspond to the iteratee interfaces I've<br>
>> >> seen, since those interfaces allow an iteratee to notice size and<br>
>> >> number of<br>
>> >> chunks. I suspect this ability is an accidental abstraction leak,<br>
>> >> which<br>
>> >> raises the question of how to patch the leak.<br>
>> ><br>
>> > From a purely practical viewpoint I feel that treating the chunking as<br>
>> > an<br>
>> > abstraction leak might be missing the point. If you said, you wanted<br>
>> > the<br>
>> > semantics to acknowledge the chunking but be invariant under the size or<br>
>> > number of the chunks then I would be happier.<br>
>><br>
>> I think that's the point, ie. to specify what the invariants should<br>
>> be. For example (to paraphrase, very poorly, something Conal wrote on<br>
>> the whiteboard behind me):<br>
>><br>
>> run [concat [chunk]] == run [chunk]<br>
>><br>
>> ie. the (a, [Char]) you maybe get from running an iteratee over any<br>
>> partitioning of chunks should be the same, ie. the same as from<br>
>> running it over the concatenation of all chunks, which is the whole<br>
>> input [Char].<br>
><br>
> I find this notation foreign. I get [Char], that's the Haskell String<br>
> type, but what is [chunk]? I doubt you mean a list of one element.<br>
<br>
</div></div>sorry, that was just my way of writing "the list of chunks" or perhaps<br>
"the stream of chunks that represents the input".<br>
<font color="#888888"><br>
Conrad.<br>
</font><div><div></div><div class="h5"><br>
><br>
>><br>
>> > I use iteratees when I need to be explicit about chunking and when I<br>
>> > don't<br>
>> > want the resources to "leak outside" of the stream processing. If you<br>
>> > took<br>
>> > those properties away, I wouldn't want to use it anymore because then it<br>
>> > would just be an inelegant way to do things.<br>
>><br>
>> Then I suppose the model for Enumerators is different than that for<br>
>> Iteratees; part of the point of an Enumerator is to control the size<br>
>> of the chunks, so that needs to be part of the model. An Iteratee, on<br>
>> the other hand, should not have to know the size of its chunks. So you<br>
>> don't want to be able to know the length of a chunk (ie. a part of the<br>
>> stream), but you do want to be able to, say, fold over it, and to be<br>
>> able to stop the computation at any time (these being the main point<br>
>> of iteratees ...).<br>
><br>
> I think I agree with that.<br>
> Jason<br>
</div></div></blockquote></div><br>