unix-bytestring v0.2.0 (Was: Re: Proposal: add ByteString support to unix:System.Posix.IO API)

wren ng thornton wren at freegeek.org
Mon Mar 7 23:52:44 CET 2011


On 3/6/11 12:22 PM, Bas van Dijk wrote:
> On 6 March 2011 13:16, wren ng thornton<wren at freegeek.org>  wrote:
>> One thing I considered was to offer both versions, one for compatibility
>> with the old string versions (and for decluttering client code)[1] and then
>> one that just gives the bytestring. Of course, then the issue is what to
>> name them...
>>
>> So the big question is: how minimal should we be, vs how much in the way of
>> convenience functions should we offer?
>
> It would be great if you could analyse the direct reverse dependencies
> of unix to see how much fdRead is used and how much work it is to
> adapt packages to an fdRead which only returns the ByteString:
>
> http://bifunctor.homelinux.net/~roel/cgi-bin/hackage-scripts/revdeps/unix-2.4.2.0#direct

What an excellent idea!

It looks like unix has about 234 direct revdeps. Of which, some hasty 
grepping detects only 26 files which may call fdRead or fdWrite. Namely:

./HFuse-0.2.3/System/Fuse.hsc
./bindings-bfd-0.2.0/src/Bindings/Bfd/Disasm.hs
./bindings-bfd-0.2.0/src/Bindings/Bfd/Symbol.hsc
./cautious-file-0.1.5/src/System/Posix/ByteLevel.hsc
./epoll-0.2.2/examples/Stdin.hs
./epoll-0.2.2/src/System/Linux/Epoll/Buffer.hs
./funion-0.0.2/Funion.hs
./halfs-0.2/Binary.hs
./halfs-0.2/System/RawDevice/File.hs
./iteratee-0.8.1.2/src/Data/Iteratee/IO/Fd.hs
./iteratee-0.8.1.2/src/Data/Iteratee/IO/Posix.hs
./iteratee-mtl-0.5.0.0/src/Data/Iteratee/IO/Fd.hs
./iteratee-mtl-0.5.0.0/src/Data/Iteratee/IO/Posix.hs
./lhc-0.10/lib/base/src/Control/Concurrent.hs
./lhc-0.10/lib/base/src/GHC/IO/FD.hs
./lhc-0.10/src/Grin/Eval/Primitives.hs
./lhc-0.10/src/Grin/Stage2/Backend/C.hs
./liboleg-2010.1.10.0/System/IterateeM.hs
./liboleg-2010.1.10.0/System/LowLevelIO.hs
./liboleg-2010.1.10.0/System/RandomIO.hs
./liboleg-2010.1.10.0/System/SysOpen.hs
./miniplex-0.3.4/lib/System/Miniplex/Sink.hs
./serialport-0.3.3/System/Hardware/Serialport/Posix.hsc
./vty-4.6.0.4/src/Graphics/Vty/LLInput.hs
./ztail-1.0.1/TailHandle.hs

Of these, many are false positives due to some local function called 
fdReady (and ignored hereafter), many are false positives due to local 
definitions of fdRead, myfdRead, etc. (discussed later), and only about 
half a dozen appear to be direct calls to the unix package version.

Of these half dozen files, most of the use sites completely ignore the 
ByteCount, two or three use it only to check (==0) which can be 
efficiently detected by either String or ByteString's null predicate, 
and only two of them appear to use it in any nontrivial way (e.g., 
returning it from the current function, or printing it).

Thus, I conclude, nobody wants the ByteCount. The switch from String to 
ByteString would involve far more work than correcting literally a 
couple pattern matches per affected project. And less than half a dozen 
use sites across all of Hackage might consider calling BS.length to get 
at the information.


Of the local definitions there were two classes I noticed.

* The HFuse project defines their own bindings to the pread(2) and 
pwrite(2) functions, thus answering my question about whether anyone'd 
want them. They use them as handling ByteString buffers too, no less!

* The liboleg, iteratee, and iteratee-mtl packages all have copies of 
their own versions of fdRead (apparently a cut&paste job). In the 
documentation they level numerous complaints against the unix package's 
version of fdRead. The two notable ones are

(1) that fdRead allocates a new buffer every time it's called, this 
seems to be addressed by fdReadBuf but the packages haven't been updated 
to use it; and

(2) that fdRead throws errors (at all, let alone on EOF). They'd prefer 
the type:

     myfdRead
         :: Fd
         -> Ptr CChar
         -> ByteCount
         -> IO (Either Errno ByteCount)

Considering that fdReadBuf already addresses the first complaint, it 
seems that it might be worthwhile to provide a safe version of fdReadBuf 
which captures the error in an Either rather than throwing it (should be 
cheaper than throwing it and having a wrapper catch it and convert it 
into Errno?). Similarly, it may be worthwhile to have a version of 
fdRead which doesn't throw an exception on EOF, but just returns the 
empty ByteString instead.

-- 
Live well,
~wren



More information about the Libraries mailing list