Database interface - would like advice on oracle library binding

Wed Sep 24 13:08:15 EDT 2003

Bayley, Alistair writes:
 :
 | Still making slow progress on an Oracle database binding... now I'm trying
 | to fit the API I have into some sort of abstract interface (like the one(s)
 | discussed previously:
 |  http://haskell.org/pipermail/haskell-cafe/2003-August/004957.html ).
 | 
 | 
 | 1. Is the left-fold the best/only interface to expose? I think yes,
 | but that doesn't allow fine-grained control over cursors i.e. being
 | able to open many cursors at once and interleave fetches from
 | them. Or does it?

It looks like the interleaving would be limited to a nested loop
structure: a cursor could be processed in full during one extraction
for another cursor.

Application-side nested loop structures are often a poor substitute
for server-side joins.

 | 2. I'm finding it hard to write a doQuery function that takes an
 | extraction function that isn't a pig to write. Some advice would be
 | useful here... (and a long-ish explanation follows):
 :

Here's my attempt to summarise the piggishness you describe:

    The interface to Oracle requires that you initialise a cursor by
    allocating a suitably typed buffer for each column prior to
    fetching the first row, and finalise a cursor by freeing those
    buffers after fetching the last row.

    This means that we must iterate over the columns 3 times.  We
    would prefer to express this iteration only once, and have the
    other 2 happen automatically within the library.  (As opposed to
    what ex3 does, which is to iterate for getExtractFnString, iterate
    for fetchcolN, and iterate for freecolN.)

Here's one approach: find the OCI equivalent of JDBC's
ResultSetMetaData, and use it to drive the allocation and freeing of
buffers.

Here's another:

    Add a mode attribute to the abstract data type which encompasses
    ErrorHandle and StmtHandle.  (I'll persist in referring to that
    ADT as Cursor.)

    Expect extraction functions to be written along these lines:

        \cursor result
           -> do field1 <- getInt    cursor
                 field2 <- getString cursor
                 field3 <- getString cursor
                 return ((field1, field2, field3):result, True)

    Make getInt (and friends) behave differently depending on the mode
    of the cursor they're passed: either allocate a buffer and return
    _|_, decode and return the current column of the current row, or
    free a buffer and return _|_.

    doQuery could then apply the extraction function once in Allocate
    mode after opening the cursor, once per fetched row in Decode
    mode, and once in Free mode at the end.

There's nothing to stop an extraction function from varying the number
of get___ functions it applies, or trying to match their results when
not in Decode mode.  These weakness could be mitigated by:

    Pointing out that some database connection standards (JDBC, and
    for all I know also ODBC) don't guarantee that you can still get
    at a row's 1st column after you've looked at its 2nd column,
    i.e. there's a precedent for such fragility.

    Complicating the extraction functions by giving them the type

        (Cursor -> b -> IO (IO (b, Bool)))

    , expecting that all the get___ functions are applied in the outer
    IO layer, and undertaking that the inner IO layer will only be
    used in Decode mode.

Regards,
Tom