Dealing with fragile C libraries (e.g. MySQL) from Haskell

I spent some time today trying to talk to a MySQL database server from a piece of middleware I'm writing in Haskell. You might think that talking to a database server would be easy, but it turned out to be quite a bother.

Both of the major MySQL bindings, HDBC-mysql and HDBC-odbc, use the libmysqlclient C library behind the scenes. With GHC's unthreaded runtime, which is still the default, an application using either will work fine. However, my middleware app is highly concurrent and uses software transactional memory (STM) to manage some shared state, and I have to use the threaded runtime. This is where my troubles began.

The symptom I observed was that I couldn't even connect to a database:

SqlError {
  seState = "", 
  seNativeError = 2003, 
  seErrorMsg = "Can't connect to MySQL server on 'xxxxx' (4)"}

After enough years of dealing with MySQL, you pick up some useful nuggets such as "the number in parentheses at the end of certain kinds of error message is a Unix errno value" (the library doesn't provide any other way to see what errno caused a failure, amusingly enough). The number 4 is EINTR, indicating that a system call was interrupted.
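As a small illustration of that nugget, here is a sketch in C that pulls the trailing number out of such a message so it can be compared against errno constants. The helper name trailing_errno is made up for this example and is not part of any MySQL API; it also assumes the message really does end in "(N)", which only holds for some kinds of error.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns the number inside the last "(N)" found in
 * msg, or -1 if there is no such well-formed group. */
int trailing_errno(const char *msg)
{
    const char *open = strrchr(msg, '(');   /* last opening parenthesis */
    if (open == NULL)
        return -1;

    char *end;
    long n = strtol(open + 1, &end, 10);

    /* Reject "()", a missing ')', negative values and overflow. */
    if (end == open + 1 || *end != ')' || n < 0 || n > INT_MAX)
        return -1;
    return (int)n;
}
```

Passing the result to strerror() gives a readable description, and comparing it with EINTR confirms the diagnosis on platforms where EINTR is 4 (Linux and macOS both are).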

I split my development time between a Mac and a Linux laptop, and today's hacking was on a Mac, so I fired up dtruss to see what was wrong:

dtruss -b128m myapp

(I'd much rather have been using Linux here. dtruss is vastly inferior to strace, and in its default configuration it doesn't work at all! That -b128m is necessary to give its kernel component enough of a scratchpad that it won't run out of space while sampling.)

The interrupted system call was connect, and sure enough, reading the library source code, we can see that the problem lies in the my_connect function:

  /*
    If they passed us a timeout of zero, we should behave
    exactly like the normal connect() call does.
  */

  if (timeout == 0)
    return connect(fd, (struct sockaddr*) name, namelen);

The comment is more or less accurate, but the library should be more careful in its use of the connect function: the caller of my_connect doesn't check for EINTR, and so the connection will fail if the thread receives a signal.

Why is the thread receiving a signal in the first place, though? GHC's threaded RTS arranges for either SIGALRM or SIGVTALRM to be delivered at a fairly high frequency so that it can perform some internal book-keeping, and it's the arrival of this signal that interrupts connect. Failure to check for EINTR and retry is a widespread problem in C code that uses system calls directly.

To work around this, I wrote a simple module that masks the RTS signals that the MySQL client library fails to handle, then performs an action. It ensures that it's running in a bound thread (GHC terminology for a lightweight thread that's tied to a heavyweight system thread) for the duration of the action.

{-# LANGUAGE EmptyDataDecls, ForeignFunctionInterface #-}

-- This is an hsc2hs source file: the include, const and size directives
-- below are expanded by hsc2hs before GHC sees the module.

module RTSHack (withRTSSignalsBlocked) where

import Control.Concurrent (runInBoundThread)
import Control.Exception (finally)
import Foreign.C.Types (CInt)
import Foreign.Marshal.Alloc (alloca)
import Foreign.Ptr (Ptr, nullPtr)
import Foreign.Storable (Storable(..))

#include <signal.h>

-- | Run an action in a bound thread with the RTS timer signals blocked,
-- unblocking them again when the action finishes or throws.
withRTSSignalsBlocked :: IO a -> IO a
withRTSSignalsBlocked act = runInBoundThread . alloca $ \set -> do
  sigemptyset set
  sigaddset set (#const SIGALRM)
  sigaddset set (#const SIGVTALRM)
  pthread_sigmask (#const SIG_BLOCK) set nullPtr
  act `finally` pthread_sigmask (#const SIG_UNBLOCK) set nullPtr

-- | Opaque stand-in for the C sigset_t type.
data SigSet

-- Only sizeOf and alignment are defined, which is all that alloca needs.
instance Storable SigSet where
  sizeOf    _ = #{size sigset_t}
  alignment _ = alignment (undefined :: Ptr CInt)

foreign import ccall unsafe "signal.h sigaddset" sigaddset
  :: Ptr SigSet -> CInt -> IO ()

foreign import ccall unsafe "signal.h sigemptyset" sigemptyset
  :: Ptr SigSet -> IO ()

foreign import ccall unsafe "signal.h pthread_sigmask" pthread_sigmask
  :: CInt -> Ptr SigSet -> Ptr SigSet -> IO ()

2 comments on “Dealing with fragile C libraries (e.g. MySQL) from Haskell”
  1. Benoit says:

    Funny the article following this post in my RSS was about the same issue :)

    http://factor-language.blogspot.com/2010/09/two-things-every-unix-developer-should.html

  2. Gabriel Wicke says:

    An easier solution is available for idempotent syscalls directly wrapped through the FFI: Foreign.C.Error has some handy retry wrappers.

    Not that this would help for signal-unsafe C libraries, but might still be useful to somebody new to the FFI.
