Dealing with fragile C libraries (e.g. MySQL) from Haskell

I spent some time today trying to talk to a MySQL database server from a piece of middleware I'm writing in Haskell. You might think that talking to a database server would be easy, but it turned out to be quite a bother.

Both of the major MySQL bindings, HDBC-mysql and HDBC-odbc, use the libmysqlclient C library behind the scenes. With GHC's unthreaded runtime, which is still the default, an application using either will work fine. However, my middleware app is highly concurrent and uses software transactional memory (STM) to manage some shared state, and I have to use the threaded runtime. This is where my troubles began.

The symptom I observed was that I couldn't even connect to a database:

SqlError {
  seState = "", 
  seNativeError = 2003, 
  seErrorMsg = "Can't connect to MySQL server on 'xxxxx' (4)"}

After enough years of dealing with MySQL, you pick up some useful nuggets such as "the number in parentheses at the end of certain kinds of error message is a Unix errno value" (the library doesn't provide any other way to see what errno caused a failure, amusingly enough). The number 4 is EINTR, indicating that a system call was being interrupted.
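
To make that nugget concrete, here is a minimal sketch that pulls a trailing "(N)" out of an error message and compares it with EINTR. The names trailingErrno and isInterrupted are hypothetical helpers, not part of HDBC or libmysqlclient, and the sketch assumes the message format shown above. Errno values are platform-specific; EINTR happens to be 4 on both Linux and Mac OS X.

```haskell
import Data.Char (isDigit)
import Foreign.C.Error (Errno(..), eINTR)

-- | Extract the number from a trailing "(N)" in an error message,
-- if the message ends that way. (Hypothetical helper.)
trailingErrno :: String -> Maybe Errno
trailingErrno msg =
  case reverse msg of
    ')' : rest ->
      case span isDigit rest of
        (digits@(_:_), '(' : _) -> Just (Errno (read (reverse digits)))
        _                       -> Nothing
    _ -> Nothing

-- | True if the message's trailing errno is EINTR on this platform.
isInterrupted :: String -> Bool
isInterrupted msg = trailingErrno msg == Just eINTR
```

Applied to the seErrorMsg string above, isInterrupted should report that the failure was an interrupted system call, at least on platforms where EINTR is 4.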

I split my development time between a Mac and a Linux laptop, and today's hacking was on a Mac, so I fired up dtruss to see what was wrong:

dtruss -b128m myapp

(I'd much rather have been using Linux here. dtruss is vastly inferior to strace, and in its default configuration it doesn't work at all! The -b128m option is necessary to give its kernel component a large enough scratchpad that it won't run out of space while sampling.)

The interrupted system call was connect, and sure enough, reading the library source code, we can see that the problem lies in the my_connect function:

  /*
    If they passed us a timeout of zero, we should behave
    exactly like the normal connect() call does.
  */

  if (timeout == 0)
    return connect(fd, (struct sockaddr*) name, namelen);

The comment is more or less accurate, but the library should be more careful in its use of the connect function: the caller of my_connect doesn't check for EINTR, and so the connection will fail if the thread receives a signal.

Why is the thread receiving a signal in the first place, though? GHC's threaded RTS sets up either a SIGALRM or SIGVTALRM signal to perform some internal book-keeping at a fairly high frequency, and it's the arrival of this signal that interrupts connect. Failure to check for EINTR and retry is a widespread problem in C code that uses system calls directly.
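
For comparison, this is roughly what "check for EINTR and retry" looks like when a system call is wrapped directly through the Haskell FFI. It is only a sketch: retryOnEINTR is a hypothetical name, and it assumes the wrapped call reports failure by returning -1 and setting errno. Foreign.C.Error already ships wrappers of this shape, such as throwErrnoIfMinus1Retry.

```haskell
import Foreign.C.Error (eINTR, getErrno, throwErrno)
import Foreign.C.Types (CInt)

-- | Run a C-style call that signals failure by returning -1 and
-- setting errno. If it failed with EINTR, run it again; for any other
-- errno, raise an IOError tagged with the given location string.
retryOnEINTR :: String -> IO CInt -> IO CInt
retryOnEINTR location call = do
  result <- call
  if result /= -1
    then return result
    else do
      errno <- getErrno
      if errno == eINTR
        then retryOnEINTR location call   -- interrupted: try again
        else throwErrno location          -- a real failure
```

A blind retry like this only suits calls that are safe to repeat. connect is a poor candidate: POSIX says an interrupted connect carries on asynchronously, so a robust caller would wait for the socket to become writable rather than call connect again. Either way, none of this helps when the call is buried inside a C library that we don't control.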

To work around this, I wrote a simple module that masks the RTS signals that the MySQL client library fails to handle, then performs an action. It ensures that it's running in a bound thread (GHC terminology for a lightweight thread that's tied to a heavyweight system thread) for the duration of the action.
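
The bound thread matters because pthread_sigmask changes the signal mask of the calling OS thread only, and an unbound Haskell thread is not guaranteed to make all of its foreign calls from the same OS thread. As a small sketch of the mechanism (ranInBoundThread is a hypothetical name), the following checks that an action passed to runInBoundThread really does run bound. It guards on rtsSupportsBoundThreads because runInBoundThread fails at runtime unless the program is linked with -threaded.

```haskell
import Control.Concurrent
  (isCurrentThreadBound, rtsSupportsBoundThreads, runInBoundThread)

-- | Nothing if the program was not linked against the threaded RTS;
-- otherwise Just the answer to "did the action run in a bound thread?".
ranInBoundThread :: IO (Maybe Bool)
ranInBoundThread
  | rtsSupportsBoundThreads = Just <$> runInBoundThread isCurrentThreadBound
  | otherwise               = return Nothing
```

With the threaded runtime this should yield Just True, which is the property the signal-masking code relies on: the mask is set, the action runs, and the mask is cleared, all on one OS thread.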

{-# LANGUAGE EmptyDataDecls, ForeignFunctionInterface #-}

module RTSHack (withRTSSignalsBlocked) where

import Control.Concurrent (runInBoundThread)
import Control.Exception (finally)
import Foreign.C.Types (CInt)
import Foreign.Marshal.Alloc (alloca)
import Foreign.Ptr (Ptr, nullPtr)
import Foreign.Storable (Storable(..))

#include <signal.h>

-- Block the RTS timer signals on the current OS thread, run the
-- action, then unblock them again even if the action throws.
withRTSSignalsBlocked :: IO a -> IO a
withRTSSignalsBlocked act = runInBoundThread . alloca $ \set -> do
  sigemptyset set
  sigaddset set (#const SIGALRM)
  sigaddset set (#const SIGVTALRM)
  pthread_sigmask (#const SIG_BLOCK) set nullPtr
  act `finally` pthread_sigmask (#const SIG_UNBLOCK) set nullPtr

-- Opaque stand-in for the C sigset_t type.
data SigSet

-- Only the size and alignment are needed, so that alloca can
-- reserve space for a sigset_t.
instance Storable SigSet where
  sizeOf _ = #{size sigset_t}
  alignment _ = alignment (undefined :: Ptr CInt)

foreign import ccall unsafe "signal.h sigaddset" sigaddset
  :: Ptr SigSet -> CInt -> IO ()

foreign import ccall unsafe "signal.h sigemptyset" sigemptyset
  :: Ptr SigSet -> IO ()

foreign import ccall unsafe "signal.h pthread_sigmask" pthread_sigmask
  :: CInt -> Ptr SigSet -> Ptr SigSet -> IO ()
Posted in haskell, linux, open source
Comments on “Dealing with fragile C libraries (e.g. MySQL) from Haskell”
  1. Benoit says:

    Funny the article following this post in my RSS was about the same issue 🙂

  2. Gabriel Wicke says:

    An easier solution is available for idempotent syscalls directly wrapped through the FFI: Foreign.C.Error has some handy retry wrappers.

    Not that this would help for signal-unsafe C libraries, but might still be useful to somebody new to the FFI.

  3. I had this MySQL error (2013 in my case) using HDBC-mysql even when compiling without -threaded. I then used your function, which is included in HDBC-mysql, to protect against the signal issue with libmysqlclient. I had to compile explicitly with -threaded even though I don’t need it, probably because of runInBoundThread, I imagine (it throws an exception at runtime).

    When I ran strace on a simple program on Ubuntu 14.04, I saw that SA_RESTART is set when the SIGVTALRM handler is installed. Shouldn’t that let long system calls in libmysqlclient be restarted automatically instead of returning EINTR?

    Does that mean that other libraries like MySQL-simple are vulnerable too?
