What’s good for C++ is good for … Haskell!?

A few days ago, my Facebook colleague Andrei Alexandrescu posted a note entitled Three Optimization Tips for C++, which reminded me that I had unfinished business with Haskell’s text package.

I took his code, applied it to the text package, and this is the story of what happened.

The text package provides a type named Builder for efficiently constructing Unicode strings from smaller fragments. A couple of years after Johan contributed the initial Builder implementation, I wrote some helper functions to perform recurring tasks, such as rendering numbers.

In my initial number renderer, I put no effort into making my code fast—as this snippet demonstrates. Simplicity first.

positive :: (Integral a) => a -> Builder
positive = go
    go n | n < 10    = digit n
         | otherwise = go (n `quot` 10) <> digit (n `rem` 10)
    digit n = singleton . intTodigit . fromIntegral

Having gotten the code working, I promptly forgot about it for a while. When I saw Andrei’s note nine months later, it tickled my memory: didn’t I have code that I could possibly improve? Indeed, when I took a look at my number rendering code, I found that I hadn’t even bothered to write benchmarks to measure its performance.

Before I discuss how I improved it, a brief description of how a Builder works is in order. A Builder provides a safe way to destructively write data into a list of fixed-size buffers, and to convert the result into an immutable Text value. External users of Builder never see the mutable buffers; they can only use safe access methods.

The code above uses the safe API. Each use of singleton is bounds-checked internally: if a write will not fit into the current buffer, Builder “finalizes” that buffer (putting it at the tail of the list), allocates a new one, and starts writing there. The <> operator sequences the writes.

While this implementation is correct, it is far from fast, partly due to the overhead of performing a bounds check for every character to be written out. In fact, this approach is almost always slower than simply using the show function, then converting the resulting [Char] value to a Builder!

My first observation was that if I knew the number of digits I needed up front, I could perform just one buffer-size check, instead of a separate check prior to rendering each digit.

countDigits :: (Integral a) => a -> Int
countDigits v0 = go 1 (fromIntegral v0 :: Word64)
  where go !k v
           | v < 10    = k
           | v < 100   = k + 1
           | v < 1000  = k + 2
           | v < 10000 = k + 3
           | otherwise = go (k+4) (v `quot` 10000)

This is almost identical to Andrei’s second digits10 function.

Once I was able to count digits, I could use the internal writeN function to destructively update the buffer. writeN ensures that a buffer is available with at least the requested amount of space; it then calls a user-supplied function, giving it the buffer to write to (marr) and the position to start writing at (off).

positive :: (Integral a) => a -> Builder
positive i
    -- we win when we special-case single-digit numbers
    | i < 10    = writeN 1 $ \marr off ->
                  unsafeWrite marr off (i2w i)
    | otherwise = let !n = countDigits i
                  in writeN n $ \marr off ->
                     posDecimal marr off n i

posDecimal :: (Integral a) =>
              forall s. MArray s -> Int -> Int -> a -> ST s ()
posDecimal marr off0 ds v0 = go (off0 + ds - 1) v0
  where go off v
          | v < 10 = unsafeWrite marr off (i2w v)
          | otherwise = do
              unsafeWrite marr off (i2w (v `rem` 10))
              go (off-1) (v `div` 10)

i2w :: (Integral a) => a -> Word16
i2w v = 48 + fromIntegral v

I then took Andrei’s very clever third digits10 function and translated that to Haskell. Syntax apart, there is a small difference between his function and mine: his is recursive, while mine is tail recursive (i.e. a loop).

countDigits :: (Integral a) => a -> Int
countDigits v0 = go 1 (fromIntegral v0 :: Word64)
    go !k v
      | v < 10    = k
      | v < 100   = k + 1
      | v < 1000  = k + 2
      | v < 1000000000000 =
          k + if v < 100000000
              then if v < 1000000
                   then if v < 10000
                        then 3
                        else 4 + fin v 100000
                   else 6 + fin v 10000000
              else if v < 10000000000
                   then 8 + fin v 1000000000
                   else 10 + fin v 100000000000
      | otherwise = go (k + 12) (v `quot` 1000000000000)
   fin v n = if v >= n then 1 else 0

(To be sure my intuition was correct, I did indeed measure recursive against tail recursive versions of my Haskell translation, and tail recursion wins by a few percent here.)

While this countDigits function helped performance by quite a bit, there was another step remaining in following Andrei’s example: converting two digits at a time.

posDecimal :: (Integral a) =>
              forall s. MArray s -> Int -> Int -> a -> ST s ()
posDecimal marr off0 ds v0 = go (off0 + ds - 1) v0
  where go off v
           | v >= 100 = do
               let (q, r) = v `quotRem` 100
               write2 off r
               go (off - 2) q
           | v < 10    = unsafeWrite marr off (i2w v)
           | otherwise = write2 off v
        write2 off i0 = do
          let i = fromIntegral i0; j = i + i
          unsafeWrite marr off $ get (j + 1)
          unsafeWrite marr (off - 1) $ get j
        get = fromIntegral . B.unsafeIndex digits

A final surprise came when I decided to try an experiment: what if I replaced the separate uses of quot and rem with a single use of quotRem? This improved performance by a further 30% on large 64-bit numbers! Why such a big difference? Because quotRem can often be emitted as a single machine instruction instead of two, and division is expensive enough that in a hot loop, this helps a lot.

Many modern optimizing compilers can spot this kind of opportunity automatically. Although GHC’s optimizer performs many complex high-level transformations, its machinery for handling low-level optimizations is currently weak. (This is why you’ll see a few cases of v+v instead of v*2 above, where I’m strength-reducing operations by hand instead of trusting the compiler.)

I was not at all surprised that Andrei’s optimization tips should translate perfectly to Haskell, as most of what he says has nothing to do with C++ itself. His advice should apply well to any language that provides low-level access to the machine and ends up running as native code. Since Haskell is often not understood to be a perfectly good imperative language, it’s easy for less experienced programmers to overlook the relevance of low-level concerns to Haskell performance.

The end result of all this tweaking is that the new number renderer in the text package is not just much faster than its predecessor, it is also a lot faster than the venerable show function. We retain the same API, external immutability, and type safety as before, but have a very nice five-fold increase in performance to show for our efforts!

Posted in haskell
16 comments on “What’s good for C++ is good for … Haskell!?
  1. Kirill Zaborsky says:

    I suppose that “Access denied” – https://docs.google.com/file/d/0B_CUoZk6rWOIejN5NDZYRmxGcTA/edit isn’t the right picture?

  2. Oops, fixed. Thanks!

  3. Boris Lykah says:

    New primops were created for quotRem as a solution for GHC issue 5598 with duplicate divisions. So, if quot and rem are used separately, there will be still two primops and two calculations which might be joined by LLVM, but not by NCG.

  4. Yves Pares says:

    This may be overrated, depending on how log performs but you can count digits this way :

    baseLog = log 10 — computed once and reused

    countDigits :: (Integral a) => a -> Int
    countDigits x = log x/baseLog + 1

  5. A picture of the bride smiling at you may be pretty, but it is also
    mundane. Another reason for the myth is that
    it seems logical that sitting on a seat and then repeatedly closing and opening your legs against resistance
    will burn up fat in your thighs, the very areas where you feel “the burn. ” As we mentioned earlier, everybody has cellulite, even the skinniest of people.

    My blog post – how to get thigh gap fast

  6. These give you an idea of what is meant by the requirements of the plan. This a match made in heaven as grains contain low quantities of another essential amino acid lysine, which pulses do contain.
    Allot of people have a hard time starting a diet journey, first step is to buy walking shoes.

    My website how to lose belly fat

  7. Let me start this by stating that I have never performed an Indignant Birds game before doing
    this overview. Do not use more or less effort on longer
    or shorter putts simple make your pendulum -type stroke larger or smaller.
    Mental condition characterized by perceptions that are abnormal and expressions of reality is called as Schizophrenia.

    Here is my blog: geometry dash hack

  8. It would be nice if a new game was released with no
    problems, no glitches, and no issues of any kind. But eventually, its supporters
    were able to spread the news of the tasty new treat
    and won over its detractors. Dialing this number provides you
    with the baseball bat, handgun, shotgun, MP5, M4, sniper rifle, RPG, and grenades.

    Feel free to surf to my web site … Telecharger Gta 5 gratuit pc complet

  9. For this reason it is the best to schedule the 3D ultrasound exam after the 25th week of pregnancy.
    Players can also own two properties at the same time with the upcoming ‘GTA Online’ High Life Update.
    Live concerts often incorporate a wide range of visual effects and lighting
    systems which enhance the excitement of an artist

    Feel free to visit my web site; telecharger gta 5 pc gratuit complet en francais utorrent

  10. funny jokes says:

    Also, some of their wittiness is sure to brush off on you.
    These are exciting gag gift suggestions you may consider that will certainly
    entertain anyone on his or her 75th birthday.

    You can have a superhero costume party in which the birthday celebrant
    himself wears a groovy costume; an old-fashioned children’s celebration where magicinas, clowns, mascots, and mimes, and other entertainers perform; and many other interesting ideas.

    Feel free to visit my web-site :: funny jokes

  11. Humanity may not find it easy to eat what ancestors ate, but remember old is
    gold and will always remain no, no matter
    what is the era. Comparing Organic, Free-Range and Commercial Chickens.
    Crossfit is sweeping the globe and athletes are picking up this new method of
    working out to train their bodies more efficiently.

    My webpage: paleo diet recipes

  12. My “personal” model ran less than $70 and my
    older unit ran about $250 at the time, but has come down in price over the years and is around $200-$220 range now.
    For a free option, Celtx is a favorite of penniless screenwriters and film students.

    Due to frequent updates, Malwarebytes’ Anti-Malware can find and
    usually remove infections and rouge programs soon after they become
    an issue.

  13. Longchamp Le Pliage Tote With Eiffel Tower.Discount Longchamp Paris Longchamp Bag vSTOb Ricordate che intorno
    al collo e fatto il segno per la torta (non aveva posto per scrivere la
    parola che si vede). Utilizzare questo come un coiite premier qui passo il progresso, anche se
    lentamente in un primo momento, sia direttamente
    e cumulativamente. Online Shop FüR Longchamp Taschen Paris
    Department Store Longchamp yMkFO Nei prossimi
    mesi, Sony Playstation e Nintendo Game Boy è una
    serie di videogiochi sul mercato per i KISS. BACIO fumetti sarà presto invaderanno il mercato, e la
    band e fan scatenati (Detroit Rock City), un grande film dovrebbe essere rilasciato questa primavera..
    Longchamp Shoulder Bag Sales Longchamp Outlet United States2013 Longchamp Bags URpHJ Sito Nike crea un senso di unità dove tutti connessi e tutto scorre
    insieme. Questa è una zona che è anche semplice e sottile quando presentano la loro tesi.
    Longchamp Bags Shop Online Prices For Longchamp Racecourse xlGTZ Un’altra cosa che abbiamo davvero
    bisogno era banalità fumetti cliché, ma abbiamo fatto comunque.
    Sì, è sottolineato in innumerevoli storie di Sci Fi / Avventura / Fantasy
    prima, ha una grande capacità di violenza è grande, ma
    la capacità dei salti sono più grandi. Paris Longchamp Store Paris
    Longchamps Hippodrome hRFiL (In Giappone e Corea del Sud, il marchio North Face di proprietà di terzi.) North dispone di abbigliamento
    composto da tuta sportiva, sport invernali e sportswear funzionale e calzature per
    uomo, donna e bambini. La sua linea apparecchiatura consiste di tende, sacchi a pelo, zaini,
    Daypacks e accessori.

    Pink Longchamp Backpack Paris Longchamp Flagship Store XdXoV 11/20/10, p.19)!
    Produce cancellare una sentenza a favore di Bohr, Bell, Aspect, et al.
    Anche se leggera sfocatura di scarpe di Einstein può essere liquidato come
    un difetto lente, le scarpe di Bohr, ovviamente privo posizione
    specifica e momentum.I goduto i due lunghi articoli su meccanica quantistica.
    Shop For Longchamp Bags Longchamp Pliage Bag Sizes gyzsv Fortunatamente per gli acquirenti in città
    ragazzo, libro Peres ‘, Manuale Dettagli uomini di stile (Gotham,
    $ 30), ricco di consigli su come scegliere le basi giuste.
    Come integratore, abbiamo chiesto a Peres
    di scegliere i loro cinque dischi preferiti articoli supplementari.
    Small Longchamp Tote Purses Longchamp Le Pliage xBsyb
    Mentre stivali invernali possono essere proprio questo, gli stivali indossati in inverno,
    stivali da neve hanno fatto il termine che descrive un certo tipo di stivali appositamente per l’utilizzo
    in condizioni di neve o bagnati. Conoscere la differenza se
    chiedete aiuto in un negozio di scarpe, o quando si
    acquista online..

  14. Bill says:

    Today, I went to the beach front with my children. I found a sea shell
    and gave it to my 4 year old daughter and said “You can hear the ocean if you put this to your ear.” She placed the shell to her ear and screamed.
    There was a hermit crab inside and it pinched her ear.
    She never wants to go back! LoL I know this is entirely off topic
    but I had to tell someone!

  15. I was recommended this web site by my cousin. I am not sure whether this post is
    written by him as nobody else know such detailed about my difficulty.
    You’re wonderful! Thanks!

  16. xxx says:

    Do you have any video of that? I’d want to find out more details.

Leave a Reply

Your email address will not be published. Required fields are marked *


You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>