The basics of applicative functors, put to practical work

Applicative functors are gorgeous and versatile creatures, but as is common in Haskell, they lack a little in documentation. The paper that Conor and Ross wrote introducing them is good, but dense. What if we were to skip all the scene-setting kerfuffle, and plunge into using them by example?

I won’t attempt to describe what applicative functors actually are, because the idea is easy to absorb: we’ll pick it up as an incidental product of figuring out how to use them. Bear with me.

To begin with, we need a motivating example, and we’ll choose something simple, yet chewy: parsing an application/x-www-form-urlencoded string. Remember, they look like this:

name=bryan+o%27sullivan&city=san+francisco

While we’ll use Parsec as our parsing library, we’ll give it a completely new skin.

module ApplicativeParsec
    (
      -- Re-export the contents of these modules, for convenience.
      module Control.Applicative
    , module Text.ParserCombinators.Parsec
    ) where

import Control.Applicative
import Control.Monad (MonadPlus(..), ap)
-- Hide Parsec's definitions of some Applicative functions.
import Text.ParserCombinators.Parsec hiding (many, optional, (<|>))

-- Every Monad is an Applicative.
instance Applicative (GenParser s a) where
    pure = return
    (<*>) = ap

-- Every MonadPlus is an Alternative.
instance Alternative (GenParser s a) where
    empty = mzero
    (<|>) = mplus

We’ve made two claims in the code above. We’ve said that a Parsec parser is an applicative functor (you know, whatever that is), and that we can chain parsers into a series of alternatives.

There doesn’t seem to be general agreement about whether it’s okay for a key in an urlencoded string to have no associated value, so we’ll assume that it is by wrapping the value up in Maybe. If a key has a value (including an empty string), we’ll use Just, otherwise Nothing.

p_query :: CharParser () [(String, Maybe String)]

I’ll give the complete definition of the parser here, then walk through what the various bits mean.

p_query = pair `sepBy` char '&'
  where pair = liftA2 (,) (many1 safe)
                          (optional (char '=' *> many safe))
        safe = oneOf urlBaseChars
           <|> char '%' *> liftA2 diddle hexDigit hexDigit
           <|> ' ' <$ char '+'
        diddle a b = toEnum . fst . head . readHex $ [a,b]

urlBaseChars = ['a'..'z']++['A'..'Z']++['0'..'9']++"$-_.!*'(),"

First of all, notice the complete lack of variables in the code. This is the reason behind calling these beasts applicative: we construct our code by combining applications of functions.

Now for a piecewise decomposition of the code.

pair `sepBy` char '&'

The sepBy combinator takes two parsers as its arguments. It applies the left parser, then the right, then the first, accumulating the results of each left-hand parse in a list.

The definition of pair is where things begin to become interesting.

pair = liftA2 (,) (many1 safe)
                  (optional (char '=' *> many safe))

We’ll dissect this from the inside out, which leads us to consider this first.

char '=' *> many safe

The *> combinator applies the parser on the left, throws away its result, then applies the parser on the right. You can think of the > as “pointing at” the parser whose value is really to be returned. This implies that there’s probably a <* combinator which throws away the result on the right, and indeed there is.

As for the parser on the right, many applies a parser repeatedly, building up a list of results as it goes. It terminates as soon as the parser fails.

This snippet amounts to “match an equals sign, then run the safe parser repeatedly until it fails, accumulating each result into a list”.

The optional combinator turns a parser that must succeed into one that might fail.

optional :: (Alternative f) => f a -> f (Maybe a)

By wrapping our earlier snippet with optional, we’ve turned it into a parser that will return Nothing if our key isn’t followed by a value, or Just wrapping the value if there is one.

The meaning of the many1 safe snippet should be easy to guess: many1 applies the safe parser repeatedly, like many, but it fails if its parser doesn’t succeed at least once. (If you’re used to thinking in terms of regular expressions, many is like *, and many1 like +.)

Now to liftA2 (,). This takes two parsers as arguments. It applies the first parser, then the second, then applies a function to their results. Here, the function is (,), which constructs a tuple. Considered as a whole, this gives us a parser that applies two other parsers, and tuples up their results, returning the tuple as its result.

The safe parser handles a single character of a key or value. Now that we’re beginning to become familiar with the notation, we can gloss a little over its details and concentrate on what’s new.

safe = oneOf urlBaseChars
   <|> char '%' *> liftA2 diddle hexDigit hexDigit
   <|> ' ' <$ char '+'

Most obvious is the <|> combinator. It takes two parsers as arguments. If the left succeeds, its result becomes the result of <|>. If it fails, the result becomes the result of the right parser. As the above code implies, we can chain uses of <|>.

The first “arm” of safe parses a single character, taken from a set of acceptable characters. The second consumes a percent-encoded character, using the liftA2 combinator that we saw earlier: it consumes a “%” character, followed by two hex digits, then applies diddle to the digits to turn them into a single Haskell character.

diddle a b = toEnum . fst . head . readHex $ [a,b]

The readHex function lives in the Numeric module, and parses a hex string as an integer.

readHex :: (Num a) => String -> [(a, String)]

We’re now faced with a new combinator, <$.

' ' <$ char '+'

This applies the parser on the right: if it succeeds, it throws away its result and instead returns the value on the left. In other words, if we match “+”, we return a space.

Here’s another example: here, we parse a series of RFC2822 or HTTP (or MIME, or what have you) headers into a list of name/value pairs, including the proper handling of continuation lines.

p_headers :: CharParser st [(String, String)]
p_headers = header `manyTill` crlf
  where header = liftA2 (,) fieldName (char ':' *> spaces *> contents)
        fieldName = liftA2 (:) letter (many fieldChar)
        fieldChar = letter <|> digit <|> oneOf "-_"
        contents = liftA2 (++) (many1 notEOL <* crlf)
                               (continuation <|> pure [])
        continuation = liftA2 (:) (' ' <$ many1 (oneOf " \t")) contents

crlf :: CharParser st ()
crlf = (() <$ string "\r\n") <|> (() <$ newline)

notEOL :: CharParser st Char
notEOL = noneOf "\r\n"

Notice our use of the <* combinator that we mentioned earlier, and once again the lack of let- or lambda-bound variables.

It can take a little while to get used to constructing and reading parsers in a wholly applicative style, but it does quickly start to feel both natural and appealing.

Compared to writing a parser in monadic style, the notation of the Control.Applicative module leads to significantly more compact code. Some of the big wins in expressivity come from the attention of the authors to surprisingly small details, like the <* and <$ operators, and their precedences and associativities. These let us write chains of “pointy” functors, where we visually follow the arrows to see what’s really being used.

By the way, the complete name for applicative functors is a mouthful: they’re strong lax monoidal functors. These structures come to us from abstract algebra and category theory. If you’re a functional programmer and you’re not following the work of people like Conor McBride and Jeremy Gibbons, you really should: it’s a whole lot of fun when the seemingly ethereal world of abstract mathematics comes home to roost in Haskell.

Posted in haskell
9 comments on “The basics of applicative functors, put to practical work
  1. Brent Yorgey says:

    Instead of liftA2, you can also use another nice set of Applicative combinators, and .

    liftA2 foo x y

    becomes

    foo x y

    One of the nice things about this is that it generalizes easily to liftA3, liftA4, … that is,

    liftAn foo x1 x2 … xn

    becomes

    foo x1 x2 … xn.

    (For those who might care, note that is just a synonym for fmap! And applies a function “inside” an applicative to a value inside another applicative.)

  2. Brent Yorgey says:

    Argh! it ate all my angle brackets! ok, take two, fingers crossed…

    Instead of liftA2, you can also use another nice set of Applicative combinators, <$> and <*>.

    liftA2 foo x y

    becomes

    foo <$> x <*> y

    One of the nice things about this is that it generalizes easily to liftA3, liftA4, … that is,

    liftAn foo x1 x2 … xn

    becomes

    foo <$> x1 <*> x2 <*> … <*> xn.

    (For those who might care, note that <$> is just a synonym for fmap! And <*> applies a function “inside” an applicative to a value inside another applicative.)

  3. augustss says:

    It’s interesting how the point free style of parser combinators are coming back into vogue after being displaced by monadic combinators for a while.
    The point free combinators was how I started (as pioneered(?) by Burge in “Recursive Programming Techniques” in the 70s).

  4. Thanks for this example, which I found through Brent Yorgey’s ‘The Typeclassopedia’ article in ‘The Monad Reader’, Issue 13.
    For those not too familiar with Parsec (like myself), here are some examples, using Hugs:

    ApplicativeParsec> parseTest p_query “name=bryan+o%27sullivan&city=san+francisco”
    [(“name”,Just “bryan o’sullivan”),(“city”,Just “san francisco”)]

    You need the parseTest function to run the example interactively.

    ApplicativeParsec> parseTest p_headers “Subject: RFC2822 or HTTP (or MIME, or what have you\n\n”
    [(“Subject”,”RFC2822 or HTTP (or MIME, or what have you”)]

    Note the 2 newlines at the end, which terminate the input in this text format.

    I replaced the liftA2 in the source code with and , as explained above.

  5. and forgot all about the < and > escape sequences, also as above. Tss,Tss…sorry!

  6. But to parse the non-context free string of n ‘a’ followed by n ‘b’ followed by n ‘c’ characters you do need the monadic structure. Example taken from:
    ‘Combinator Parsing: A Short Tutorial’ by S.Doaitse Swierstra

    p_abc :: CharParser st String
    p_abc = do as <- many (char ‘a’)
    let n = length as
    bs <- pnTimes n (char ‘b’)
    cs <- pnTimes n (char ‘c’)
    return (as ++ bs ++ cs)

    pnTimes :: Int -> CharParser st a -> CharParser st [a]
    pnTimes 0 p = return []
    pnTimes n p = (:) <$> p <*> pnTimes (n-1) p

    ApplicativeParsec> parseTest p_abc “aaabbbccc”
    “aaabbbccc”

  7. Gene Arthur says:

    Harkening back to the original premise of this entry, I would like to plug the versatility aspect of the applicatives on functors as opposed to using the monadic machinations.
    The problem with using monads and bind comes in the type situation. The monads have to match if outside of bind for the next level monad down, as in this situation:

    catMaybes $ [1..12] >>= ((\x -> if x >> return) >>= (liftM (* 8) >>> return)

    [40,48,56,64,72,80,88,96]

    Using applicative operators along with arrows you get the above with the catMaybe on the inside of the computational unit which is not possible as far as I can see with a single monad construction, without resorting to using the ($) outside of the application. Here is the same thing using the () from Control.Applicative and a new operator that I built, () that just extends down another functor layer:
    ()
    :: forall a b (f1 :: * -> *) (f :: * -> *).
    (Functor f, Functor f1) =>
    (a -> b) -> f1 (f a) -> f1 (f b)

    *Big3> ((\x -> if x < 5 then Nothing else Just x) ) >>> ((* 8) ) >>> catMaybes $ [1..12]

    [40,48,56,64,72,80,88,96]

    I will have to delve into Parsec module someday, but been having WAY to much fun, mucking about with other things.

    Cheers,
    gene

  8. Gene Arthur says:

    (* 8) that is supposed to be times 8 — eight and came out a smiley on my monitor in my comment above.. oh well.

  9. Jon P says:

    Very informative article. I like is especially b’coz it has contained all the key points very efficiently. By going through the info, I can see the research behind the information.
    Thanks a lot for sharing such a good stuff.

Leave a Reply

Your email address will not be published. Required fields are marked *

*