Applicative functors are gorgeous and versatile creatures, but as is common in Haskell, they lack a little in documentation. The paper that Conor and Ross wrote introducing them is good, but dense. What if we were to skip all the scene-setting kerfuffle, and plunge into using them by example?

I won’t attempt to describe what applicative functors actually are, because the idea is easy to absorb: we’ll pick it up as an incidental product of figuring out how to use them. Bear with me.

To begin with, we need a motivating example, and we’ll choose something simple, yet chewy: parsing an `application/x-www-form-urlencoded`

string. Remember, they look like this:

name=bryan+o%27sullivan&city=san+francisco

While we’ll use Parsec as our parsing library, we’ll give it a completely new skin.

module ApplicativeParsec ( -- Re-export the contents of these modules, for convenience. module Control.Applicative , module Text.ParserCombinators.Parsec ) where import Control.Applicative import Control.Monad (MonadPlus(..), ap) -- Hide Parsec's definitions of some Applicative functions. import Text.ParserCombinators.Parsec hiding (many, optional, (<|>)) -- Every Monad is an Applicative. instance Applicative (GenParser s a) where pure = return (<*>) = ap -- Every MonadPlus is an Alternative. instance Alternative (GenParser s a) where empty = mzero (<|>) = mplus

We’ve made two claims in the code above. We’ve said that a Parsec parser is an applicative functor (you know, whatever that is), and that we can chain parsers into a series of alternatives.

There doesn’t seem to be general agreement about whether it’s okay for a key in an urlencoded string to have no associated value, so we’ll assume that it is by wrapping the value up in `Maybe`

. If a key has a value (including an empty string), we’ll use `Just`

, otherwise `Nothing`

.

p_query :: CharParser () [(String, Maybe String)]

I’ll give the complete definition of the parser here, then walk through what the various bits mean.

p_query = pair `sepBy` char '&' where pair = liftA2 (,) (many1 safe) (optional (char '=' *> many safe)) safe = oneOf urlBaseChars <|> char '%' *> liftA2 diddle hexDigit hexDigit <|> ' ' <$ char '+' diddle a b = toEnum . fst . head . readHex $ [a,b] urlBaseChars = ['a'..'z']++['A'..'Z']++['0'..'9']++"$-_.!*'(),"

First of all, notice the complete lack of variables in the code. This is the reason behind calling these beasts *applicative*: we construct our code by combining applications of functions.

Now for a piecewise decomposition of the code.

pair `sepBy` char '&'

The `sepBy`

combinator takes two parsers as its arguments. It applies the left parser, then the right, then the first, accumulating the results of each left-hand parse in a list.

The definition of `pair`

is where things begin to become interesting.

pair = liftA2 (,) (many1 safe) (optional (char '=' *> many safe))

We’ll dissect this from the inside out, which leads us to consider this first.

char '=' *> many safe

The `*>`

combinator applies the parser on the left, *throws away its result*, then applies the parser on the right. You can think of the `>`

as “pointing at” the parser whose value is really to be returned. This implies that there’s probably a `<*`

combinator which throws away the result on the right, and indeed there is.

As for the parser on the right, `many`

applies a parser repeatedly, building up a list of results as it goes. It terminates as soon as the parser fails.

This snippet amounts to “match an equals sign, then run the `safe`

parser repeatedly until it fails, accumulating each result into a list”.

The `optional`

combinator turns a parser that must succeed into one that might fail.

optional :: (Alternative f) => f a -> f (Maybe a)

By wrapping our earlier snippet with `optional`

, we’ve turned it into a parser that will return `Nothing`

if our key isn’t followed by a value, or `Just`

wrapping the value if there is one.

The meaning of the `many1 safe`

snippet should be easy to guess: `many1`

applies the `safe`

parser repeatedly, like `many`

, but it fails if its parser doesn’t succeed at least once. (If you’re used to thinking in terms of regular expressions, `many`

is like `*`

, and `many1`

like `+`

.)

Now to `liftA2 (,)`

. This takes two parsers as arguments. It applies the first parser, then the second, then applies a function to their results. Here, the function is `(,)`

, which constructs a tuple. Considered as a whole, this gives us a parser that applies two other parsers, and tuples up their results, returning the tuple as its result.

The `safe`

parser handles a single character of a key or value. Now that we’re beginning to become familiar with the notation, we can gloss a little over its details and concentrate on what’s new.

safe = oneOf urlBaseChars <|> char '%' *> liftA2 diddle hexDigit hexDigit <|> ' ' <$ char '+'

Most obvious is the `<|>`

combinator. It takes two parsers as arguments. If the left succeeds, its result becomes the result of `<|>`

. If it fails, the result becomes the result of the right parser. As the above code implies, we can chain uses of `<|>`

.

The first “arm” of `safe`

parses a single character, taken from a set of acceptable characters. The second consumes a percent-encoded character, using the `liftA2`

combinator that we saw earlier: it consumes a “%” character, followed by two hex digits, then applies `diddle`

to the digits to turn them into a single Haskell character.

diddle a b = toEnum . fst . head . readHex $ [a,b]

The `readHex`

function lives in the `Numeric`

module, and parses a hex string as an integer.

readHex :: (Num a) => String -> [(a, String)]

We’re now faced with a new combinator, `<$`

.

' ' <$ char '+'

This applies the parser on the right: if it succeeds, it throws away its result and instead returns the value on the left. In other words, if we match “+”, we return a space.

Here’s another example: here, we parse a series of RFC2822 or HTTP (or MIME, or what have you) headers into a list of name/value pairs, including the proper handling of continuation lines.

p_headers :: CharParser st [(String, String)] p_headers = header `manyTill` crlf where header = liftA2 (,) fieldName (char ':' *> spaces *> contents) fieldName = liftA2 (:) letter (many fieldChar) fieldChar = letter <|> digit <|> oneOf "-_" contents = liftA2 (++) (many1 notEOL <* crlf) (continuation <|> pure []) continuation = liftA2 (:) (' ' <$ many1 (oneOf " \t")) contents crlf :: CharParser st () crlf = (() <$ string "\r\n") <|> (() <$ newline) notEOL :: CharParser st Char notEOL = noneOf "\r\n"

Notice our use of the `<*`

combinator that we mentioned earlier, and once again the lack of let- or lambda-bound variables.

It can take a little while to get used to constructing and reading parsers in a wholly applicative style, but it does quickly start to feel both natural and appealing.

Compared to writing a parser in monadic style, the notation of the `Control.Applicative`

module leads to significantly more compact code. Some of the big wins in expressivity come from the attention of the authors to surprisingly small details, like the `<*`

and `<$`

operators, and their precedences and associativities. These let us write chains of “pointy” functors, where we visually follow the arrows to see what’s really being used.

By the way, the complete name for applicative functors is a mouthful: they’re strong lax monoidal functors. These structures come to us from abstract algebra and category theory. If you’re a functional programmer and you’re not following the work of people like Conor McBride and Jeremy Gibbons, you really should: it’s a whole lot of fun when the seemingly ethereal world of abstract mathematics comes home to roost in Haskell.

Instead of liftA2, you can also use another nice set of Applicative combinators, and .

liftA2 foo x y

becomes

foo x y

One of the nice things about this is that it generalizes easily to liftA3, liftA4, … that is,

liftAn foo x1 x2 … xn

becomes

foo x1 x2 … xn.

(For those who might care, note that is just a synonym for fmap! And applies a function “inside” an applicative to a value inside another applicative.)

Argh! it ate all my angle brackets! ok, take two, fingers crossed…

Instead of liftA2, you can also use another nice set of Applicative combinators, <$> and <*>.

liftA2 foo x y

becomes

foo <$> x <*> y

One of the nice things about this is that it generalizes easily to liftA3, liftA4, â€¦ that is,

liftAn foo x1 x2 â€¦ xn

becomes

foo <$> x1 <*> x2 <*> â€¦ <*> xn.

(For those who might care, note that <$> is just a synonym for fmap! And <*> applies a function â€œinsideâ€ an applicative to a value inside another applicative.)

It’s interesting how the point free style of parser combinators are coming back into vogue after being displaced by monadic combinators for a while.

The point free combinators was how I started (as pioneered(?) by Burge in “Recursive Programming Techniques” in the 70s).

Thanks for this example, which I found through Brent Yorgey’s ‘The Typeclassopedia’ article in ‘The Monad Reader’, Issue 13.

For those not too familiar with Parsec (like myself), here are some examples, using Hugs:

ApplicativeParsec> parseTest p_query “name=bryan+o%27sullivan&city=san+francisco”

[(“name”,Just “bryan o’sullivan”),(“city”,Just “san francisco”)]

You need the parseTest function to run the example interactively.

ApplicativeParsec> parseTest p_headers “Subject: RFC2822 or HTTP (or MIME, or what have you\n\n”

[(“Subject”,”RFC2822 or HTTP (or MIME, or what have you”)]

Note the 2 newlines at the end, which terminate the input in this text format.

I replaced the liftA2 in the source code with and , as explained above.

and forgot all about the < and > escape sequences, also as above. Tss,Tss…sorry!

But to parse the non-context free string of n ‘a’ followed by n ‘b’ followed by n ‘c’ characters you do need the monadic structure. Example taken from:

‘Combinator Parsing: A Short Tutorial’ by S.Doaitse Swierstra

p_abc :: CharParser st String

p_abc = do as <- many (char ‘a’)

let n = length as

bs <- pnTimes n (char ‘b’)

cs <- pnTimes n (char ‘c’)

return (as ++ bs ++ cs)

pnTimes :: Int -> CharParser st a -> CharParser st [a]

pnTimes 0 p = return []

pnTimes n p = (:) <$> p <*> pnTimes (n-1) p

ApplicativeParsec> parseTest p_abc “aaabbbccc”

“aaabbbccc”

Harkening back to the original premise of this entry, I would like to plug the versatility aspect of the applicatives on functors as opposed to using the monadic machinations.

The problem with using monads and bind comes in the type situation. The monads have to match if outside of bind for the next level monad down, as in this situation:

catMaybes $ [1..12] >>= ((\x -> if x >> return) >>= (liftM (* 8) >>> return)

[40,48,56,64,72,80,88,96]

Using applicative operators along with arrows you get the above with the catMaybe on the inside of the computational unit which is not possible as far as I can see with a single monad construction, without resorting to using the ($) outside of the application. Here is the same thing using the () from Control.Applicative and a new operator that I built, () that just extends down another functor layer:

()

:: forall a b (f1 :: * -> *) (f :: * -> *).

(Functor f, Functor f1) =>

(a -> b) -> f1 (f a) -> f1 (f b)

*Big3> ((\x -> if x < 5 then Nothing else Just x) ) >>> ((* 8) ) >>> catMaybes $ [1..12]

[40,48,56,64,72,80,88,96]

I will have to delve into Parsec module someday, but been having WAY to much fun, mucking about with other things.

Cheers,

gene

(* 8) that is supposed to be times 8 — eight and came out a smiley on my monitor in my comment above.. oh well.

Very informative article. I like is especially b’coz it has contained all the key points very efficiently. By going through the info, I can see the research behind the information.

Thanks a lot for sharing such a good stuff.