Every functional programmer worth their salt seems to end up with at least a few stories to tell about programming in CPS, also known as continuation passing style. Here’s my latest one.
As a user of it, you can’t tell that my attoparsec parsing library is implemented almost entirely using explicit continuations. Every combinator accepts two continuations:
the failure continuation is invoked if the current function fails
the success continuation is invoked if the current function succeeds
Making matters more complex, each continuation accepts three other parameters:
the input currently known to be available
any additional input that was received when parsing was suspended due to insufficient input (to support backtracking in the case of a failed parse)
an end-of-input marker, to record whether our caller has no more input to feed us incrementally
I originally implemented attoparsec as a very traditional state transformer monad. Every bind action would check to see if the previous action succeeded. If yes, it would invoke the next action, otherwise it would pass the failure forwards. While this worked, its performance was disappointing: my parsers didn’t run any faster than those built on the venerable parsec package!
The work of switching from a traditional state transformer over to CPS took just a couple of days. I’ve had my bacon repeatedly saved by Haskell’s type system, which ensures that I’m not passing the wrong kind of continuation in the wrong slot, or mixing up input-I’m-going-to-consume with input-I’ve-been-fed.
Switching over to CPS bought a factor-of-eight performance improvement, which delighted me. The code remains fairly easy to follow, because I was careful to write it cleanly (of all the places where excessive cleverness can bite you, working with CPS must come close to the top).
Looking at the Core generated by GHC, though, suggests that there’s plenty of room for improvement: it allocates tons of closures. And sure enough, they show up at runtime.
See those items marked THUNK in the chart above? Yeah, ouch.
Nevertheless, that 8x performance improvement is real, but I actually managed to further improve on it for the aeson JSON library. In that library, some careful profiling indicated that dealing with text in general, and Unicode escapes in particular, was a serious bottleneck. Having run out of obvious paths to speed the code up in its existing form, I wrote a tiny and highly focused module, Data.Attoparsec.Zepto, that foregoes CPS in favour of the state transformer approach. For non-recursive parsers that shouldn’t fail (e.g. dealing with escaped text), this module performs very well: I achieved a 35% performance boost in JSON parsing when I introduced it! That approach really only seems to work well in that single isolated instance, unfortunately: state transformers continue to kill performance if I try to use them more widely within attoparsec or aeson.
In the short to medium term, then, CPS is usually a win, but we can do better: I’m hoping to help the Simons ferret out whatever’s causing continuations to be compiled into closures instead of straightline code. GHC 7 currently contains some pretty big performance regressions for CPS-heavy code, but they’ve been partly ameliorated by a recent patch (regression dropped from 70% to 22%). I think we’ve plenty of scope to make performance substantially better just via compiler improvements.