I’ve spent some time over the past few weeks improving the performance of the attoparsec parsing library, and of the aeson JSON library. Since they’ve now reached a new plateau of performance and stability, I thought this would be a good time to release new versions.
The major advance in the new version of aeson is a considerable speed improvement.
My benchmark datasets are Twitter search results from the Twitter JSON search API. On mostly-English results, aeson 0.2.0.0 is up to 30% faster than the previous release, while on Japanese data (which makes heavy use of Unicode escapes), performance has improved by more than 50%.
To see how well aeson performs compared to JSON parsers for other languages, I compared it against the json module in Python 2.7. That module’s JSON parser is written in C, so it’s very fast indeed, and the amount of actual Python being executed in my microbenchmark is tiny. How do we fare?
On mostly-English data, aeson is actually faster than Python’s native-code json parser. Nice! And on Japanese data, we’re a little slower, but still very competitive.
What if you’ve been using the Haskell json package, which was the first open source Haskell JSON parser to be published? Well, I do think that aeson is easier to use, but it’s also 3x faster than the json package.
The new version of aeson introduces some other useful improvements.
- There’s a new Generic module, which lets you convert almost any instance of the Data typeclass to and from JSON without writing boilerplate code. (Be warned: generics are slow. If performance is important to you, write that boilerplate!)
- We introduce a Number type that represents integers to full accuracy, and which handles floating point numbers efficiently.
- Instead of parsing via the Applicative typeclass, we now use a custom parsing monad, improving both ease of use and performance.
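
To give a feel for the monadic style mentioned in the last bullet, here is a minimal sketch of a hand-written FromJSON instance. The Person type and its field names are hypothetical, invented purely for illustration, and the imports assume a reasonably recent aeson release, so the exact module layout may differ from 0.2.0.0.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Control.Monad (mzero)
import Data.Aeson (FromJSON (..), Value (..), object, (.:), (.=))
import Data.Aeson.Types (parseMaybe)
import Data.Text (Text)

-- Hypothetical record type, used only for illustration.
data Person = Person
  { name :: Text
  , age  :: Int
  } deriving (Eq, Show)

instance FromJSON Person where
  -- The body runs in aeson's Parser monad, so field lookups can be
  -- sequenced with do-notation; a missing or ill-typed field makes
  -- the whole parse fail.
  parseJSON (Object v) = do
    n <- v .: "name"
    a <- v .: "age"
    return (Person n a)
  -- Anything other than a JSON object is rejected.
  parseJSON _ = mzero

-- A sample value built in memory, so no input file is needed.
sample :: Value
sample = object ["name" .= ("Ada" :: Text), "age" .= (36 :: Int)]

main :: IO ()
main = do
  -- A well-formed object should convert to a Person.
  print (parseMaybe parseJSON sample :: Maybe Person)
  -- A bare string is not an object, so this should be Nothing.
  print (parseMaybe parseJSON (String "oops") :: Maybe Person)
```

This is exactly the kind of boilerplate the Generic module is meant to spare you; writing it by hand takes only a few lines per type, and it avoids the cost of generics.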