<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>teideal glic deisbhéalach</title>
	<atom:link href="http://www.serpentine.com/blog/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.serpentine.com/blog</link>
	<description>Bryan O'Sullivan's blog</description>
	<lastBuildDate>Mon, 11 Jul 2011 08:06:00 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0.1</generator>
		<item>
		<title>Fitter, happier, more productive UTF-8 decoding</title>
		<link>http://www.serpentine.com/blog/2011/07/11/fitter-happier-more-productive-utf-8-decoding/</link>
		<comments>http://www.serpentine.com/blog/2011/07/11/fitter-happier-more-productive-utf-8-decoding/#comments</comments>
		<pubDate>Mon, 11 Jul 2011 07:25:38 +0000</pubDate>
		<dc:creator>Bryan O'Sullivan</dc:creator>
				<category><![CDATA[haskell]]></category>
		<category><![CDATA[open source]]></category>

		<guid isPermaLink="false">http://www.serpentine.com/blog/?p=854</guid>
		<description><![CDATA[The other night, I had a random whim to spend a couple of minutes looking at the performance of UTF-8 decoding in the Haskell Unicode text package. Actually, rather than look at the actual performance, what I did was use Don Stewart's excellent ghc-core tool to inspect the high-level &#34;Core&#34; code generated by the compiler. [...]]]></description>
			<content:encoded><![CDATA[<p>The other night, I had a random whim to spend a couple of minutes looking at the performance of UTF-8 decoding in the Haskell <a href="http://hackage.haskell.org/package/text">Unicode text package</a>. Rather than measure the actual performance, I used Don Stewart's excellent <a href="http://hackage.haskell.org/package/ghc-core"><code>ghc-core</code></a> tool to inspect the high-level &quot;Core&quot; code generated by the compiler. Core is the last layer at which Haskell code is still somewhat intelligible, and although it takes quite a bit of practice to interpret, the effort is often worth it.</p>
<p>For instance, in this case, I could immediately tell by inspection that something bad was afoot in the inner loop of the UTF-8 decoder. A decoder more or less has to read a byte of input at a time, as this heavily edited bit of Core illustrates:</p>
<pre class="sourceCode"><code class="sourceCode haskell"><span class="kw">let</span><span class="ot"> x2 </span><span class="ot">::</span> <span class="dt">Word8</span><br />    x2 <span class="fu">=</span> <span class="kw">case</span> readWord8OffAddr<span class="fu">#</span> <span class="co">{- ... -}</span> <span class="kw">of</span><br />           (<span class="fu">#</span> s, x <span class="fu">#</span>) <span class="ot">-&gt;</span> <span class="dt">W8</span><span class="fu">#</span> x<br /><span class="kw">in</span> <span class="co">{- ... -}</span></code></pre>
<p>What's important in the snippet above is that the value <code>x2</code> is <em>boxed</em>, i.e. packaged up with a <code>W8#</code> constructor so that it must be accessed via a pointer indirection. Since a decoder must read up to 4 bytes to emit a single Unicode code point, the loop was potentially boxing up 4 bytes, then immediately <em>unboxing</em> them:</p>
<pre class="sourceCode"><code class="sourceCode haskell"><span class="kw">case</span> x2 <span class="kw">of</span> <br />  <span class="dt">W8</span><span class="fu">#</span> x2<span class="fu">#</span> <span class="ot">-&gt;</span><br />    <span class="kw">case</span> x3 <span class="kw">of</span><br />      <span class="dt">W8</span><span class="fu">#</span> x3<span class="fu">#</span> <span class="ot">-&gt;</span><br />        <span class="kw">case</span> x4 <span class="kw">of</span><br />          <span class="dt">W8</span><span class="fu">#</span> x4<span class="fu">#</span> <span class="ot">-&gt;</span> <span class="co">{- ... -}</span></code></pre>
<p>While both boxing and unboxing are cheap in Haskell, they're certainly not <em>free</em>, and we surely don't want to be doing either in the inner loop of an oft-used function.</p>
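<p>To make the idea concrete, here is a minimal sketch of one common way to encourage GHC to keep bytes unboxed. It is not the code from <code>text</code>, and <code>decode2</code> is a made-up name. The bang patterns make both arguments strict, which usually gives the strictness analyser what it needs to pass them around unboxed when optimisation is on, though that depends on the compiler version and flags, so reading the Core remains the way to be sure:</p>
<pre class="sourceCode"><code class="sourceCode haskell">{-# LANGUAGE BangPatterns #-}
import Data.Bits (shiftL, (.&amp;.), (.|.))
import Data.Char (chr)
import Data.Word (Word8)

-- Hypothetical helper: combine the payload bits of a two-byte UTF-8 sequence.
-- The bang patterns force both bytes before the body runs.
decode2 :: Word8 -&gt; Word8 -&gt; Char
decode2 !x1 !x2 =
  chr (((fromIntegral x1 .&amp;. 0x1F) `shiftL` 6) .|. (fromIntegral x2 .&amp;. 0x3F))</code></pre>
<p>For example, <code>decode2 0xC3 0xA9</code> yields the code point U+00E9.</p>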
<p>We can see <em>why</em> this was happening at line 96 of the <a href="https://bitbucket.org/bos/text/src/cac7dbcbc392/Data/Text/Encoding.hs#cl-88"><code>decodeUtf8With</code> function</a>. I'd hoped that the compiler would be smart enough to unbox the values <code>x1</code> through <code>x4</code>, but it turned out not to be.</p>
<p>Fixing this excessive boxing and unboxing wasn't hard at all, but <a href="https://bitbucket.org/bos/text/src/71ead801296a/Data/Text/Encoding.hs#cl-88">it made the code uglier</a>. The rewritten code had identical performance on pure ASCII data, but was about 1.7 times faster on data that was partly or entirely non-ASCII. Nice! Right?</p>
<p>Not quite content with this improvement, I tried writing a decoder based on <a href="http://bjoern.hoehrmann.de/utf-8/decoder/dfa/">Björn Höhrmann's work</a>. My initial attempt looked promising; it was up to 2.5 times faster than my first improved Haskell decoder, but it fell behind on decoding ASCII, due to the extra overhead of maintaining the DFA state.</p>
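<p>For a feel of what &quot;maintaining the DFA state&quot; means, here is a deliberately simplified state machine. It is not Höhrmann's table-driven decoder: the <code>State</code> type and the function names are invented for illustration, and unlike a real decoder it does not reject overlong encodings, surrogates, or code points beyond U+10FFFF. The point is only that every byte, even an ASCII one, has to pass through <code>step</code>:</p>
<pre class="sourceCode"><code class="sourceCode haskell">import Data.List (foldl')
import Data.Word (Word8)

-- Accept: at a code point boundary. Need n: n continuation bytes still expected.
data State = Accept | Need Int | Reject
  deriving (Eq, Show)

-- Advance the state machine by one input byte.
step :: State -&gt; Word8 -&gt; State
step Reject _ = Reject
step Accept b
  | b &lt; 0x80               = Accept   -- ASCII: no state change, but still a dispatch
  | b &gt;= 0xC2 &amp;&amp; b &lt;= 0xDF = Need 1
  | b &gt;= 0xE0 &amp;&amp; b &lt;= 0xEF = Need 2
  | b &gt;= 0xF0 &amp;&amp; b &lt;= 0xF4 = Need 3
  | otherwise              = Reject
step (Need n) b
  | b &gt;= 0x80 &amp;&amp; b &lt;= 0xBF = if n == 1 then Accept else Need (n - 1)
  | otherwise              = Reject

-- True when the bytes have the overall shape of UTF-8 (simplified check).
looksLikeUtf8 :: [Word8] -&gt; Bool
looksLikeUtf8 = (== Accept) . foldl' step Accept</code></pre>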
<p>In English-speaking countries, ASCII is still the king of encodings. Even in non-English-speaking countries that use UTF-8, a whole lot of text is at least partly ASCII in nature. For instance, other European languages contain frequent extents of 7-bit-clean text. Even in languages where almost all code points need two or more bytes to be represented in UTF-8, data such as XML and HTML <em>still</em> contains numerous extents of ASCII text.</p>
<p>What would happen if we were to special-case ASCII? If we read a 32-bit chunk of data, mask it against 0x80808080, and get zero, we know that all four bytes must be ASCII, so we can just <a href="https://bitbucket.org/bos/text/src/d6b9108799ba/cbits/cbits.c#cl-64">write them straight out without going through the DFA</a> (see <a href="https://bitbucket.org/bos/text/src/d6b9108799ba/cbits/cbits.c#cl-82">lines 82 through 110</a>).</p>
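<p>The check itself is tiny. The real fast path lives in the C code linked above; this is the same test sketched in Haskell, with <code>allAscii</code> a name made up for illustration:</p>
<pre class="sourceCode"><code class="sourceCode haskell">import Data.Bits ((.&amp;.))
import Data.Word (Word32)

-- True when none of the four bytes packed into the word has its high bit set,
-- i.e. all four are 7-bit ASCII.
allAscii :: Word32 -&gt; Bool
allAscii w = w .&amp;. 0x80808080 == 0</code></pre>
<p>So <code>allAscii 0x64636261</code> is <code>True</code>, while <code>allAscii 0x6463C361</code> is <code>False</code> because the byte <code>0xC3</code> has its high bit set.</p>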
<p>As <a href="https://spreadsheets0.google.com/a/serpentine.com/spreadsheet/pub?hl=en_US&amp;hl=en_US&amp;key=0AlCjMsgkVJXcdG11ZGNaa2FkX3gwZ241bV9IYTduWkE&amp;output=html">the numbers suggest</a>, this makes a big difference to performance! Decoding pure ASCII becomes <em>much</em> faster, while both HTML and XML see respectable improvements. Of course, even this approach comes with a tradeoff: we lose a little performance when decoding entirely non-ASCII text.</p>
<img src="https://spreadsheets.google.com/a/serpentine.com/spreadsheet/oimg?key=0AlCjMsgkVJXcdG11ZGNaa2FkX3gwZ241bV9IYTduWkE&#038;oid=5&#038;zx=nq9fqkp2ty6x" width="600" height="371" />
<p>Even in the slowest case, we can now decode upwards of 250MB of UTF-8 text per second, while for ASCII, we exceed 1.7GB per second!</p>
<p>These changes have made a big difference to decoding performance across the board: it is now always between 2 and 4 times faster than before.</p>
<img src="https://spreadsheets0.google.com/a/serpentine.com/spreadsheet/oimg?key=0AlCjMsgkVJXcdG11ZGNaa2FkX3gwZ241bV9IYTduWkE&#038;oid=3&#038;zx=muagi2e0n09u" width="600" height="371" />
<p>As a final note, I haven't released the new code quite yet - so keep an eye out!</p>]]></content:encoded>
			<wfw:commentRss>http://www.serpentine.com/blog/2011/07/11/fitter-happier-more-productive-utf-8-decoding/feed/</wfw:commentRss>
		<slash:comments>8</slash:comments>
		</item>
		<item>
		<title>Here be dragons: advances in problems you didn&#8217;t even know you had</title>
		<link>http://www.serpentine.com/blog/2011/06/29/here-be-dragons-advances-in-problems-you-didnt-even-know-you-had/</link>
		<comments>http://www.serpentine.com/blog/2011/06/29/here-be-dragons-advances-in-problems-you-didnt-even-know-you-had/#comments</comments>
		<pubDate>Wed, 29 Jun 2011 07:27:08 +0000</pubDate>
		<dc:creator>Bryan O'Sullivan</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[haskell]]></category>
		<category><![CDATA[open source]]></category>

		<guid isPermaLink="false">http://www.serpentine.com/blog/?p=844</guid>
		<description><![CDATA[Here&#8217;s something I bet you never think about, and for good reason: how are floating-point numbers rendered as text strings? This is a surprisingly tough problem, but it&#8217;s been regarded as essentially solved since about 1990. Prior to Steele and White&#8217;s &#34;How to print floating-point numbers accurately&#34;, implementations of printf and similar rendering functions did their [...]]]></description>
			<content:encoded><![CDATA[<p
>Here&#8217;s something I bet you never think about, and for good reason: how are floating-point numbers rendered as text strings? This is a surprisingly tough problem, but it&#8217;s been regarded as essentially solved since about 1990.</p
><p
>Prior to Steele and White&#8217;s &quot;<a href="http://portal.acm.org/citation.cfm?id=93559"
  >How to print floating-point numbers accurately</a
  >&quot;, implementations of <code
  >printf</code
  > and similar rendering functions did their best to render floating point numbers, but there was wide variation in how well they behaved. A number such as 1.3 might be rendered as 1.29999999, for instance, or if a number was put through a feedback loop of being written out and its written representation read back, each successive result could drift further and further away from the original.</p
><p
>Steele and White effectively solved the problem with a clever algorithm named &quot;Dragon4&quot; (the fourth version of the &quot;Dragon&quot; algorithm, which acquired its name because the authors were inspired to obscure puns by Heighway's <a href="http://en.wikipedia.org/wiki/Dragon_curve"
  >dragon curve</a
  >).</p
><p
>The Dragon4 algorithm spread quickly across language runtimes, such that few programmers today understand that this was ever a problem, much less how hairy it was (and is). Indeed, prior to last year, there was almost no activity in this area: two papers proposed widely used refinements to Dragon4, and that was about it. (Alas, the problem was originally solved around a decade before Steele and White published their work, but nobody noticed. If you have a clever idea and sufficient chutzpah, try to enlist Guy Steele as a coauthor. Your work will be read.)</p
><p
>But how solved was the problem? Dragon4 and its derivatives are complicated and tricky, and they have a hefty performance cost, since they rely on arbitrary-precision integer arithmetic to compute their results. There might be a significant performance improvement to be gained if someone could figure out how to use native machine integers instead.</p
><p
>In 2010, Florian Loitsch published a wonderful paper in PLDI, &quot;<a href="http://florian.loitsch.com/publications/dtoa-pldi2010.pdf?attredirects=0"
  >Printing floating-point numbers quickly and accurately with integers</a
  >&quot;, which represents the biggest step in this field in 20 years: he mostly figured out how to use machine integers to perform accurate rendering! Why do I say &quot;mostly&quot;? Because although Loitsch's &quot;Grisu3&quot; algorithm is very fast, it <em
  >gives up</em
  > on about 0.5% of numbers, in which case you have to fall back to Dragon4 or a derivative.</p
><p
>If you're a language runtime author, the Grisu algorithms are a big deal: Grisu3 is about 5 times faster than the algorithm used by <code
  >printf</code
  > in GNU libc, for instance. A few language implementors have already taken note: Google hired Loitsch, and the Grisu family acts as the default rendering algorithms in both the V8 and Mozilla JavaScript engines (replacing David Gay's 17-year-old <code
  >dtoa</code
  > code). Loitsch has kindly released implementations of his Grisu algorithms as a library named <a href="http://code.google.com/p/double-conversion"
  ><code
    >double-conversion</code
    ></a
  >.</p
><p
>And of course I can't talk about performance without mentioning Haskell somewhere :-) I've taken Loitsch's library and written a <a href="http://hackage.haskell.org/package/double-conversion"
  >Haskell interface</a
  >, which I've measured to be 30 times faster than the default renderer used in the Haskell runtime libraries. This has some nice knock-on effects: my <a href="http://hackage.haskell.org/package/aeson"
  ><code>aeson</code> JSON library</a
  > is now 10 times faster at rendering big arrays of floating point numbers, for instance. I accidentally noticed in the course of that work that my Haskell <a href="http://hackage.haskell.org/package/text"
  ><code
    >text</code
    > Unicode library</a
  >'s UTF-8 encoder wasn't as fast as it could be, so I improved its performance by about 50% along the way. Hooray for faster code!</p
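><p
>Using the Haskell interface takes very little code. A minimal sketch, assuming the <code>Data.Double.Conversion.Text</code> module and its <code>toShortest</code> function as found in current releases on Hackage; <code>renderAll</code> is a made-up helper name:</p
><pre
><code
>import Data.Double.Conversion.Text (toShortest)
import qualified Data.Text as T

-- Render each double using the library's shortest-representation mode.
renderAll :: [Double] -&gt; [T.Text]
renderAll = map toShortest
</code
></pre
><p
>The shortest mode picks the fewest digits that still read back as the same double, so <code>1.3</code> comes out as <code>1.3</code> rather than something like 1.29999999.</p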
><p
>(By the way, the punnery in algorithm naming continues: the Grisu algorithms are named for <a href="http://de.wikipedia.org/wiki/Grisu,_der_kleine_Drache"
  >Grisù, the little dragon</a
  >.)</p
>]]></content:encoded>
			<wfw:commentRss>http://www.serpentine.com/blog/2011/06/29/here-be-dragons-advances-in-problems-you-didnt-even-know-you-had/feed/</wfw:commentRss>
		<slash:comments>12</slash:comments>
		</item>
		<item>
		<title>attoparsec 0.9, a major (and abortive) release [updated]</title>
		<link>http://www.serpentine.com/blog/2011/06/03/attoparsec-0-9-a-major-release/</link>
		<comments>http://www.serpentine.com/blog/2011/06/03/attoparsec-0-9-a-major-release/#comments</comments>
		<pubDate>Fri, 03 Jun 2011 19:10:08 +0000</pubDate>
		<dc:creator>Bryan O'Sullivan</dc:creator>
				<category><![CDATA[haskell]]></category>

		<guid isPermaLink="false">http://www.serpentine.com/blog/?p=829</guid>
		<description><![CDATA[Update: I just released attoparsec 0.9.1.0, which undoes all of the changes described below. The problem? While removing backtracking, I accidentally changed the semantics of the &#60;&#124;&#62; operator in an unforeseen and unfortunate way. The bug I introduced was that a parser of the form (char 'a' *&#62; char 'b') &#60;&#124;&#62; char 'c' would now [...]]]></description>
			<content:encoded><![CDATA[<p><strong>Update</strong>: I just released attoparsec 0.9.1.0, which <em>undoes</em> all of the changes described below. The problem? While removing backtracking, I accidentally changed the semantics of the <code>&lt;|&gt;</code> operator in an unforeseen and unfortunate way. The bug I introduced was that a parser of the form <code>(char 'a' *&gt; char 'b') &lt;|&gt; char 'c'</code> would now accept as valid the input <code>"ac"</code>, which is clearly highly undesirable. Once my error was pointed out (see the comments at the end of this post), I quickly <a href="https://github.com/bos/attoparsec/commit/c2ec0a4fd956a8629327dbc7bbb924de45efe78e">came up with a fix</a>, but the fix itself had a problem: it regressed performance to being <em>slower</em> than when attoparsec performed backtracking! So now attoparsec backtracks in <code>&lt;|&gt;</code> again, until such time as I can come up with an approach that I&#8217;m happier with (and who knows when that will be). Whew.</p>

<p><strong>Original article</strong>, for posterity:</p>

<p>A couple of days ago, I released version 0.9 of my <a href="http://hackage.haskell.org/package/attoparsec">attoparsec parsing library</a>. Although its visible API remains completely unchanged over 0.8, I made a large and important change to the semantics. So if you use attoparsec, read on before you bump your build dependencies!</p>

<p>Prior to 0.9, attoparsec backtracked too aggressively, and hence held onto too much memory.</p>

<p>Consider the following parser:</p>

<pre><code>(char 'f' *&gt; char 'o') &lt;|&gt; (char 'f' *&gt; char 'i')
</code></pre>

<p>Under attoparsec 0.8 and earlier, the <code>&lt;|&gt;</code> (choice) operator saves the input state before executing the left branch. If the left branch fails, it restores the input state, then executes the right branch. So if we supply an input of <code>"fo"</code>, the left branch will succeed, and so the whole parser succeeds. But if we supply an input of <code>"fi"</code>, the left branch will fail. The <code>&lt;|&gt;</code> operator will restore the input to <code>"fi"</code>, and then the right branch will succeed.</p>

<p>This behaviour is very appealing, because it makes it easy to write readable parsers. Unfortunately, it comes with a big downside: it makes it too easy to write <em>inefficient</em> parsers, since <code>&lt;|&gt;</code> needs to hold onto the entire input of its left branch in case it fails.</p>

<p>In attoparsec 0.9, I&#8217;ve dropped that backtracking-by-default behaviour. This will require changes to some parsers that use attoparsec; I&#8217;ll describe what&#8217;s involved below. (By the way, this change makes attoparsec&#8217;s behaviour more compatible with the classic <a href="http://hackage.haskell.org/package/parsec">Parsec library</a>.)</p>

<p>Given the parser above, here&#8217;s how it will now behave when given an input of <code>"fi"</code>. The left branch will succeed on its first match against <code>'f'</code>, so the input will become <code>"i"</code> after <code>'f'</code> has been consumed. At this point, the left branch will fail because it cannot match <code>'o'</code> against the input of <code>"i"</code>. The <code>&lt;|&gt;</code> combinator will switch to executing the right branch, but it will <em>not</em> reset the input. At this point, the right branch will attempt to match <code>'f'</code> against an input that is still <code>"i"</code>, and will thus fail.</p>
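
<p>To see the two behaviours side by side, here is a toy model. It is not attoparsec&#8217;s implementation; the type <code>P</code> and all of the combinator names are invented for illustration. A parser returns whatever input is left over even when it fails, so a choice operator can either discard that and restore the saved input (the 0.8 behaviour) or carry on from it (the 0.9 behaviour described above):</p>

<pre><code>-- A toy parser: a result (if any) plus the input left over, even on failure.
newtype P a = P { runP :: String -&gt; (Maybe a, String) }

-- Match one specific character.
charP :: Char -&gt; P Char
charP c = P $ \s -&gt; case s of
  (x:xs) | x == c -&gt; (Just c, xs)
  _               -&gt; (Nothing, s)

-- Run one parser, then another, keeping the second result.
andThen :: P a -&gt; P b -&gt; P b
andThen (P p) (P q) = P $ \s -&gt; case p s of
  (Just _, rest)  -&gt; q rest
  (Nothing, rest) -&gt; (Nothing, rest)

-- 0.8-style choice: on failure, restore the saved input for the right branch.
orBacktrack :: P a -&gt; P a -&gt; P a
orBacktrack (P p) (P q) = P $ \s -&gt; case p s of
  r@(Just _, _) -&gt; r
  (Nothing, _)  -&gt; q s

-- 0.9-style choice: on failure, continue from wherever the left branch stopped.
orCommit :: P a -&gt; P a -&gt; P a
orCommit (P p) (P q) = P $ \s -&gt; case p s of
  r@(Just _, _)   -&gt; r
  (Nothing, rest) -&gt; q rest
</code></pre>

<p>With <code>fo = charP 'f' `andThen` charP 'o'</code> and <code>fi = charP 'f' `andThen` charP 'i'</code>, the expression <code>fst (runP (orBacktrack fo fi) "fi")</code> is <code>Just 'i'</code>, while <code>fst (runP (orCommit fo fi) "fi")</code> is <code>Nothing</code>. The same model also accepts <code>"ac"</code> for <code>(charP 'a' `andThen` charP 'b') `orCommit` charP 'c'</code>, which is the symptom described in the update at the top of this post.</p>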

<p>This change affects not just the <code>&lt;|&gt;</code> combinator, but also its synonyms <code>mplus</code> and <code>mappend</code>. It also affects the behaviour of higher-level combinators that are constructed using <code>&lt;|&gt;</code>, such as <code>many</code>.</p>

<p>If your parser is affected by this change, there are two good courses of action.</p>

<p>The easier path is to use the <a href="http://hackage.haskell.org/packages/archive/attoparsec/0.9.0.0/doc/html/Data-Attoparsec.html#v:try"><code>try</code> combinator</a> on the left-hand side of a <code>&lt;|&gt;</code> branch. This combinator causes a parser to backtrack on failure: if a parser wrapped by <code>try</code> fails, it will reset the input before passing the failure along. What would this look like in practice?</p>

<pre><code>try (char 'f' *&gt; char 'o') &lt;|&gt; (char 'f' *&gt; char 'i')
</code></pre>

<p>This is certainly easy to do, and it won&#8217;t affect the performance of your code compared to attoparsec 0.8: you&#8217;ll have exactly the same backtracking and memory usage behaviours as before.</p>

<p>There is an alternative that doesn&#8217;t require backtracking, and which should make your parsers both faster and less memory-hungry: the classic technique of <em>left factoring</em> your grammar. For the toy parser above, this is easy to do:</p>

<pre><code>char 'f' *&gt; (char 'o' &lt;|&gt; char 'i')
</code></pre>

<p>It&#8217;s clear that we&#8217;ll see a small performance boost here because we only match against <code>'f'</code> once in this parser, while our first version did some unnecessary extra work by matching <code>'f'</code> on each side of the branch. Left factoring is not always this easy, and it can make parsers more difficult to read, so sometimes using <code>try</code> is the better course to pursue.</p>
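
<p>For anyone who wants to run the two versions, here is a self-contained sketch. It assumes a recent attoparsec, where the relevant module is <code>Data.Attoparsec.ByteString.Char8</code> and <code>parseOnly</code> is available; the names <code>original</code> and <code>factored</code> are made up here:</p>

<pre><code>{-# LANGUAGE OverloadedStrings #-}
import Control.Applicative ((&lt;|&gt;))
import Data.Attoparsec.ByteString.Char8 (Parser, char, parseOnly)

-- The parser from the start of this post: 'f' is matched in both branches.
original :: Parser Char
original = (char 'f' *&gt; char 'o') &lt;|&gt; (char 'f' *&gt; char 'i')

-- The left-factored version: 'f' is matched once, then we choose.
factored :: Parser Char
factored = char 'f' *&gt; (char 'o' &lt;|&gt; char 'i')
</code></pre>

<p>In releases that backtrack in <code>&lt;|&gt;</code>, both parsers accept the same inputs: <code>parseOnly original "fi"</code> and <code>parseOnly factored "fi"</code> each give <code>Right 'i'</code>. The left-factored one never needs the backtracking to get there.</p>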

<p>Given that this change in semantics introduces some awkwardness, you might reasonably wonder why I made it. By making the use of backtracking explicit, parsers that don&#8217;t need backtracking will become faster &#8220;for free&#8221;. A realistic instance of this is my <a href="http://hackage.haskell.org/package/aeson">aeson</a> JSON library, where by using the no-backtracking version of attoparsec, I saw <a href="https://github.com/mailrank/aeson/commit/51699e583baa76971cb31c316d3a55deffbd8a32">improvements of up to 10%</a> in parsing performance, and slightly larger reductions in memory footprint.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.serpentine.com/blog/2011/06/03/attoparsec-0-9-a-major-release/feed/</wfw:commentRss>
		<slash:comments>9</slash:comments>
		</item>
		<item>
		<title>Exciting teaching news</title>
		<link>http://www.serpentine.com/blog/2011/05/11/exciting-teaching-news/</link>
		<comments>http://www.serpentine.com/blog/2011/05/11/exciting-teaching-news/#comments</comments>
		<pubDate>Wed, 11 May 2011 21:58:53 +0000</pubDate>
		<dc:creator>Bryan O'Sullivan</dc:creator>
				<category><![CDATA[haskell]]></category>

		<guid isPermaLink="false">http://www.serpentine.com/blog/?p=822</guid>
		<description><![CDATA[Looks like I&#8217;ve got a busy autumn ahead! Read on for two pieces of news that I&#8217;m very happy about. In September, I&#8217;ll be teaching a Haskell workshop at the Strange Loop Conference in St Louis. Here&#8217;s the abstract: Modern programming presents a daunting array of challenges: proliferating technologies, messy inputs, unreliable networks, huge volumes [...]]]></description>
			<content:encoded><![CDATA[<p>Looks like I&#8217;ve got a busy autumn ahead! Read on for two pieces of news that I&#8217;m very happy about.</p>
<p>In September, I&#8217;ll be teaching a Haskell workshop at the <a href="http://thestrangeloop.com/">Strange Loop Conference</a> in St Louis. Here&#8217;s the abstract:</p>
<blockquote>
<p>Modern programming presents a daunting array of challenges: proliferating technologies, messy inputs, unreliable networks, huge volumes of data, how to verify that results are correct, making it all fast enough. The Haskell programming language is well suited to addressing this broad range of needs, as it uniquely combines conciseness, safety, and high performance.</p>
</blockquote>
<blockquote>
<p>In this fast-moving, interactive tutorial we will learn Haskell by developing a realistic analytic application: we&#8217;ll crawl a web site and rank its pages in order of authority. This combines some very modern concerns: network programming; handling dodgy HTML; big data; and number crunching. With our emphasis on getting real work done, we&#8217;ll show off some of Haskell&#8217;s compelling features and demonstrate how they help us to develop dependable, easy to understand code.</p>
</blockquote>
<p>There&#8217;s a ton of other great speakers lined up for this year&#8217;s Strange Loop, so if you like learning from, and talking with, people who are excited about software, this will be a great conference to attend.</p>
<p>As if that weren&#8217;t enough, I&#8217;m excited to be working with <a href="http://www.scs.stanford.edu/~dm/">David Mazières</a> on a new class at Stanford University, &quot;CS240H: Functional Systems in Haskell&quot;.</p>
<p>As far as I know, this will be the first academic course of its kind, where we focus on the use of functional programming techniques to build solid, fast, secure systems software. Intrinsic merits of the class aside, it&#8217;s going to be a lot of fun to teach this class in the educational cradle of Silicon Valley, where systems software is the bread and butter of a lot of research and commercial innovation.</p>
<p>We&#8217;ll start off with some familiar topics: the basics of Haskell, laziness, monads, parsers, and all that. But from there, we&#8217;ll integrate testing, reliability, and debugging; performance tuning; interfacing to native code; concurrency and I/O paradigms; language extensions; meta-programming; and applications to the web and security.</p>
<p>We&#8217;re also looking at bringing in some prominent guest lecturers from the Haskell community to talk about topics of interest to them. I&#8217;ll name names when we have them confirmed.</p>
<p>David and I are both very much motivated by writing code that&#8217;s clean and fast, so as much as we&#8217;ll spend time talking about theory, research, and academics, this will be a very hands-on &quot;let&#8217;s bang out some awesome software!&quot; kind of affair.</p>
<p>Here&#8217;s a rough outline of what the course will involve:</p>
<ul>
<li><p>A couple of &quot;Haskell basics&quot; classes.</p></li>
<li><p>Laziness: its applications and pitfalls.</p></li>
<li><p>Monads and parsers; programming in continuation-passing style.</p></li>
<li><p>Tricks for debugging and performance tuning.</p></li>
<li><p>Reliability: design for robustness; property-based testing with QuickCheck; code coverage.</p></li>
<li><p>The Iteratee/Enumerator paradigm for I/O.</p></li>
<li><p>How to use the foreign function interface, build interfaces to native code, deal efficiently with data in various contexts, and generally think about memory. Possibly also coupled with discussion of the Haskell networking API.</p></li>
<li><p>Concurrency and parallelism: event- vs thread-based concurrency, Software Transactional Memory, parallel Haskell, data parallel Haskell, repa.</p></li>
<li><p>Library-level optimization: function specialization, dictionary passing, RULES pragmas, stream fusion.</p></li>
<li><p>All the gnarly things you can do with functional dependencies and type families, as well as other relevant extensions such as GADTs.</p></li>
<li><p>How Haskell is implemented, to give students some idea of what is actually going on.</p></li>
<li><p>Generic programming: SYB and fun with Data.</p></li>
<li><p>Web-related topics? formlets, plus maybe some cool Continuation type tricks. Haskell database interfaces and interesting things you can do with types, HList/HaskellDB.</p></li>
<li><p>Language-level information flow control.</p></li>
<li><p>Case studies of real systems in Haskell.</p></li>
</ul>
<p>I know that some SF Bay Area locals have questions about the possibility of auditing the class, whether materials will be available online, and the like. I don&#8217;t have answers yet, but stay tuned.</p>]]></content:encoded>
			<wfw:commentRss>http://www.serpentine.com/blog/2011/05/11/exciting-teaching-news/feed/</wfw:commentRss>
		<slash:comments>17</slash:comments>
		</item>
		<item>
		<title>A new week, a new JSON performance improvement</title>
		<link>http://www.serpentine.com/blog/2011/03/22/a-new-week-a-new-json-performance-improvement/</link>
		<comments>http://www.serpentine.com/blog/2011/03/22/a-new-week-a-new-json-performance-improvement/#comments</comments>
		<pubDate>Tue, 22 Mar 2011 06:53:09 +0000</pubDate>
		<dc:creator>Bryan O'Sullivan</dc:creator>
				<category><![CDATA[haskell]]></category>
		<category><![CDATA[open source]]></category>

		<guid isPermaLink="false">http://www.serpentine.com/blog/?p=817</guid>
		<description><![CDATA[It&#8217;s been a few weeks since I last wrote about the aeson library for working with JSON in Haskell, but this isn&#8217;t because I&#8217;ve been idle. In fact, just tonight I put out a new release. Where the previous releases focused on parsing performance, this one focuses on encoding performance. And the performance news is [...]]]></description>
			<content:encoded><![CDATA[<p>It&#8217;s been a few weeks since I last wrote about the <a href="http://hackage.haskell.org/package/aeson">aeson library</a> for working with JSON in Haskell, but this isn&#8217;t because I&#8217;ve been idle. In fact, just tonight I put out a new release. Where the previous releases focused on parsing performance, this one focuses on encoding performance.</p>

<p>And the performance news is good: on real-world data, I&#8217;ve improved encoding performance by about a factor of 4. Why don&#8217;t we let the graphs do the talking?</p>

<a href="http://www.serpentine.com/wordpress/wp-content/uploads/2011/03/shot.png"><img src="http://www.serpentine.com/wordpress/wp-content/uploads/2011/03/shot.png" alt="Encoding performance" title="shot" width="308" height="257" class="aligncenter size-full wp-image-818" /></a>

<p>Enjoy!</p>]]></content:encoded>
			<wfw:commentRss>http://www.serpentine.com/blog/2011/03/22/a-new-week-a-new-json-performance-improvement/feed/</wfw:commentRss>
		<slash:comments>5</slash:comments>
		</item>
		<item>
		<title>A little care and feeding can go a long way</title>
		<link>http://www.serpentine.com/blog/2011/03/18/a-little-care-and-feeding-can-go-a-long-way/</link>
		<comments>http://www.serpentine.com/blog/2011/03/18/a-little-care-and-feeding-can-go-a-long-way/#comments</comments>
		<pubDate>Fri, 18 Mar 2011 06:36:09 +0000</pubDate>
		<dc:creator>Bryan O'Sullivan</dc:creator>
				<category><![CDATA[haskell]]></category>
		<category><![CDATA[open source]]></category>

		<guid isPermaLink="false">http://www.serpentine.com/blog/?p=805</guid>
		<description><![CDATA[Sometimes, when a software package meets a certain level of maturity (or the desire to hack on it fades sufficiently), it's tempting to consider it &#34;done&#34;. Here's a little tale of when done isn't really done. About a week ago, I received a message from Finlay Thompson asking about my Haskell statistics package: he wanted to [...]]]></description>
			<content:encoded><![CDATA[<p
>Sometimes, when a software package meets a certain level of maturity (or the desire to hack on it fades sufficiently), it's tempting to consider it &quot;done&quot;. Here's a little tale of when done isn't really done.</p
><p
>About a week ago, I received a message from Finlay Thompson asking about my Haskell <a href="http://hackage.haskell.org/package/statistics"
  >statistics</a
  > package: he wanted to know how to generate pseudo-random variables using it. I redirected him from that to my <a href="http://hackage.haskell.org/package/mwc-random"
  >mwc-random</a
  > package, where my pseudo-random number generation code lives.</p
><p
>The mwc-random package currently provides generators for two widely used distributions: uniform and normal. When I was originally writing it, I paid particular attention to making it high quality, fast, and easy to use.</p
><p
>&quot;High quality&quot; sounds a little nebulous, but in the world of pseudo-random number generation, it's actually pretty well defined: a good PRNG should have a large period (the number of samples you need to pull out of it before it repeats itself, assuming a good seed), and the numbers it generates should withstand stringent tests of apparent independence (simply put, given one datum, you shouldn't be able to predict the next).</p
><p
>One algorithm that satisfies these criteria of quality is George Marsaglia's <a href="http://en.wikipedia.org/wiki/Multiply-with-carry_(random_number_generator)"
  >multiply-with-carry</a
  > algorithm MWC256 (also known as MWC8222). It has a period of about 2<sup>8222</sup> (huge enough for all conceivable practical purposes), and stands up well to the &quot;testu01&quot;, &quot;diehard&quot; and &quot;big crush&quot; statistical tests.</p
><p
>Due to its simplicity, MWC256 is also very fast, and under appropriate circumstances (e.g. on a 64-bit machine) it can be even faster than the well known Mersenne Twister algorithm (which also fails some statistical tests that MWC256 passes).</p
><p
>The Mersenne Twister is itself available for Haskellers to use, in the form of the <a href="http://hackage.haskell.org/package/mersenne-random"
  >mersenne-random</a
  > package. This package is a wrapper around the Mersenne Twister library, and unfortunately it imposes on its users the underlying library's typically horrible constraints borne of too much Fortran programming: you can only have one PRNG per application, and it can only be used from a single thread! The mwc-random package is less restrictive: fire up as many PRNGs in different threads as you like, and they'll all operate independently. You can also use the PRNGs in either the ST or the IO monad, for further convenience.</p
><p
>When generating normally distributed random variables, the mwc-random package uses an algorithm known as the &quot;modified ziggurat&quot;. One of the more popular algorithms for generating normally distributed variables is called the ziggurat, but its popularity belies an ill-understood quality problem: the numbers it generates aren't independent enough! It turns out that they are noticeably correlated. The modified ziggurat is almost as fast: it sacrifices only a little speed in the name of improved independence.</p
><p
>The base-level performance of the random number generators looks like this on my Mac using 32-bit GHC 6.12.3, where the time quoted is to generate a single double-precision floating point number:</p
><ul
><li
  ><p
    >Uniform: 142.6 nanoseconds</p
    ></li
  ><li
  ><p
    >Normal: 15149 nanoseconds</p
    ></li
  ></ul
><p
>Where does the question of being &quot;done&quot; or not come in? Well, while poking around tonight, I was a little surprised at the large difference in speed between the uniform and normal PRNGs, so I investigated. The <a href="http://en.wikipedia.org/wiki/Ziggurat_algorithm"
  >ziggurat algorithm</a
  > gets its name from the precomputed lookup tables it uses to gain its speed. It turns out that GHC's inliner was being too aggressive with the table-related code, causing the ziggurat tables to be regenerated over and over instead of precomputed just once. Ouch!</p
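><p
>One common way to keep a lookup table as a single shared top-level value is a <code>NOINLINE</code> pragma. The sketch below shows the idea; the names <code>table</code> and <code>lookupLayer</code> and the table contents are hypothetical, and whether the actual fix in mwc-random took this form is an assumption on my part.</p
><pre><code>import Data.Array (Array, listArray, (!))

-- Hypothetical lookup table, standing in for the ziggurat tables.
-- As a top-level constant it is computed at most once, on first use.
table :: Array Int Double
table = listArray (0, 127)
          [ exp (negate (fromIntegral i) / 16) | i &lt;- [0 .. 127 :: Int] ]
-- Ask GHC not to copy the definition into call sites, where the
-- table could end up being rebuilt on every call.
{-# NOINLINE table #-}

-- Index into the shared table; mod keeps the index within bounds.
lookupLayer :: Int -&gt; Double
lookupLayer i = table ! (i `mod` 128)
</code></pre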
><p
>One <a href="https://bitbucket.org/bos/mwc-random/changeset/123ccdb62a3a"
  >small and very quick change</a
  >, and the performance of the PRNG for normally distributed variables changed dramatically:</p
><ul
><li
  ><p
    >Before: 15149 nanoseconds</p
    ></li
  ><li
  ><p
    >After: 246.8 nanoseconds</p
    ></li
  ></ul
><p
>That's a little over 61 times faster. Not bad for a couple of lines of changes!</p
><p
>As a final note, now that GHC can build 64-bit programs on a Mac, you might wonder how it performs. Here's a comparison between 32-bit and 64-bit versions of GHC 7.0.2 (times in nanoseconds):</p
><table>
  <tr>
    <th align="left">type</th><th>32-bit</th><th>64-bit</th><th>speedup</th>
  </tr>
  <tr>
    <td>uniform Double</td><td align="right">148</td><td align="right">28</td><td align="right">5.3</td>
  </tr>
  <tr>
    <td>uniform Int32</td><td align="right">53</td><td align="right">16.7</td><td align="right">3.2</td>
  </tr>
  <tr>
    <td>normal Double</td><td align="right">252</td><td align="right">62</td><td align="right">4.1</td>
  </tr>
</table>
<p
>Those are some pretty nice performance improvements! Of course, not all applications come in the form of nice tight numeric kernels, so don't take it as given that you'll see improvements like this in your code.</p
>]]></content:encoded>
			<wfw:commentRss>http://www.serpentine.com/blog/2011/03/18/a-little-care-and-feeding-can-go-a-long-way/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>CPS is great! CPS is terrible!</title>
		<link>http://www.serpentine.com/blog/2011/02/25/cps-is-great-cps-is-terrible/</link>
		<comments>http://www.serpentine.com/blog/2011/02/25/cps-is-great-cps-is-terrible/#comments</comments>
		<pubDate>Fri, 25 Feb 2011 20:29:27 +0000</pubDate>
		<dc:creator>Bryan O'Sullivan</dc:creator>
				<category><![CDATA[haskell]]></category>

		<guid isPermaLink="false">http://www.serpentine.com/blog/?p=798</guid>
		<description><![CDATA[Every functional programmer worth their salt seems to end up with at least a few stories to tell about programming in CPS, also known as continuation passing style. Here's my latest one. As a user of it, you can't tell that my attoparsec parsing library is implemented almost entirely using explicit continuations. Every combinator accepts [...]]]></description>
			<content:encoded><![CDATA[<p>Every functional programmer worth their salt seems to end up with at least a few stories to tell about programming in CPS, also known as <a href="http://en.wikipedia.org/wiki/Continuation-passing_style">continuation passing style</a>. Here's my latest one.</p>

<p>As a user of my <a href="http://hackage.haskell.org/package/attoparsec">attoparsec</a> parsing library, you can't tell that it is implemented almost entirely using explicit continuations. Every combinator accepts two continuations:</p>

<ul>
<li><p>the <em>failure</em> continuation is invoked if the current function fails</p></li>
<li><p>the <em>success</em> continuation is invoked if the current function succeeds</p></li>
</ul>

<p>Making matters more complex, each continuation accepts three other parameters:</p>

<ul>
<li><p>the input currently known to be available</p></li>
<li><p>any additional input that was received when parsing was suspended due to insufficient input (to support backtracking in the case of a failed parse)</p></li>
<li><p>an end-of-input marker, to record whether our caller has no more input to feed us incrementally</p></li>
</ul>
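
<p>The shape described above can be sketched roughly as follows. This is a simplified illustration over <code>String</code> input, not attoparsec's actual internals: the names <code>More</code>, <code>Failure</code>, <code>Success</code>, <code>satisfy</code>, <code>andThen</code> and <code>parseAll</code> are assumptions made for the example.</p>

<pre><code>{-# LANGUAGE RankNTypes #-}

-- Whether the caller may still supply more input (hypothetical name).
data More = Complete | Incomplete

-- Both continuations receive the current input, the added input and the
-- end-of-input marker. Failure also gets a message, success a result.
type Failure r   = String -&gt; String -&gt; More -&gt; String -&gt; r
type Success a r = String -&gt; String -&gt; More -&gt; a -&gt; r

newtype Parser a = Parser
  { runParser :: forall r. String -&gt; String -&gt; More
              -&gt; Failure r -&gt; Success a r -&gt; r }

-- Consume one character if it satisfies the predicate.
satisfy :: (Char -&gt; Bool) -&gt; Parser Char
satisfy p = Parser (\inp added more kf ks -&gt;
  case inp of
    (c:cs) | p c -&gt; ks cs added more c
    _            -&gt; kf inp added more "satisfy")

-- Sequencing: run the first parser, handing it a success continuation
-- that runs the second. The failure continuation is passed through.
andThen :: Parser a -&gt; (a -&gt; Parser b) -&gt; Parser b
andThen m f = Parser (\inp added more kf ks -&gt;
  runParser m inp added more kf
    (\inp' added' more' a -&gt; runParser (f a) inp' added' more' kf ks))

-- Run a parser on complete input with the two outermost continuations.
parseAll :: Parser a -&gt; String -&gt; Either String a
parseAll p s = runParser p s "" Complete
  (\_ _ _ err -&gt; Left err)
  (\_ _ _ a -&gt; Right a)
</code></pre>

<p>Notice that <code>andThen</code> never inspects a result to see whether the first parser succeeded. It simply builds a new success continuation, which is where all those closures come from.</p>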

<p>I originally implemented attoparsec as a very traditional state transformer monad. Every bind action would check to see if the previous action succeeded. If yes, it would invoke the next action, otherwise it would pass the failure forwards. While this worked, its performance was disappointing: my parsers didn't run any faster than those built on the venerable parsec package!</p>
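
<p>For comparison, a traditional state transformer parser looks something like this minimal sketch. The names <code>item</code> and <code>bindP</code> are hypothetical, and the real code was more involved; the point is the check on the previous result inside every bind.</p>

<pre><code>-- A parser is a function from input to either an error message
-- or a result paired with the remaining input.
newtype Parser a = Parser { runParser :: String -&gt; Either String (a, String) }

-- Consume a single character.
item :: Parser Char
item = Parser (\s -&gt; case s of
  (c:cs) -&gt; Right (c, cs)
  []     -&gt; Left "unexpected end of input")

-- Bind inspects the outcome of the first parser on every step:
-- on success it runs the next action, otherwise it passes the failure on.
bindP :: Parser a -&gt; (a -&gt; Parser b) -&gt; Parser b
bindP m f = Parser (\s -&gt;
  case runParser m s of
    Left err      -&gt; Left err
    Right (a, s') -&gt; runParser (f a) s')
</code></pre>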

<p>The work of switching from a traditional state transformer over to CPS took just a couple of days. I've had my bacon repeatedly saved by Haskell's type system, which ensures that I'm not passing the wrong kind of continuation in the wrong slot, or mixing up input-I'm-going-to-consume with input-I've-been-fed.</p>

<p>Switching over to CPS bought a factor-of-eight performance improvement, which delighted me. The code remains fairly easy to follow, because I was careful to write it cleanly (of all the places where excessive cleverness can bite you, working with CPS must come close to the top).</p>

<p>Looking at the Core generated by GHC, though, suggests that there's plenty of room for improvement: it allocates <em>tons</em> of closures. And sure enough, they show up at runtime.</p>

<p><a href="http://www.serpentine.com/wordpress/wp-content/uploads/2011/02/hp.png"><img src="http://www.serpentine.com/wordpress/wp-content/uploads/2011/02/hp.png" alt="Heap profile" title="Heap profile" width="600" height="398" class="aligncenter size-full wp-image-802" /></a></p>

<p>See those items marked THUNK in the chart above? Yeah, ouch.</p>

<p>That 8x performance improvement is real, and I actually managed to improve on it further for the <a href="http://hackage.haskell.org/package/aeson">aeson</a> JSON library. In that library, some careful profiling indicated that dealing with text in general, and Unicode escapes in particular, was a serious bottleneck. Having run out of obvious paths to speed the code up in its existing form, I wrote a tiny and highly focused module, Data.Attoparsec.Zepto, that forgoes CPS in favour of the state transformer approach. For non-recursive parsers that shouldn't fail (e.g. dealing with escaped text), this module performs very well: I achieved a 35% performance boost in JSON parsing when I introduced it! That approach really only seems to work well in that single isolated instance, unfortunately: state transformers continue to kill performance if I try to use them more widely within attoparsec or aeson.</p>

<p>In the short to medium term, then, CPS is usually a win, but we can do better: I'm hoping to help the Simons ferret out whatever's causing continuations to be compiled into closures instead of straight-line code. GHC 7 currently contains some pretty big performance regressions for CPS-heavy code, but they've been partly ameliorated by a recent patch (regression dropped from 70% to 22%). I think we've plenty of scope to make performance substantially better just via compiler improvements.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.serpentine.com/blog/2011/02/25/cps-is-great-cps-is-terrible/feed/</wfw:commentRss>
		<slash:comments>8</slash:comments>
		</item>
		<item>
		<title>Faster, better, cleaner: new aeson and attoparsec releases</title>
		<link>http://www.serpentine.com/blog/2011/02/25/faster-better-cleaner-new-aeson-and-attoparsec-releases/</link>
		<comments>http://www.serpentine.com/blog/2011/02/25/faster-better-cleaner-new-aeson-and-attoparsec-releases/#comments</comments>
		<pubDate>Fri, 25 Feb 2011 19:24:51 +0000</pubDate>
		<dc:creator>Bryan O'Sullivan</dc:creator>
				<category><![CDATA[haskell]]></category>
		<category><![CDATA[open source]]></category>

		<guid isPermaLink="false">http://www.serpentine.com/blog/?p=785</guid>
		<description><![CDATA[I&#8217;ve spent some time over the past few weeks improving the performance of the attoparsec parsing library, and of the aeson JSON library. Since they&#8217;ve now reached a new plateau of performance and stability, I thought this would be a good time to release new versions. The major advance in the new version of aeson [...]]]></description>
			<content:encoded><![CDATA[<p>I&#8217;ve spent some time over the past few weeks improving the performance of the <a href="http://hackage.haskell.org/package/attoparsec">attoparsec</a> parsing library, and of the <a href="http://hackage.haskell.org/package/aeson">aeson</a> JSON library. Since they&#8217;ve now reached a new plateau of performance and stability, I thought this would be a good time to release new versions.</p>

<p>The major advance in the new version of aeson is a considerable speed improvement.</p>

<a href="http://www.serpentine.com/wordpress/wp-content/uploads/2011/02/flump.png"><img src="http://www.serpentine.com/wordpress/wp-content/uploads/2011/02/flump.png" alt="Performance improvement" title="Performance improvement" width="356" height="307" class="size-full wp-image-789" /></a>

<p>The datasets I&#8217;m using are Twitter search results, from the Twitter JSON search API. For mostly-English results, 0.2.0.0 is up to 30% faster than before, while on Japanese data (which makes heavy use of Unicode escapes), I&#8217;ve bumped performance by more than 50%.</p>

<p>To see how well aeson performs compared to JSON parsers for other languages, I compared it against the <tt>json</tt> module in Python 2.7. That module&#8217;s JSON parser is written in C, so it&#8217;s very fast indeed, and the amount of actual Python being executed in my microbenchmark is tiny. How do we fare?</p>

<a href="http://www.serpentine.com/wordpress/wp-content/uploads/2011/02/bumf.png"><img src="http://www.serpentine.com/wordpress/wp-content/uploads/2011/02/bumf.png" alt="JSON parsing performance" title="JSON parsing performance" width="439" height="399" class="size-full wp-image-787" /></a>

<p>On mostly-English data, aeson is actually <i>faster</i> than Python&#8217;s native-code <tt>json</tt> parser. Nice! And on Japanese data, we&#8217;re a little slower, but still very competitive.</p>

<p>What if you&#8217;ve been using the Haskell <a href="http://hackage.haskell.org/package/json">json</a> package, which was the first open source Haskell JSON parser to be published? Well, I do think that aeson is easier to use, but it&#8217;s also 3x faster than the json package:</p>

<a href="http://www.serpentine.com/wordpress/wp-content/uploads/2011/02/flurb.png"><img src="http://www.serpentine.com/wordpress/wp-content/uploads/2011/02/flurb.png" alt="aeson vs json" title="aeson vs json" width="340" height="312" class="size-full wp-image-792" /></a>

<p>The new version of aeson introduces some other useful improvements.</p>
<ul>
<li>There&#8217;s a new Generic module, which lets you convert almost any instance of the Data typeclass to and from JSON without writing boilerplate code. (Be warned: generics are slow. If performance is important to you, write that boilerplate!)</li>
<li>We introduce a Number type that represents integers to full accuracy, and which handles floating point numbers efficiently.</li>
<li>Instead of parsing via the Applicative typeclass, we now use a custom parsing monad, improving both ease of use and performance.</li>
</ul>]]></content:encoded>
			<wfw:commentRss>http://www.serpentine.com/blog/2011/02/25/faster-better-cleaner-new-aeson-and-attoparsec-releases/feed/</wfw:commentRss>
		<slash:comments>10</slash:comments>
		</item>
		<item>
		<title>So I went and started a company</title>
		<link>http://www.serpentine.com/blog/2011/01/28/so-i-went-and-started-a-company/</link>
		<comments>http://www.serpentine.com/blog/2011/01/28/so-i-went-and-started-a-company/#comments</comments>
		<pubDate>Fri, 28 Jan 2011 00:14:15 +0000</pubDate>
		<dc:creator>Bryan O'Sullivan</dc:creator>
				<category><![CDATA[open source]]></category>

		<guid isPermaLink="false">http://www.serpentine.com/blog/?p=781</guid>
		<description><![CDATA[I&#8217;m delighted to say that after a couple of years of a break from the startup world (which I&#8217;ve inhabited for most of the past decade), I&#8217;ve decided to throw my hat back into the ring. Together with Bethanye Blount, I&#8217;ve started a company named MailRank. We&#8217;re working on helping people to manage the all-too-common [...]]]></description>
			<content:encoded><![CDATA[<p>I&#8217;m delighted to say that after a couple of years of a break from the startup world (which I&#8217;ve inhabited for most of the past decade), I&#8217;ve decided to throw my hat back into the ring. Together with Bethanye Blount, I&#8217;ve started a company named MailRank. We&#8217;re working on helping people to manage the all-too-common problem of email overload. You can read a little more about what we&#8217;re up to in our <a href="http://blog.mailrank.com/hello-world">initial announcement</a>, and we&#8217;ll have more details to share soon.</p>

<p>Since I&#8217;ve spent the past two decades in the world of open source, of course <a href="http://engineering.mailrank.com/introducing-some-open-source-technologies">I have goodies to show off</a>:</p>

<ul>
<li>A fast, powerful library for working with the Riak decentralized data store (<a href="https://github.com/mailrank/riak-haskell-client">mailrank/riak-haskell-client</a> on github). I think it&#8217;s the best Riak client library available for any programming language <img src='http://www.serpentine.com/wordpress/wp-includes/images/smilies/icon_smile.gif' alt=':-)' class='wp-smiley' /> </li>
<li>An efficient, easy to use JSON library for Haskell (<a href="https://github.com/mailrank/aeson">mailrank/aeson</a> on github). Twice as fast as the competition.</li>
</ul>

<p>I&#8217;m excited to be working with Bethanye once again, and thrilled that we&#8217;re going to have our first employee join in ten days.</p>]]></content:encoded>
			<wfw:commentRss>http://www.serpentine.com/blog/2011/01/28/so-i-went-and-started-a-company/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>A small matter of illegal characters</title>
		<link>http://www.serpentine.com/blog/2010/11/29/a-small-matter-of-illegal-characters/</link>
		<comments>http://www.serpentine.com/blog/2010/11/29/a-small-matter-of-illegal-characters/#comments</comments>
		<pubDate>Mon, 29 Nov 2010 22:44:20 +0000</pubDate>
		<dc:creator>Bryan O'Sullivan</dc:creator>
				<category><![CDATA[haskell]]></category>

		<guid isPermaLink="false">http://www.serpentine.com/blog/?p=774</guid>
		<description><![CDATA[I received an interesting note from Michael Snoyman and Felipe Lessa today, noting a peculiar problem. If you run the ghci interpreter and load the Data.Text module, you can provoke some unpleasant behaviour: &#62; pack "\55296" "\9216" Hmm, that doesn't look right. What happens if we try again? &#62; pack "\55296" "\74488" Yikes! Run the [...]]]></description>
			<content:encoded><![CDATA[<p>I received an interesting note from Michael Snoyman and Felipe Lessa today, noting a peculiar problem. If you run the <code>ghci</code> interpreter and load the <code>Data.Text</code> module, you can provoke some unpleasant behaviour:</p>

<pre><code>&gt; pack "\55296"
"\9216"
</code></pre>

<p>Hmm, that doesn't look right. What happens if we try again?</p>

<pre><code>&gt; pack "\55296"
"\74488"
</code></pre>

<p>Yikes! Run the same function twice, and get not just wrong answers, but <em>different</em> wrong answers?</p>

<p>What is the <code>pack</code> function actually supposed to do, anyway?</p>

<pre><code>&gt; :t pack
pack :: String -&gt; Text
</code></pre>

<p>It takes a traditional Haskell <code>String</code> value and converts it into a tightly-packed, optimized <code>Text</code> value. Unfortunately, Unicode code points in the range 0xd800 through 0xdfff are <em>reserved</em> (they are the UTF-16 surrogates), but the Haskell interpreter isn't rejecting <code>'\55296'</code>, which happens to be 0xd800. The <code>pack</code> function is thus passed garbage, and responds with garbage itself.</p>

<p>That's a little inconvenient. I <em>think</em> that the right thing to do is to have <code>pack</code> check its input and reject it with an exception if it hits any reserved code points.</p>

<p><b>Update</b>: <a href="http://www.ozonehouse.com/mark/">Mark Lentczner</a> has the following comments on how to deal with this:</p>

<ol>
<li>error (<em>ick</em> — though catchable with IO exceptions)</li>
<li>reject the whole input to <code>pack</code> and just return an empty string</li>
<li>replace the offending code point with U+FFFD REPLACEMENT CHARACTER</li>
<li>just drop the offending code point on the floor silently</li>
</ol>

<p>Mark says "I vote #3, but I suppose there is argument for the others."</p>
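
<p>For the curious, Mark's option #3 can be sketched over plain <code>String</code> values as below. The names <code>isSurrogate</code>, <code>safe</code> and <code>sanitize</code> are hypothetical helpers for illustration; this is not code from the text package.</p>

<pre><code>import Data.Char (ord)

-- True for the reserved surrogate range U+D800 through U+DFFF.
isSurrogate :: Char -&gt; Bool
isSurrogate c = n &gt;= 0xD800 &amp;&amp; n &lt;= 0xDFFF
  where n = ord c

-- Replace a reserved code point with U+FFFD REPLACEMENT CHARACTER.
safe :: Char -&gt; Char
safe c
  | isSurrogate c = '\xFFFD'
  | otherwise     = c

-- Clean up a String before handing it on to something like pack.
sanitize :: String -&gt; String
sanitize = map safe
</code></pre>

<p>With this in place, every other character passes through untouched, and the offending code point becomes a visible marker instead of garbage.</p>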
]]></content:encoded>
			<wfw:commentRss>http://www.serpentine.com/blog/2010/11/29/a-small-matter-of-illegal-characters/feed/</wfw:commentRss>
		<slash:comments>18</slash:comments>
		</item>
	</channel>
</rss>

