<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>teideal glic deisbhéalach &#187; science</title>
	<atom:link href="http://www.serpentine.com/blog/category/science/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.serpentine.com/blog</link>
	<description>Bryan O&#039;Sullivan&#039;s blog</description>
	<lastBuildDate>Thu, 01 Dec 2011 16:53:49 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.1</generator>
		<item>
		<title>A new pseudo-random number generator for Haskell</title>
		<link>http://www.serpentine.com/blog/2009/09/19/a-new-pseudo-random-number-generator-for-haskell/</link>
		<comments>http://www.serpentine.com/blog/2009/09/19/a-new-pseudo-random-number-generator-for-haskell/#comments</comments>
		<pubDate>Sat, 19 Sep 2009 22:13:17 +0000</pubDate>
		<dc:creator>Bryan O'Sullivan</dc:creator>
				<category><![CDATA[haskell]]></category>
		<category><![CDATA[open source]]></category>
		<category><![CDATA[science]]></category>

		<guid isPermaLink="false">http://www.serpentine.com/blog/?p=434</guid>
		<description><![CDATA[I just released version 0.3.3 of the Haskell statistics library, which contains a very fast pseudo-random number generator. The generator is an implementation of George Marsaglia&#8217;s MWC256 multiply-with-carry PRNG, which has a period of 2^8222 (for this reason, it&#8217;s sometimes referred to as MWC8222). It produces high-quality uniformly distributed pseudo-random numbers extremely quickly. Here is [...]]]></description>
			<content:encoded><![CDATA[<p>I just released version 0.3.3 of the Haskell <a href="http://hackage.haskell.org/package/statistics">statistics library</a>, which contains a very fast pseudo-random number generator.</p>

<p>The generator is an implementation of George Marsaglia&#8217;s MWC256 multiply-with-carry PRNG, which has a period of 2<sup><font style="font-size: 50%">8222</font></sup> (for this reason, it&#8217;s sometimes referred to as MWC8222). It produces high-quality uniformly distributed pseudo-random numbers extremely quickly.</p>
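<p>To give a feel for the multiply-with-carry idea, here is a minimal Haskell sketch of a lag-1 generator. It is not the MWC256 code from the statistics library: MWC256 keeps a table of 256 previous values, while this toy keeps only one. The names <code>MWC</code>, <code>step</code>, and <code>multiplier</code>, and the multiplier value itself, are assumptions chosen for illustration.</p>

<pre>
import Data.Bits (shiftR)
import Data.Word (Word32, Word64)

-- | Generator state: the current value and the carry.
data MWC = MWC !Word32 !Word32

-- | Illustrative multiplier (an assumption, not the MWC256 constant).
multiplier :: Word64
multiplier = 4294957665

-- | One multiply-with-carry step: form a * x + c in 64 bits, then
-- keep the low 32 bits as the new value and the high 32 bits as
-- the new carry.
step :: MWC -&gt; (Word32, MWC)
step (MWC x c) = (x', MWC x' c')
  where
    t  = multiplier * fromIntegral x + fromIntegral c
    x' = fromIntegral t
    c' = fromIntegral (t `shiftR` 32)

-- | An infinite lazy stream of outputs from a starting state.
stream :: MWC -&gt; [Word32]
stream s = let (v, s') = step s in v : stream s'

main :: IO ()
main = print (take 3 (stream (MWC 12345 67890)))
</pre>

<p>The 64-bit intermediate cannot overflow, because both the multiplier and the value are below 2^32 and the carry is added afterwards. A lagged generator such as MWC256 applies the same step to the oldest entry of a circular table, which is where the very long period comes from.</p>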

<p>Here is a brief performance comparison between the statistics library, the <a href="http://hackage.haskell.org/package/mersenne-random">mersenne-random library</a>, the <a href="http://www.haskell.org/ghc/docs/latest/html/libraries/random/System-Random.html">normal Haskell PRNG</a>, and a <a href="http://darcs.serpentine.com/statistics/tests/mwc.c">C implementation</a> of MWC256. The numbers represent millions of variates generated and summed per second, so higher is better. (Measured on a 64-bit Core2 Duo laptop running Fedora 11.)</p>

<table cellspacing="7">
<tr><th>Package</th>  <th>Int64</th>  <th>Double</th>  <th>Word32</th></tr>
<tr><td>System.Random</td>  <td align="right">0.7</td>  <td align="right">1.6</td>  <td align="right"><em>n/a</em></td></tr>
<tr><td>mersenne-random</td>  <td align="right">101.2</td>  <td align="right">53.8</td>  <td align="right">96.3</td></tr>
<tr><td>statistics</td>  <td align="right">145.7</td>  <td align="right">116.5</td>  <td align="right">251.6</td>  </tr>
<tr><td>C (normal)</td>  <td align="right">131.0</td>  <td align="right">103.2</td>  <td align="right">292.9</td>  </tr>
<tr><td>C (inlined)</td>  <td align="right">162.0</td>  <td align="right">118.7</td>  <td align="right">375.6</td>  </tr>
<tr><td>C (inlined, unrolled)</td>  <td align="right">186.5</td>  <td align="right">171.7</td>  <td align="right">571.2</td>  </tr>
</table>

<p>As the numbers indicate, the MWC256 implementation in the statistics library is up to about 2.6 times faster than the mersenne-random library (251.6 versus 96.3 for Word32), and often faster than the C code you&#8217;d expect to encounter in practice. And you shouldn&#8217;t use the standard Haskell PRNG if you care about performance.</p>

<p>(The last two rows in the table above indicate what you could expect from an aggressive C compiler that performed cross-module inlining and loop unrolling, of the kind that almost works reliably in gcc these days. If you were using normal compilation flags and source file structure, you would be unlikely to see those kinds of numbers in practice.)</p>

<p>A nice feature of the PRNG in the statistics library is that it has a cleaner interface than mersenne-random. Due to the cheesy implementation of the underlying C code, the mersenne-random library enforces the use of a single PRNG for the entire program, which makes it onerous to work with. You have to create a single generator when your program starts, and pass it everywhere that you might eventually need random numbers. In a multi-threaded program, only one thread at a time can use the PRNG.</p>

<p>The PRNG in the statistics library runs inside the ST monad, so you can generate random variates from inside pure Haskell code. You can create or use a generator anywhere without worrying about threading issues, and there&#8217;s a handy function provided that lets you seed a generator from your system&#8217;s random number source.</p>
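<p>As a rough illustration of what running a generator inside the ST monad buys you, the sketch below wraps a toy multiply-with-carry generator in an <code>STRef</code> and exposes it through a pure function. It uses only the base library. The names <code>Gen</code>, <code>newGen</code>, <code>next</code>, and <code>sample</code>, and the constants, are hypothetical and are not the statistics library&#8217;s API.</p>

<pre>
import Control.Monad (replicateM)
import Control.Monad.ST (ST, runST)
import Data.Bits (shiftR)
import Data.STRef (STRef, newSTRef, readSTRef, writeSTRef)
import Data.Word (Word32, Word64)

-- | Hypothetical generator handle: a mutable (value, carry) pair.
newtype Gen s = Gen (STRef s (Word32, Word32))

-- | Create a generator from a seed (the initial carry is arbitrary).
newGen :: Word32 -&gt; ST s (Gen s)
newGen seed = fmap Gen (newSTRef (seed, 362436))

-- | Advance the generator in place and return the next value.
next :: Gen s -&gt; ST s Word32
next (Gen ref) = do
  (x, c) &lt;- readSTRef ref
  let t  = 4294957665 * fromIntegral x + fromIntegral c :: Word64
      x' = fromIntegral t
      c' = fromIntegral (t `shiftR` 32)
  writeSTRef ref (x', c')
  return x'

-- | A pure function: the mutable state never escapes runST.
sample :: Word32 -&gt; Int -&gt; [Word32]
sample seed n = runST $ do
  g &lt;- newGen seed
  replicateM n (next g)

main :: IO ()
main = print (sample 42 3 == sample 42 3)  -- prints True
</pre>

<p>Because each call to <code>runST</code> owns its generator, two threads can each create their own without any locking, and the same seed always yields the same variates.</p>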

<p>As a final tweak, the new PRNG generates floating point numbers in the range (0,1], so the variates it generates are safe to use in statistical computations that are zero-phobic, such as those involving logarithms or division. I&#8217;ve also included a generator for normally-distributed floating point numbers that uses <a href="http://www.doornik.com/research/ziggurat.pdf">Doornik&#8217;s modified ziggurat algorithm</a>, so it&#8217;s both fast and trustworthy.</p>]]></content:encoded>
			<wfw:commentRss>http://www.serpentine.com/blog/2009/09/19/a-new-pseudo-random-number-generator-for-haskell/feed/</wfw:commentRss>
		<slash:comments>5</slash:comments>
		</item>
		<item>
		<title>Lazy functional yak shaving in Haskell</title>
		<link>http://www.serpentine.com/blog/2009/09/12/lazy-functional-yak-shaving-in-haskell/</link>
		<comments>http://www.serpentine.com/blog/2009/09/12/lazy-functional-yak-shaving-in-haskell/#comments</comments>
		<pubDate>Sun, 13 Sep 2009 01:01:08 +0000</pubDate>
		<dc:creator>Bryan O'Sullivan</dc:creator>
				<category><![CDATA[haskell]]></category>
		<category><![CDATA[open source]]></category>
		<category><![CDATA[science]]></category>

		<guid isPermaLink="false">http://www.serpentine.com/blog/?p=424</guid>
		<description><![CDATA[A few weeks ago, I decided that I'd like to focus for a while on getting a 1.0 release of the Haskell text library ready. That work has gone fairly well so far. I've focused on making sure that I like the API, and that a few key functions have good performance characteristics. One such [...]]]></description>
			<content:encoded><![CDATA[<p>A few weeks ago, I decided that I'd like to focus for a while on getting a 1.0 release of the <a href="http://hackage.haskell.org/package/text">Haskell text library</a> ready. That work has gone fairly well so far. I've focused on making sure that I like the API, and that a few key functions have good performance characteristics.</p>

<p>One such function is substring search, which is the underpinning for a number of useful functions (split a string on a delimiter, find all substrings, etc). My initial cut at that function had an obvious API, but ran in O(<em>nm</em>) time, which complexity measure I found mildly embarrassing. So I looked around for an alternative that would be faster, but still simple to implement, and settled upon Fredrik Lundh's <a href="http://effbot.org/zone/stringlib.htm">elegant and trim adaptation</a> of Boyer-Moore-Sunday-Horspool, which has O(<em>n</em>+<em>m</em>) complexity in the typical case.</p>
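<p>For reference, the straightforward O(<em>nm</em>) approach looks roughly like the sketch below. It is written against plain <code>String</code> rather than the text library&#8217;s own type, and the function name <code>indices</code> is an assumption for illustration, not necessarily the library&#8217;s API.</p>

<pre>
import Data.List (isPrefixOf, tails)

-- | Offsets of every occurrence of the needle in the haystack.
-- Each of the n suffixes is compared against the needle, and each
-- comparison can cost up to m steps, hence O(nm) in the worst case.
indices :: String -&gt; String -&gt; [Int]
indices needle haystack =
  [ i
  | (i, rest) &lt;- zip [0 ..] (tails haystack)
  , needle `isPrefixOf` rest
  ]

main :: IO ()
main = print (indices "ab" "abcabab")  -- prints [0,3,5]
</pre>

<p>Horspool-style algorithms improve on this by precomputing how far the search window can safely jump after a mismatch, so most haystack positions are never examined.</p>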

<p>Using <a href="http://en.wikipedia.org/wiki/QuickCheck">QuickCheck</a>, I was quickly able to satisfy myself that my new code was correct, so now I was faced with demonstrating to myself that the code was <em>fast</em>, too. I looked on <a href="http://hackage.haskell.org/packages/archive/pkg-list.html">Hackage</a> for a good benchmarking package, but although there were several benchmarking packages available, they were all extremely simple, and I wasn't happy with any of them.</p>
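<p>A QuickCheck test for a substring search might look something like the following sketch. The <code>indices</code> function here is a naive stand-in for the code under test, and the two property names are hypothetical. The only QuickCheck function used is <code>quickCheck</code>.</p>

<pre>
import Data.List (isPrefixOf, tails)
import Test.QuickCheck (quickCheck)

-- | Stand-in for the function under test (an assumption).
indices :: String -&gt; String -&gt; [Int]
indices needle haystack =
  [ i
  | (i, rest) &lt;- zip [0 ..] (tails haystack)
  , needle `isPrefixOf` rest
  ]

-- | Soundness: every reported offset really starts an occurrence.
prop_sound :: String -&gt; String -&gt; Bool
prop_sound needle haystack =
  all (\i -&gt; needle `isPrefixOf` drop i haystack) (indices needle haystack)

-- | Completeness: a needle planted at a known offset is found there.
prop_complete :: String -&gt; String -&gt; String -&gt; Bool
prop_complete before needle after =
  length before `elem` indices needle (before ++ needle ++ after)

main :: IO ()
main = do
  quickCheck prop_sound
  quickCheck prop_complete
</pre>

<p>When replacing a slow implementation with a fast one, a third property that compares the two on random inputs is often the quickest way to gain confidence.</p>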

<p>I have sort of a mutt's pedigree in computing, having spent several stints working in the high performance scientific computing world, so I felt antsy about measuring performance reliably. Over the course of a few days of hacking, I came up with a good framework that I am currently finishing off.</p>

<p>One of the things a good benchmarking framework needs is some statistical heavy lifting, to do things like autocorrelation analysis, kernel density function estimation, and bootstrapping, so I had to write some complicated statistical code along the way. I've now packaged that code up and released it as the <a href="http://hackage.haskell.org/package/statistics">statistics library</a> on Hackage. Some of the features implemented in this first release of the statistics library include:</p>
<ul>
	<li>Support for common discrete and continuous probability distributions (binomial, gamma, exponential, geometric, hypergeometric, normal, and Poisson)</li>
	<li>Kernel density estimation</li>
	<li>Autocorrelation analysis</li>
	<li>Functions over sample data</li>
	<li>Quantile estimation</li>
	<li>Resampling techniques: jackknife and bootstrap estimation</li>
</ul>
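<p>To make one of these techniques concrete, here is a minimal sketch of jackknife resampling over plain lists. It is not the statistics library&#8217;s implementation, which works on unboxed arrays. The names <code>mean</code>, <code>leaveOneOut</code>, and <code>jackknife</code> are chosen for illustration.</p>

<pre>
-- | Arithmetic mean of a non-empty sample.
mean :: [Double] -&gt; Double
mean xs = sum xs / fromIntegral (length xs)

-- | Every subsample obtained by dropping exactly one element.
leaveOneOut :: [a] -&gt; [[a]]
leaveOneOut xs =
  [take i xs ++ drop (i + 1) xs | i &lt;- [0 .. length xs - 1]]

-- | Jackknife replicates: the estimator applied to each
-- leave-one-out subsample.
jackknife :: ([Double] -&gt; Double) -&gt; [Double] -&gt; [Double]
jackknife estimator = map estimator . leaveOneOut

main :: IO ()
main = print (jackknife mean [1, 2, 3])  -- prints [2.5,2.0,1.5]
</pre>

<p>The spread of the replicates gives an estimate of how sensitive the estimator is to any single observation.</p>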
<p>The statistics library certainly isn't yet comprehensive, but it has some features that I think make it very attractive as a base for further work:</p>
<ul>
	<li>It's very fast, building on some of the fantastic software that's available on Hackage these days. I make heavy use of Don Stewart's <a href="http://hackage.haskell.org/package/uvector">uvector library</a> (itself a port of Roman Leshchinskiy's vector library), which means that many functions allocate no memory and execute tight loops using only machine registers. I use Dan Doel's <a href="http://hackage.haskell.org/package/uvector-algorithms">uvector-algorithms library</a> to perform fast partial sorts. I also use Don's <a href="http://hackage.haskell.org/package/mersenne-random">mersenne-random library</a> for fast random number generation when doing bootstrap analysis.</li>
	<li>I've put a fair amount of effort into finding and using algorithms that are numerically stable (trying to avoid problems like catastrophic cancellation). Whenever possible, I indicate which methods are used in the documentation. (For more information on numerical stability, see <a href="http://docs.sun.com/app/docs/doc/800-7895">What Every Computer Scientist Should Know About Floating-Point Arithmetic</a>.)</li>
</ul>
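<p>Catastrophic cancellation is easy to demonstrate with sample variance. The sketch below compares the textbook sum-of-squares formula with Welford&#8217;s running update, one well-known stable alternative. It is an illustration only, with made-up function names, and makes no claim about which method the statistics library uses for any particular function.</p>

<pre>
import Data.List (foldl')

-- | Textbook formula: subtracts two huge, nearly equal quantities,
-- so most of the significant digits cancel.
naiveVariance :: [Double] -&gt; Double
naiveVariance xs = (sumSq - s * s / n) / (n - 1)
  where
    n     = fromIntegral (length xs)
    s     = sum xs
    sumSq = sum (map (\x -&gt; x * x) xs)

-- | Welford's update: accumulates squared deviations from the
-- running mean, so the intermediate values stay small.
welfordVariance :: [Double] -&gt; Double
welfordVariance xs = m2 / (n - 1)
  where
    (n, _, m2) = foldl' update (0, 0, 0) xs
    update (k, m, s) x = (k', m', s + (x - m) * (x - m'))
      where
        k' = k + 1
        m' = m + (x - m) / k'

main :: IO ()
main = do
  -- The true sample variance of these four values is 30.
  let xs = [1e9 + 4, 1e9 + 7, 1e9 + 13, 1e9 + 16]
  print (naiveVariance xs)
  print (welfordVariance xs)  -- prints 30.0
</pre>

<p>The sums of squares in the naive version are around 4e18, well beyond the range in which a Double can represent every integer, so the subtraction is left with mostly rounding error.</p>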

<p>For me, one of the killer features of Haskell for statistical computation is its combination of terseness and fabulous libraries. In particular, the modern collection libraries like uvector and text, which are built on stream fusion frameworks, provide a combination of expressiveness and high performance that is simply delightful to work with.</p>

<p>Speaking of improving the statistics library, I've already received a few patches to it, even before announcing it anywhere, and I'm very much looking forward to receiving more. If you want to contribute, go get the source code and hack away:</p>

<pre>darcs get http://darcs.serpentine.com/statistics</pre>

<p>So now that the statistics library is in good enough shape, I can return to work on the benchmarking framework. Once that's done, I'll get back to finishing off the text library. It's been funny to see how rewriting 50 lines of code has spawned off weeks of enjoyable work on other projects! I hope that people will find all of this work useful.</p>

<p>By the way, the philosophy underlying the text, statistics, and benchmarking libraries is uniform across the lot of them: build and release something that, even though initially incomplete, is thorough enough and of such high quality that other people will be drawn to improving your work instead of creating half a dozen tiny alternatives. So please, join in, and let the patches fly!</p>]]></content:encoded>
			<wfw:commentRss>http://www.serpentine.com/blog/2009/09/12/lazy-functional-yak-shaving-in-haskell/feed/</wfw:commentRss>
		<slash:comments>5</slash:comments>
		</item>
		<item>
		<title>BLDGBLOG interviews Kim Stanley Robinson</title>
		<link>http://www.serpentine.com/blog/2007/12/20/bldgblog-interviews-kim-stanley-robinson/</link>
		<comments>http://www.serpentine.com/blog/2007/12/20/bldgblog-interviews-kim-stanley-robinson/#comments</comments>
		<pubDate>Fri, 21 Dec 2007 03:59:59 +0000</pubDate>
		<dc:creator>Bryan O'Sullivan</dc:creator>
				<category><![CDATA[reading]]></category>
		<category><![CDATA[science]]></category>

		<guid isPermaLink="false">http://www.serpentine.com/blog/2007/12/20/bldgblog-interviews-kim-stanley-robinson/</guid>
		<description><![CDATA[Here is an absolute treat: a long, lively interview with Kim Stanley Robinson, conducted on one of my favourite blogs, BLDGBLOG. At its best (the Three Californias trilogy, Antarctica), Robinson&#8217;s writing is at once haunting and beautifully evocative of a sense of place.]]></description>
			<content:encoded><![CDATA[<p>Here is an absolute treat: a long, lively <a href="http://bldgblog.blogspot.com/2007/12/comparative-planetology-interview-with.html">interview with Kim Stanley Robinson</a>, conducted on one of my favourite blogs, <a href="http://bldgblog.blogspot.com/">BLDGBLOG</a>.</p>
<p>At its best (the <em>Three Californias</em> trilogy, <em>Antarctica</em>), Robinson&#8217;s writing is at once haunting and beautifully evocative of a sense of place.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.serpentine.com/blog/2007/12/20/bldgblog-interviews-kim-stanley-robinson/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>The obsessionally perfect news story</title>
		<link>http://www.serpentine.com/blog/2006/12/12/the-obsessionally-perfect-news-story/</link>
		<comments>http://www.serpentine.com/blog/2006/12/12/the-obsessionally-perfect-news-story/#comments</comments>
		<pubDate>Wed, 13 Dec 2006 04:35:39 +0000</pubDate>
		<dc:creator>Bryan O'Sullivan</dc:creator>
				<category><![CDATA[reading]]></category>
		<category><![CDATA[science]]></category>

		<guid isPermaLink="false">http://www.serpentine.com/blog/2006/12/12/the-obsessionally-perfect-news-story/</guid>
		<description><![CDATA[Local scientists, ancient reptiles, volcanic eruptions, and Antarctica! All in one story! Really, this article was written precisely and exactly for me. In brief, a paleontologist from Berkeley (across the Bay from me) was involved in a spectacular find: an almost complete skeleton (cartilage and all) of a juvenile plesiosaur, buried in 70 million year [...]]]></description>
			<content:encoded><![CDATA[Local scientists, ancient reptiles, volcanic eruptions, and Antarctica! <a target="_blank" href="http://www.mercurynews.com/mld/mercurynews/news/breaking_news/16215340.htm">All in one story</a>! Really, this article was written precisely and exactly for me.

In brief, a paleontologist from Berkeley (across the Bay from me) was involved in a spectacular find: an almost complete skeleton (cartilage and all) of a juvenile <a target="_blank" href="http://www.plesiosaur.com/">plesiosaur</a>, buried in 70 million year old volcanic ash. The team found it on their second trip to <a target="_blank" href="http://en.wikipedia.org/wiki/Vega_Island">Vega Island</a>, off the coast of the <a target="_blank" href="http://en.wikipedia.org/wiki/Antarctic_Peninsula">Antarctic Peninsula</a>.

Strangely, the coverage in the Valley&#8217;s local paper, the Mercury News (linked above), is of better quality than the corresponding <a target="_blank" href="http://www.nature.com/news/2006/061211/full/061211-4.html">news article in Nature</a>. That doesn&#8217;t happen often.]]></content:encoded>
			<wfw:commentRss>http://www.serpentine.com/blog/2006/12/12/the-obsessionally-perfect-news-story/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Book review: Doug Macdougall, &#8220;Frozen Earth&#8221;</title>
		<link>http://www.serpentine.com/blog/2006/12/06/book-review-doug-macdougall-frozen-earth/</link>
		<comments>http://www.serpentine.com/blog/2006/12/06/book-review-doug-macdougall-frozen-earth/#comments</comments>
		<pubDate>Wed, 06 Dec 2006 18:00:52 +0000</pubDate>
		<dc:creator>Bryan O'Sullivan</dc:creator>
				<category><![CDATA[reading]]></category>
		<category><![CDATA[science]]></category>

		<guid isPermaLink="false">http://www.serpentine.com/blog/2006/12/06/book-review-doug-macdougall-frozen-earth/</guid>
		<description><![CDATA[Some time ago, I read a Nature review (subscription required) of Doug Macdougall&#8217;s &#8220;Frozen Earth&#8221;. As is the way of such things, after I ordered my copy, the book suffered several months of neglect before I finally had a chance to pick it up. However, once I started reading it, I quickly became engrossed, and [...]]]></description>
			<content:encoded><![CDATA[Some time ago, I read a <a target="_blank" href="http://www.nature.com/nature/journal/v432/n7018/full/432673a.html">Nature review</a> (subscription required) of Doug Macdougall&#8217;s &#8220;<a target="_blank" href="http://www.ucpress.edu/books/pages/10091.html">Frozen Earth</a>&#8221;. As is the way of such things, after I ordered my copy, the book suffered several months of neglect before I finally had a chance to pick it up. However, once I started reading it, I quickly became engrossed, and polished it off in a matter of days.

The book opens with a few chapters on the history of scientific inquiry into climate change. This I consider its only weak part: it follows a formula of the past decade in popular science writing, of sketching the characters involved in the early study of the field. Many of the founders of the field that Macdougall introduces, and the controversies in which they were involved, merit multiple volumes of their own, and so the first few chapters are necessarily skeletal. I found these passages interesting, but not satisfying. The world probably has enough thumbnail sketches of <a target="_blank" href="http://en.wikipedia.org/wiki/Louis_Agassiz">Louis Agassiz</a> at this point.

Once Macdougall leaves behind the early historical narrative, the book kicks into intellectual high gear, and I found the entire rest of the book to be superb. In discussing the <a target="_blank" href="http://en.wikipedia.org/wiki/Missoula_Floods">Missoula Floods</a>, Macdougall touches on both the evidence for catastrophic flooding and the lengthy controversy that Harlen Bretz&#8217;s identification of the source of the <a target="_blank" href="http://en.wikipedia.org/wiki/Channeled_scablands">Channeled Scablands</a> engendered in the American geological community. This chapter is notable for its discussion of the astounding physical evidence involved, the research that Bretz and subsequent generations of scientists performed, and how scientific controversy works.

Another notable feature of the book is Macdougall&#8217;s discussion of the effects of climate on the development of the human species and human culture, both during early human evolution and through the written records of the past few millennia. Among the surprising topics he touches on is a study of paintings from the <a target="_blank" href="http://en.wikipedia.org/wiki/Little_Ice_Age">Little Ice Age</a> (roughly the 13th through 19th centuries) in which someone noted light levels and counted how often the sky was represented as clear or overcast, as a proxy for weather conditions of the time.

I appreciated Macdougall&#8217;s discussion of the drawing together of data from a variety of sources to try to portray a consistent and continuous picture of climate over time. Since so many of the signals upon which paleoclimatologists rely are either faint, broken up, or distorted, they look for correlations between as many sources of data as possible, and correct for numerous possible errors as they go. Macdougall discusses many of the proxies used, such as isotopic stratigraphy and measurements of many properties of ice cores; why some of the data are unreliable; and how climate scientists identify and correct errors.

Most importantly, Macdougall writes clearly and engagingly throughout the book. He doesn&#8217;t shy away from complex topics, but he presents them clearly; he limits his use of jargon, and he remembers to introduce a new term by telling the reader what it means. Once I got over my dubious reaction to the first few chapters, I found the bulk of &#8220;Frozen Earth&#8221; to be intellectually exhilarating and thoroughly enjoyable.]]></content:encoded>
			<wfw:commentRss>http://www.serpentine.com/blog/2006/12/06/book-review-doug-macdougall-frozen-earth/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

