<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: Using Bloom filters for large scale gene sequence analysis in Haskell</title>
	<atom:link href="http://www.serpentine.com/blog/2008/09/28/using-bloom-filters-for-large-scale-gene-sequence-analysis-in-haskell/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.serpentine.com/blog/2008/09/28/using-bloom-filters-for-large-scale-gene-sequence-analysis-in-haskell/</link>
	<description>Bryan O&#039;Sullivan&#039;s blog</description>
	<lastBuildDate>Wed, 08 Feb 2012 06:41:38 +0000</lastBuildDate>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.1</generator>
	<item>
		<title>By: Tim Yates</title>
		<link>http://www.serpentine.com/blog/2008/09/28/using-bloom-filters-for-large-scale-gene-sequence-analysis-in-haskell/comment-page-1/#comment-185877</link>
		<dc:creator>Tim Yates</dc:creator>
		<pubDate>Mon, 13 Oct 2008 15:25:40 +0000</pubDate>
		<guid isPermaLink="false">http://www.serpentine.com/blog/?p=275#comment-185877</guid>
		<description>Nice paper :)  I&#039;m evaluating a Bloom filter for pre-filtering my 25mer probe sequences into sets per chromosome, rather than searching for millions of them against each chromosome in turn.  How do you ensure you are not out of phase on the words you extract from the target sequence?  i.e., if I read 8mer words with an overlap of 2, how do I ensure I am not simply out by one base, thereby missing the word&#039;s existence when running it through my hashes?

I&#039;ve probably got the algorithm wrong in my head...  More reading required ;)</description>
		<content:encoded><![CDATA[<p>Nice paper <img src='http://www.serpentine.com/wordpress/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' />   I&#8217;m evaluating a Bloom filter for pre-filtering my 25mer probe sequences into sets per chromosome, rather than searching for millions of them against each chromosome in turn.  How do you ensure you are not out of phase on the words you extract from the target sequence?  i.e., if I read 8mer words with an overlap of 2, how do I ensure I am not simply out by one base, thereby missing the word&#8217;s existence when running it through my hashes?</p>
<p>I&#8217;ve probably got the algorithm wrong in my head&#8230;  More reading required <img src='http://www.serpentine.com/wordpress/wp-includes/images/smilies/icon_wink.gif' alt=';)' class='wp-smiley' /> </p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Jeremy Leipzig</title>
		<link>http://www.serpentine.com/blog/2008/09/28/using-bloom-filters-for-large-scale-gene-sequence-analysis-in-haskell/comment-page-1/#comment-182871</link>
		<dc:creator>Jeremy Leipzig</dc:creator>
		<pubDate>Wed, 01 Oct 2008 16:11:00 +0000</pubDate>
		<guid isPermaLink="false">http://www.serpentine.com/blog/?p=275#comment-182871</guid>
		<description>Interesting paper. Aligning ESTs is a bit old-school, but there is a lot of interest in aligning many very short (&lt;30bp) sequences to the genome at high or exact thresholds. Due to its k-mer based heuristics, BLAT has not been very good at finding these matches. A lot of researchers have been turning to suffix trees and as a result they are spending a lot more time at home with their families.
I think it would be interesting to implement a short sequence alignment tool along these lines in Haskell using your Bloom filters. I&#039;m not sure the bottleneck is in storage, but perhaps the decreased footprint could make a distributed solution more attractive.</description>
		<content:encoded><![CDATA[<p>Interesting paper. Aligning ESTs is a bit old-school, but there is a lot of interest in aligning many very short (&lt;30bp) sequences to the genome at high or exact thresholds. Due to its k-mer based heuristics, BLAT has not been very good at finding these matches. A lot of researchers have been turning to suffix trees and as a result they are spending a lot more time at home with their families.<br />
I think it would be interesting to implement a short sequence alignment tool along these lines in Haskell using your Bloom filters. I&#8217;m not sure the bottleneck is in storage, but perhaps the decreased footprint could make a distributed solution more attractive.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Yatima</title>
		<link>http://www.serpentine.com/blog/2008/09/28/using-bloom-filters-for-large-scale-gene-sequence-analysis-in-haskell/comment-page-1/#comment-182423</link>
		<dc:creator>Yatima</dc:creator>
		<pubDate>Mon, 29 Sep 2008 16:47:51 +0000</pubDate>
		<guid isPermaLink="false">http://www.serpentine.com/blog/?p=275#comment-182423</guid>
		<description>You&#039;ll be pleased to hear that Haskell&#039;s laziness and type-inferencing, along with Anathem, were major topics of conversation at my date night last night. I adore geeks. That is all.</description>
		<content:encoded><![CDATA[<p>You&#8217;ll be pleased to hear that Haskell&#8217;s laziness and type-inferencing, along with Anathem, were major topics of conversation at my date night last night. I adore geeks. That is all.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Dan G</title>
		<link>http://www.serpentine.com/blog/2008/09/28/using-bloom-filters-for-large-scale-gene-sequence-analysis-in-haskell/comment-page-1/#comment-182314</link>
		<dc:creator>Dan G</dc:creator>
		<pubDate>Mon, 29 Sep 2008 03:47:20 +0000</pubDate>
		<guid isPermaLink="false">http://www.serpentine.com/blog/?p=275#comment-182314</guid>
		<description>I think this project uses similar techniques for network forensics, http://isis.poly.edu/projects/fornet/</description>
		<content:encoded><![CDATA[<p>I think this project uses similar techniques for network forensics, <a href="http://isis.poly.edu/projects/fornet/" rel="nofollow">http://isis.poly.edu/projects/fornet/</a></p>
]]></content:encoded>
	</item>
</channel>
</rss>

