Subscribe to
Posts
Comments

One of the pleasant new features in GHC 6.10 is the long-awaited addition of view patterns. This feature is usually advertised as making it possible to pattern match against the values of an abstract type. An essential aspect of modular software design is that we don't want to expose the implementation of complex code. Someone will surely come along and start making decisions based on bits of our code that they can see, thus limiting our future room to manoeuvre. This is all amply explained on the view pattern wiki page and in the GHC User's Guide.

However, view patterns give us some nice capabilities even if we're not building fancy abstract code. For instance, I often find myself wanting a function like this:

dropPrefix :: Eq a => [a] -> [a] -> ([a],[a])
dropPrefix left@(x:xs) right@(y:ys)
    | x == y    = dropPrefix xs ys
dropPrefix left right = (left,right)

Given two lists with common prefixes, this drops any common prefix from each, and returns the residual lists. Here are some examples from messing around with ghci:

Prelude> dropPrefix "foo" "foobar"
("","bar")
Prelude> dropPrefix "foo" "quux"
("foo","quux")
Prelude> dropPrefix "foolish" "football"
("lish", "tball")

A perennial source of mild annoyance with Haskell patterns is that matching a pattern against a string is a pain. Suppose we're doing some quick and dirty processing of a FASTA protein sequence:

>gi|5524211|gb|AAD44166.1| cytochrome b [Elephas maximus maximus]
LCLYTHIGRNIYYGSYLYSETWNTGIMLLLITMATAFMGYVLPWGQMSFWGATVITNLFSAIPYIGTNLV

The ">" above indicates that a line is a header, while the data on the following line is the beginning of the cytochrome B protein sequence. If we we're reading a GenBank sequence, we know that the header will begin with ">gi|", but matching this with a normal Haskell pattern is a pain:

genbankHeader ('>':'g':'i':'|':rest) = muddle rest

All of the ticcy syntactic noise above masks the simplicity of our aim: we want to bind the name rest to the remainder of the string, if it begins with ">gi|".

An alternative formulation might be as follows:

genbankHeader str | ">gi|" `isPrefixOf` str =
    let rest = drop 4 str
    in muddle rest

This is easier to read, but it's quite wordy, and we have to traverse str twice: once to match, once to drop.

What can we do with our dropPrefix function?

genbankHeader str =
    case dropPrefix ">gi|" str of
      ("", rest) -> muddle rest

This is better, but still not ideal. What can view patterns offer here?

{-# LANGUAGE ViewPatterns #-}
genbankHeader (dropPrefix ">gi|" -> ("", rest)) = muddle rest

GHC will compile this last form into the equivalent of the case expression above, but this notation is much more compact, and hence (at least to my eyes) rather easier to read.

5 Responses to “Fun with Haskell view patterns”

  1. on 11 Jan 2009 at 06:15Joachim Breitner

    Is there a copy’n’paste error in
    Prelude> dropPrefix “foolish” “football”
    (“lish”, “ball”)
    which should rather be
    Prelude> dropPrefix “foolish” “football”
    (“lish”, “tball”)
    ?

  2. on 11 Jan 2009 at 07:31Marc

    I don’t know about newer GHCs, but in 6.8.1, dropPrefix “foo” “foobar” gives a pattern match error (since left becomes [] after some time).
    Also, what is the “after” function for?

  3. on 11 Jan 2009 at 10:20Alf

    Marc: presumably ‘after’ was the original name of dropPrefix. Rename it, and the function starts working.

    The ‘otherwise’ line is unnecessary, btw.

  4. on 11 Jan 2009 at 10:57Bryan O'Sullivan

    Oops, thanks for noticing those thinkos, gentlemen. That’s what I get for composing a post at midnight!

  5. on 11 Jan 2009 at 11:11Twan van Laarhoven

    Instead of writing your own ‘dropPrefix’, there is also the standard ‘stripPrefix :: Eq a => [a] -> [a] -> Maybe [a]‘, which you can also use with view patterns:

    {-# LANGUAGE ViewPatterns #-}
    genbankHeader (stripPrefix “>gi|” -> Just rest) = muddle rest

Leave a Reply