On the Haskell libraries mailing list, Cale Gibbard made an interesting suggestion for a couple of useful list-related functions that could be added to the standard list library.
The particular one I'm interested in is his separate function:
separate :: [a] -> [([a],a,[a])]
The idea is that this function should produce a list of all ways of separating a list into an initial segment, a single element, and a final segment. For instance:
separate [1,2,3,4]
==> [([],1,[2,3,4]),
([1],2,[3,4]),
([1,2],3,[4]),
([1,2,3],4,[])]
This is very similar to the find function I wrote for the text package, which has the following type:
find :: Text -> Text -> (Text, [(Text, Text)])
This behaves as follows:
find "/" "a/b/c/d"
==> ("a", [("/b","/b/c/d"),
("/c","/c/d"),
("/d","/d")])
The first element of each pair in the list is a span from the beginning of a match to the beginning of the next match, while the second is a span from the beginning of the match to the end of the input.
Even though I wrote this find function (after much helpful discussion with Duncan Coutts), I find its type signature and behaviour a little tricky to explain.
What's interesting to me is that Cale's proposed separate function has a much simpler type and, I believe, easier-to-understand behaviour.
In fact, I am thinking that I should change find to act similarly. In other words, its type would change to the following:
find :: Text -> [(Text,Text,Text)]
And its behaviour would change as follows:
find "/" "foo/bar/quux/"
==> [("foo","/","bar/quux/"),
("foo/bar","/","quux/"),
("foo/bar/quux","/","")]
From observation, it's easy to explain how this function behaves. The nth element in the list (counting from zero, because we're computer scientists, right?) consists of a triple:
At any match, you can concatenate all three elements and get the original string back.
The simplicity of this function greatly appeals to me. I also find it telling that when I changed the function's behaviour, I was able to simplify its implementation.
Another interesting way of looking at this proposed change to find is that it now sequentially encodes all the ways you can advance (without overlap) one step through a zipper over the text, where a step takes you from one match to the next.
I think the change of behaviour gives us a piece of functionality that is much more appealing than the original find function, and so I'm inclined to release a text 0.8.0.0 package in a few days with the rewrite. What do you think?
(Credit where it's due: my recollection is that Duncan tickled the back of my mind with a proposal similar to this a few months ago, but I've no idea where or when that exchange actually took place, or what the specifics of his proposal were.)