A few days ago, I wrote a Haskell library for building and working with suffix trees. It builds a suffix tree lazily, so even though its performance is *O(n log n)* on large input strings, it often has linear performance for many applications.

The implementation is simple and easy to read; the API is well-documented and easy to use; and the internals of the tree structure are exposed, so you can build your own traversal, inspection, and manipulation functions on top of it. The code is liberally licensed (BSD3), to encourage wide use.

You can download the stable version from HackageDB, or fetch the source repository:

`darcs get http://darcs.serpentine.com/suffixtree`

I’m surprised that construction is so limited; right now the library is only useful for searching for a sequence in a long list and not, for instance, for looking up identifiers.

That is, I think construct should have type

construct’ :: (Ord a) => [[a]] -> STree a

where the constructor you have now is

construct l = construct’ (tails l)

No, the library isn’t only useful for searching. It’s also usable for a number of other tasks, such as counting repeats, maximal unique matching, and so on.

If you wanted to be able to store arbitrary sequences, you’d need a trie, not a suffix tree. The two are related, but they’re not the same, and you can’t use the suffix tree construction algorithm to create a trie.