Finally! Fast Unicode support for Haskell

On behalf of the Data.Text team, I am delighted to announce the release of preview versions of two new packages:

text 0.1
Fast, packed Unicode text support, using a modern stream fusion framework.

text-icu 0.1
Augments the text package with comprehensive character set conversion support and normalization (and soon more), via bindings to the ICU library.

These packages fill out critical pieces of functionality for the Haskell platform, without compromising on either performance or safety. Stream fusion offers the possibility of writing text manipulation code in a clean, high-level way, with intermediate allocations and traversals being fused away.
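As a rough illustration of that style, here is a minimal sketch written against the Data.Text API as it appears in current releases of the text package; the 0.1 preview may differ in detail. The names countLetters and lettersUpper are made up for this example. Each function is written as a pipeline of separate passes, and whether those passes are actually fused into a single loop depends on the library version and on compiling with optimization.

```haskell
import qualified Data.Text as T
import Data.Char (isAlpha, toUpper)

-- Hypothetical helper: keep only the letters and upper-case them.
-- Written as two passes (filter, then map); the library may fuse
-- them so that no intermediate Text is allocated.
lettersUpper :: T.Text -> T.Text
lettersUpper = T.map toUpper . T.filter isAlpha

-- Hypothetical helper: count the letters in a piece of text.
-- Reuses the pipeline above and adds a third step (length).
countLetters :: T.Text -> Int
countLetters = T.length . lettersUpper
```

Loading this into GHCi and evaluating countLetters (T.pack "Hello, World!") should count the ten letters and skip the punctuation and the space.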

We are referring to these as preview releases because although the text package in particular has been quite heavily tested, it has not been thoroughly tuned, and we have not yet implemented a chunked lazy text representation suitable for streaming gigabytes of data. The APIs are pretty conventional, but are still subject to change.

If you want to contribute, please get copies of the source trees from here:

6 comments on “Finally! Fast Unicode support for Haskell”
  1. Josef Svenningsson says:

    Excellent work! This has been a weak spot for Haskell for ages. I’m glad to see somebody stepping up to the plate and fixing this.

  2. Shae Erisson says:

    Awesome! That’s great! w00!

  3. Edward Kmett says:

    You rock.

    Now all I need is access to ICU normalizers. *hopeful puppydog look* =)

  4. Josef and Shae – thanks!

    Edward, grab text-icu and load up Data.Text.ICU.Normalizer 🙂

  5. Edward Kmett says:

    Hah! Much obliged.

  6. newsham says:

    Thank you, guys!

1 Pings/Trackbacks for "Finally! Fast Unicode support for Haskell"
  1. […] just released version 0.2 of the Haskell text library that I announced back in February. This version fixes a number of bugs, but much more significantly, it adds a streaming mode: you […]
